Putting machine data to work

There are numerous uses for machine data once you've worked out how to process it, says Matt Davies of Splunk

Machine-generated data, in the form of time-series streams of ones and zeros, is one of the most rapidly growing categories of data. Unstructured and fast moving, it is also the hardest type of data from which to extract value. The imbalance between quantity and utility is set to become even more pronounced as machine-to-machine communications proliferate with the arrival of the Internet of Things (IoT). Nevertheless, machine data can be a gold mine of useful information, if only you can work out how to get to it.

That was the message of Matt Davies, head of marketing EMEA at analytics provider Splunk, who ran through a range of use cases for machine data during a presentation at the Computing Big Data Summit 2016 yesterday.

"The data we know about is the data we see," he said. "Everyone's generating masses of machine data that they're not using, but that data actually has many uses. It can be used for security, for IT operations, for customer experience, in IoT analytics and also to find out how a product is doing."

IT operations

The practice of turning large volumes of messy machine data into something useful is part of a process known as operational intelligence. An obvious use case for this is to provide a real-time view of what is going on across IT systems. So, for example, machine data can be used to increase visibility, detect anomalies and pin down the root causes of problems with networking and storage infrastructure and cloud services. It can also be used to gain visibility into the application stack, as with Tesco.com, which deployed Splunk to ingest and analyse logs from operating systems and applications in order to improve the performance of its e-commerce site and deal with problems more quickly.

In another example, the bank Credit Suisse analysed data from trading systems to find out who was using its grid computing resources and managed to realise substantial savings by getting this under control.

Security

Anomaly detection is key to anti-fraud interventions. Behaviour analytics is deployed to identify suspicious activity in real time, and this depends on machine data. Firewall logs may be combined with custom monitoring solutions to raise an alert.
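To make the idea concrete, here is a minimal sketch of how an alert might be raised from firewall logs. It is not taken from the presentation and does not use Splunk: the log format (timestamp, action, source IP), the function names and the z-score threshold are all assumptions chosen for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def denied_counts(log_text):
    """Count DENY events per source IP.

    Assumes a hypothetical log format of three whitespace-separated
    fields per line: "<timestamp> <action> <source IP>".
    """
    counts = Counter()
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        _, action, source_ip = parts
        if action == "DENY":
            counts[source_ip] += 1
    return counts


def flag_anomalies(counts, z_threshold=2.0):
    """Return source IPs whose DENY count is unusually high.

    Uses a simple z-score against the mean of all sources; the
    threshold of 2.0 is an illustrative assumption, not a recommendation.
    """
    values = list(counts.values())
    if len(values) < 2:
        return []
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # every source behaves identically, nothing stands out
    return sorted(
        ip for ip, n in counts.items() if (n - mu) / sigma > z_threshold
    )
```

A production system would work on streaming data and compare each source with its own history rather than with a single batch, but the principle of counting events and flagging outliers is the same.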

Machine data is used at security vendors in a similar way. Davies mentioned Sophos's Security Operations Centre, which integrates Splunk to gain real-time security intelligence. He also namechecked online retailer Net-a-Porter, which analyses machine data to identify threats from both outside and inside the business.

Customer experience

Sorting out IT and security issues will tend to lead to better customer experience and also promote operational efficiencies in other areas.

"John Lewis started off trying to fix problems with dropped sales when transactions never came back from the payment provider, and then moved on to real-time checkout analytics," Davies said. "That also gave the marketing department a good idea of what action to take based on machine data. So don't discount that sofa because we have 400 of them in people's shopping baskets right now."

IoT analytics

Another obvious application of machine data is in the automotive sector, which is increasingly software-driven, with sensors attached to every major component.

"VW have trialed electric cars. They can get all the data from those cars and plot a heat map showing battery life, when the doors were open, when the lights were on. You can see on a map where the speed cameras are because everyone slows down. And you can see the heart rate of the driver who saw the camera first go up as he got flashed by the camera."

In another case, Davies mentioned rail equipment supplier New York Air Brake, which was able to use sensor data to identify potential savings totalling $1bn on fuel and other efficiencies across the US railroad network.

Product analytics

The music discovery app Shazam uses machine data to improve understanding of the effectiveness of TV campaigns.

"It uses machine data to say people are using Shazam to check a song on the ad break for a particular car," said Davies.

It also uses the data for A/B product testing.

"If we put a new feature in the top left, how does that compare with putting it in the top right?"
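One common way to answer that kind of question is a two-proportion z-test on usage counts pulled from the logs. The sketch below is illustrative only: the function name, the idea of counting "taps" and "views" per variant and the figures in the usage note are assumptions, not details of how Shazam runs its tests.

```python
from math import erf, sqrt


def two_proportion_z(taps_a, views_a, taps_b, views_b):
    """Compare the tap rates of two layout variants.

    Returns (z, p_value) for a two-sided two-proportion z-test using
    the pooled rate. Inputs are hypothetical counts derived from logs.
    """
    rate_a = taps_a / views_a
    rate_b = taps_b / views_b
    pooled = (taps_a + taps_b) / (views_a + views_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    if std_err == 0:
        raise ValueError("rates are all 0 or all 1; the test is undefined")
    z = (rate_a - rate_b) / std_err
    # Standard normal CDF via the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value
```

Calling `two_proportion_z(120, 1000, 80, 1000)` would compare a top-left variant tapped 120 times in 1,000 views with a top-right variant tapped 80 times in 1,000 views. A small p-value suggests the difference is unlikely to be down to chance alone.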

Many companies have now reached the stage where they are using machine data for operational visibility and real-time insight, Davies said.
