Big data and IoT: what's next?
The industry responds to Computing's findings on IoT pilots, data veracity and NLP
A number of interesting findings arose from this year's Computing research into how big data and the IoT are being adopted, including worries about the veracity of data, areas of progress with IoT projects and the growing importance of natural language processing. We asked a couple of industry figures to comment.
The volume of data that firms need to process - or at least feel they ought to - was still the most prominent of the four "Vs" of concern to our survey respondents, despite the advent of cheap cloud-based storage and real-time in-memory processing. Martin James, regional vice president Northern Europe at big data firm DataStax, believes this is because of the number of new devices coming on stream, as well as paralysis caused by not knowing the best way to proceed.
"I think that the volume of data created by IoT devices is a much more serious issue for enterprises today," he said.
"Pilot projects have started to deliver value and enterprises want to scale them up, and the actual real-world potential is more apparent than it was previously. There's also a recognition that the sheer amount of information can become overwhelming if you are not careful, and that this can make it difficult to achieve the results that you are after."
Computing's research found that more than half of those polled were actively preparing for IoT projects of one sort or another, but that most of these projects were at an operational rather than an analytical level. However, in certain sectors there is more of an imperative to take it further and faster, argued Southard Jones, vice president product strategy at cloud analytics firm Birst, who offered the example of supply chain companies viewed through the lens of Brexit.
"The economic landscape will be changing dramatically with the Brexit decision and all the planning that companies will have to do across their supply chains. CIPS research shows that there are already many substitutions taking place where European companies are looking to replace UK suppliers with local replacements," he said.
Jones insists that rather than joining a race to the bottom based on price, a focus on sensor-driven analytics could give British suppliers a way to add value.
"For companies that want to avoid replacement, providing IoT analytics could be a significant differentiator and enable them to avoid competing on price alone, particularly if trade tariffs increase. IoT and analytics on data will be essential to enabling existing customers to get more value from their assets, particularly if they are mobile."
Concerns over the veracity of data, particularly that emanating from IoT devices, were second only to those about its volume. The problem of inconsistent quality of data flowing from devices can be exacerbated by variability in the way it is processed and interpreted, said Birst's Jones.
"Alongside getting accurate data in the first place, it is important that everyone uses consistent metrics and attributes," he said.
"Getting everyone on to the same data definitions - and essentially seeing things in the same way - can help prevent arguments about whose data is correct, and put the emphasis on what the data is telling you."
Agreeing on the common ground is essential if important decisions are to be made on data. Computing's research found that most organisations have a long way to go in this regard.
"One of the biggest hurdles around getting data used for decision-making is how many people can make use of it," Jones said.
"Not everyone is a data scientist capable of programming Hadoop queries. Instead, getting data prepared and structured automatically can save time and get more people actually using the data."
Jones continued: "Machine learning has a role to play here. Getting your systems trained on how to bring in data and give it the right meaning can help people start using data to support their thinking or challenge their assumptions."
The next step might be to allow machine learning to start making decisions about how to move forward, or to suggest additional datasets in order to increase the chance of success, said Jones, adding: "However, this depends on getting all of the right data sets together and networking them to avoid silos of analysis."
IoT drivers
The fact that most IoT projects are still at the operational rather than the analytical stage is focusing attention on the present rather than the future.
"It means that there will be a more material impact from decisions taken around how systems are designed, how data models are constructed and how the results are used," DataStax's James said.
"IoT projects normally start around opportunities to automate processes or gather data more efficiently. This helps companies see data, use it to reduce costs, and therefore provides a bottom line benefit through improving profitability."
The next step is to use this data to think bigger, James went on, using analytics to look for new opportunities. "This is more of a top line benefit in that it is about growing the business."
The differences between these goals have important implications for IT teams, he went on. While larger and more complex, these top-line benefits are where CIOs ought to be focused. "CIOs have the best opportunities to have a huge impact on the future of business when it comes to data."
Scaling ambitions
Most operations do not scale linearly. Double the number of data sources, and the complexity of processing and analysing that data might increase by four times, eight times or maybe even more. The increasing importance of real-time data just adds to the scaling problem, James said.
"With so much value tied up in data, keeping systems running is seen as a key element in new services. Designing for fully distributed, always-on IT infrastructure becomes more of a challenge."
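As a rough illustration of the non-linear scaling described above, the short Python sketch below works through the arithmetic. It assumes, purely for illustration, that cost grows as a power of the number of data sources; the function names and the exponents are hypothetical and do not come from Computing's research or from DataStax.

```python
def relative_cost(scale_factor, exponent):
    """Cost multiplier when the number of sources grows by scale_factor,
    assuming cost is proportional to sources ** exponent (an illustrative
    assumption, not a measured figure)."""
    return scale_factor ** exponent


def pairwise_links(sources):
    """Number of distinct pairs of sources that might need to be
    reconciled against each other: n * (n - 1) / 2."""
    return sources * (sources - 1) // 2


# Doubling the number of sources under three assumed growth models.
for name, exponent in [("linear", 1), ("quadratic", 2), ("cubic", 3)]:
    print(f"{name}: cost multiplies by {relative_cost(2, exponent)}")

# Pairwise comparisons more than quadruple when sources double from 10 to 20.
print(pairwise_links(10), pairwise_links(20))
```

Under these assumptions, doubling the sources multiplies the cost by four in the quadratic case and by eight in the cubic case, which matches the figures quoted above. Real workloads will follow their own curve.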
A lack of skilled individuals was also thought to be holding many projects back. Birst's Jones sees one solution as outsourcing some aspects to third parties - a sort of hybrid set-up - creating space for employees to get trained in newer technologies and skills.
"The hybrid model around cloud can mean getting support for day-to-day tasks while internal staff can use their time on how applications will support the business. What's important here is the focus on what really makes a difference around data - internal staff should know your business and your objectives better than external providers, so help them concentrate on those requirements."
Less type, more talk
Of all the data-driven developments under way, it was natural language processing (NLP) that really caught our research respondents' eye. With this technology's rising profile in the form of Amazon Alexa, Google Home and commercial chatbots, many believe that voice will soon be the default interface between man and machine.
"I think NLP will see much more growth in business environments - this helps customers interact with businesses on their own terms," said James.
The real-time data flows and analytics capabilities required by NLP place a lot of pressure on back end systems.
"Getting data management in place to cope with all this information will be important, as NLP performance depends on getting the right data from services and human interaction to improve," James added.
"Utilising elements like machine learning or language libraries based on open source components can help speed up deployment too.
"What matters here is how companies transition from trial or pilot phase into full production environments. This is where support and experience can come into their own, both for internal staff and for the service providers involved."