Rise of the machines

Amazon and Microsoft are fast-refining their cloud-based machine learning services, and attracting the attention of companies large and small

Watchfinder.co.uk might only be an SME, by most definitions, but some time in the near future, the fast-growing buyer and seller of luxury watches might be generating quotes not via its in-house panel of experts, but by using a machine-learning application running in the cloud.

"I like the approach that the cloud operators are taking, which is that machine learning just becomes a 'lego brick' in your entire big data strategy," says Watchfinder IT director Jonathan Gill.

Since Microsoft introduced machine learning into its Azure cloud platform last year, Gill has been experimenting with machine learning to see how it could be used at the company.

"It enables us to dump everything into Hadoop, and do all the historical analysis and then, literally, you pull in the machine learning app and just experiment with it. Once you get something you're happy with, you can click a button, create a web service and then bring it into your main data pipeline - it's the whole Lego brick aspect of it," he says.

For Watchfinder, which hosts much of its IT infrastructure in the Amazon cloud, the ease with which machine learning can be bought and integrated when it is running in the cloud - rather than as a packaged application - has finally brought the kind of predictive analytics that only major organisations used to be able to afford into the price range of ordinary businesses.

What appeals to Gill about Microsoft's machine learning cloud service is the speed with which it has iterated and improved the service in just the past year, combined with the marketplace it has opened up for third-party algorithms and other tools that can be plugged into the core service.

"The speed with which Microsoft is iterating services on Azure now is ridiculous... But the really cool thing about it is that they have opened up the machine-learning marketplace, so you've got, for example, the Bing team contributing recommendation algorithms for search.

"We are 'dumping' our view data, give it a product and it will recommend items based on what people are viewing - we don't need to do the hard work to create the algorithm because the guys at Bing have already done a pretty good one," says Gill.

Market strategy

Mike Gualtieri, a principal analyst at Forrester Research, describes the marketplace for algorithms and other pre-prepared machine learning tools on both Amazon and Microsoft as "nascent". Indeed, he adds, the algorithm itself is normally just a small part of the overall package required for organisations to build machine learning into their applications.

"There's two different marketplaces: there's a marketplace for built models - to predict customer churn or for a recommendation engine. Those models are built based on the data, so you may need to do some data wrangling to make them work," he says.

He adds: "Is there a marketplace for algorithms out there that data scientists can use? Yes and no. The advantage is that there's so many algorithms already and many of them, if you don't have a predictive model, you can probably create one from those that exist."

The key, though, is building a predictive model that is not only tailored to the needs of the organisation, but which improves accuracy - and all that involves work.

Particularly popular in the Azure Data Market, says John Bronskill, partner architect at Microsoft Research in Cambridge, are recommendation engines. "We have ready-made APIs that any online store can feed in purchase or intent-to-purchase history. Based on what users have clicked on, or rated highly, we can recommend other items that they will either need as accessories or, perhaps, something new that will appeal to them based on what they have purchased in the past," says Bronskill.

Other items in the Azure Data Market include Customer Churn Prediction, Text Analytics, Frequently Bought Together, and Anomaly Detection. However, the marketplace currently offers just 41 items under "machine learning", backing up Gualtieri's judgement.
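
To make the idea behind an item such as Frequently Bought Together concrete, here is a minimal sketch in Python. It is not Microsoft's algorithm, which is not described here; it simply counts how often items share a basket and recommends the most frequent companions. The function names and the sample baskets are hypothetical, invented for illustration.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(baskets):
    """Count how often each pair of items appears in the same basket."""
    pair_counts = {}
    for basket in baskets:
        # De-duplicate and sort so each pair is visited once per basket.
        for a, b in combinations(sorted(set(basket)), 2):
            pair_counts.setdefault(a, Counter())[b] += 1
            pair_counts.setdefault(b, Counter())[a] += 1
    return pair_counts


def recommend(pair_counts, item, top_n=2):
    """Return up to top_n items most often bought alongside `item`."""
    companions = pair_counts.get(item, Counter())
    # Highest count first; ties are broken alphabetically.
    ranked = sorted(companions.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical purchase history: one list of items per order.
baskets = [
    ["watch", "strap", "winder"],
    ["watch", "strap"],
    ["watch", "polish"],
    ["strap", "polish"],
]
counts = build_cooccurrence(baskets)
```

With this sample data, `recommend(counts, "watch")` ranks the strap first because it shares two baskets with the watch. A production service would also have to weigh popularity, recency and ratings, which this sketch ignores.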

Advantage Microsoft

For Gualtieri, Gill's preference for Microsoft Azure for machine learning makes sense. "I'd say that Microsoft's is a much more sophisticated tool, at this stage, than Amazon's," says Gualtieri.

The main reason for this, he adds, is Microsoft's early decision to base its cloud machine learning technology on R, a programming language and software environment originally developed to aid statistical computing, written primarily in C, Fortran, and R.

"It has a very visual tool for creating arbitrary analytical workloads. Behind many of those operations is an open-source programming language called R. In contrast, Amazon has some light data preparation tools, but only one class of linear modelling algorithms, so it's much more limited," says Gualtieri.

He continues: "With the Microsoft solution, because R is behind it, you can have a lot more text analytics - there's hundreds more possibilities.

"The advantage of Amazon is that you don't need to know what you are doing because it's limited to one class of algorithms. So you prepare your data set, you click the target - the thing you want to predict - and it will try to build that model automatically. They are both smart tools, targeting developers.

"But the Microsoft tool could be used by data scientists... Amazon released its tool early, but Amazon always releases its services early. That's its strategy: release early, it doesn't have to be the most sophisticated tool, but then Amazon will continuously update it, especially as they get feedback from customers," he says.

However, while Microsoft's choice of R to power its machine learning cloud service has helped lend it extra sophistication, Gualtieri warns that it may also have implications in terms of scalability, given the heritage of R.

"Just because it's in the cloud doesn't mean that it's highly scalable. One of the downsides of picking R is that it was developed more than 15 years ago. It was designed to run on a single-core, desktop workstation of a scientist, so it's not inherently scalable. Microsoft bought Revolution Analytics [in April 2015], which was started up to try and make R more scalable. What they have done is parallelise many of the algorithms that are in R," says Gualtieri.

However, it's still early days for Microsoft's Revolution Analytics acquisition and it remains to be seen how it helps Microsoft to develop and improve the scalability of its machine learning capabilities.

With Amazon, meanwhile, there is something of a lock-in to the Amazon Web Services (AWS) architecture with the requirement that data sets ought to originate in S3 or Redshift, the company's in-house cloud database services. "The way they have their pricing model is that it depends upon the size of the data. And by using those formulas, they can inherently scale that. For that single class of linear modelling algorithms that they have, I would say that it's probably very scalable," says Gualtieri.

However, Ian Massingham, a technical evangelist at Amazon Web Services, claims that it isn't mandatory for users of Amazon's machine learning tools to host their data on AWS databases. "You can provide the data in CSV format to place that data inside S3, and then train your model and delete your data. It doesn't need to be stored in AWS, but you will need to stage it temporarily in S3 in order to load it into the machine learning," says Massingham.

Amazon, he adds, offers tools for both batch and real-time machine learning. "You wouldn't generate the model in real-time because that's something that happens in the batch. We offer predictions through two interfaces: batch-based predictions and real-time predictions. For real-time predictions you will call the machine learning API, pass it the data you want to make the prediction on and then the API will give you the prediction back in real time," he says.

"The recommendation engine on the Amazon website is the perfect example of that. But the batch style is still very useful. Say you're going to send out a monthly email to your customers or users. You would use a machine learning batch-style model to determine which users receive what topics in their email - look at what they might be interested in from previous articles they've read on your web site, if you're a media company, for example," he adds.

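The split Massingham describes - train in batch, then predict either in bulk or one record at a time - can be sketched in a few lines of Python. This is an illustration only, not the Amazon API: the function names, the sample data and the very crude threshold "model" are all assumptions made for the example.

```python
from statistics import mean


def train(history):
    """Batch step: learn a threshold from (articles_read, opened_email) pairs.

    The threshold is the midpoint between the average articles read by
    users who opened past emails and by those who did not.
    """
    opened = [n for n, did_open in history if did_open]
    ignored = [n for n, did_open in history if not did_open]
    return {"threshold": (mean(opened) + mean(ignored)) / 2}


def predict_realtime(model, articles_read):
    """Real-time interface: one record in, one prediction straight back."""
    return articles_read > model["threshold"]


def predict_batch(model, all_articles_read):
    """Batch interface: score a whole list of users in one pass."""
    return [predict_realtime(model, n) for n in all_articles_read]


# Hypothetical history: (articles read last month, opened the email?)
history = [(1, False), (2, False), (8, True), (10, True)]
model = train(history)  # built once, offline

single = predict_realtime(model, 7)        # e.g. while a user is on the site
monthly = predict_batch(model, [0, 6, 3])  # e.g. before a monthly mailing
```

The model is built once from historical data; both interfaces then reuse it unchanged, which is why the training step never needs to happen in real time.
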
Package versus cloud

Of course, machine learning - or predictive analytics - tools are not new. Indeed, the cloud services offered by both Microsoft and Amazon are very much playing catch-up to packaged software offered by, for example, IBM with Watson Analytics.

However, machine learning in the cloud, as Gill points out, has very much lowered both the price and technology barriers, enabling many more organisations to try it out. "The first great argument for doing it in the cloud is that that's where your application data increasingly lives. The other thing that makes sense is particularly for developers, this is a hot area, a hot skill, and in both of those cloud environments, it's very easy to try machine learning, even on smaller data sets," says Gualtieri.

"Also, both Amazon and Microsoft have a way to publish those models, to expose and monetise those models as APIs. They both make it dead simple to do that. Microsoft's is a little superior to Amazon because with Amazon you can only expose that predictive model, whereas Microsoft can expose the whole analytical workflow, including data pipeline, data transformation and include some arbitrary complexity to it."

Packaged software, in contrast, requires a huge upfront commitment in terms of hardware and software licensing fees, which has confined machine learning, until now, very much to the data centres of major organisations.

Two words: data science

The next big test for many organisations, therefore, is one of skills. With machine learning taking "data science" mainstream, demand is growing for data scientists - people skilled not just in the underlying languages, algorithms and architectures of data science, but who are also able to relate it to the needs of the business.

It isn't, warns Gualtieri, as simple as Amazon and Microsoft may make it seem: "That is the problem: just because you have a data set doesn't mean you can run machine learning and it will come up with an answer - it often doesn't. Knowledge of what data might actually work is really important and refining these models is quite an iterative process.

"That iterative process has two dimensions to it. One is you get a data set, run an algorithm and it doesn't work, you try another, and another... so you think I need more data. So then you have to hypothesise what data you have that you can add that will help. That can be an art in itself."

For a company the size of Watchfinder, though, the data science element might be provided, instead, by contractors or consultants. Because most of the elements are quite standardised, the contractor just needs to have an understanding of the business and what it is that it is trying to achieve, and the data scientist can (hopefully) hone the algorithms and data models that the organisation needs to use.

"[You have to] give them standard end-points to pull the data out of, and you have to tell them what you want them to look for. They also have to come in and get to understand your business... But once you get through the initial 'requirements capture', like any other IT project, you can just say: 'Here's the standard tools, go nuts'," says Gill.

Around 12 months ago, he adds, "they were quite hard to come by", but the allure of a high-paying, secure career has attracted a growing number of IT pros who, at least on the surface, claim proficiency in data science. "They are popping up all over the place now," he says. The challenge now, adds Gill, is not finding data scientists, but finding good ones.

Bronskill at Microsoft, though, is adamant that the tools in the Azure Data Market don't require any data science knowledge at all - just a little IT expertise to connect the data sources to the APIs.

"There's two main levels of use of Azure Machine Learning," he says. "The easiest to use are the machine learning marketplace APIs. These require zero expertise of machine learning.

"All you need to do is connect your data to these APIs. For instance, the recommendations-engine API. You would need customer data based on, say, their previous purchase history or their previous rating history of various products. You feed that into the service and then when the customer visits the site again, he or she can get customised recommendations.

"You don't have to have any knowledge of machine learning at all. Obviously, you'll need an IT person knowledgeable about how you store customer data and then feed that into our service. But that's an IT skill, not a data science skill. The marketplace requires no expertise," he says.

However, putting together bespoke machine learning solutions will require data science expertise, he adds.

"Azure Machine Learning can ingest data from almost any source... It can take data that already exists on Azure, in particular, but you can also hook it up to external databases. So you bring in your data and there's a tool called Machine Learning Studio, which is geared for use by a data scientist, who will have to have some statistics and machine learning training - typically a degree in that area," says Bronskill.

However, it will also offer tools enabling users to clean up the data, and run modules covering such machine learning techniques as regression learning, classification or clustering.

Over at Amazon, meanwhile, Massingham points out that the company has been doing machine learning for almost two decades now, and therefore has a lot of in-house knowledge built up since then.

"We've been doing this inside Amazon for a long time. The first service we had was a service called 'eyes', which was introduced on the Amazon.com website in the US very shortly after launch, when it was still just a bookseller.

"It used predictive technology to make suggestions about other books customers might be interested in and proactively notified them about new books due for release that might be of interest to them - so we've been doing this for a long, long time," says Massingham.

That, of course, is a long way from the kind of automated pricing that Gill at Watchfinder is examining, enabled by the kind of pre-packaged components that companies like Amazon and Microsoft are providing in the cloud with much lower barriers to entry than packaged software provides.

What is machine learning?

Machine learning is probably better known as predictive analytics. While it's been available as (expensive) packaged software of increasing sophistication for some time, only now is it becoming more widely available due to the machine learning cloud services that Microsoft and Amazon in particular are providing.

This is how Ian Massingham, a technical evangelist at Amazon Web Services, defines machine learning: "Machine learning automatically finds patterns in existing data and uses those patterns to make accurate predictions on new data provided to the system.

"The way it works is that an algorithm is used to assess your data, which will result in the creation of a mathematical model, which models the patterns present in that data. You can then provide new data sets that might have certain attributes missing and ask the machine learning system to make a prediction of what the missing attributes are going to be using probability theory to do this.

"And that prediction will be based on the patterns that have been observed inside the pre-existing dataset. This process of creating the model is often referred to as training the model, where you will provide complete example records and then ask the system to make predictions."

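As a rough illustration of the train-then-predict cycle Massingham describes, the Python sketch below fits a straight line to complete example records by ordinary least squares, then uses it to fill in a missing attribute on a new record. Least squares is chosen here for brevity and is an assumption of this example, not a description of either vendor's service; the names and figures are hypothetical.

```python
def train(records):
    """Fit y = slope * x + intercept to complete (x, y) records."""
    xs = [x for x, _ in records]
    ys = [y for _, y in records]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Ordinary least squares for a single input attribute.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in records)
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Estimate the missing attribute y for a new record with known x."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical complete records: both attributes are known.
complete_records = [(1, 3), (2, 5), (3, 7), (4, 9)]
model = train(complete_records)

# A new record arrives with x known and y missing.
predicted = predict(model, 5)
```

The pattern found in the existing data (here, a straight line) is the model; the prediction for the new record is simply that pattern applied to the attributes that are present.
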
Probably one of the most widely used deployments of machine learning today is in recommendation engines, which provide recommendations to online shoppers based on their known spending habits with the commerce website they're perusing.

It isn't just online that such recommendation engines are used: users of Tesco's Clubcard and other loyalty schemes are also targets for offers based on their past purchases - although frequently operators of such schemes will throw in completely wrong offers in order to reassure customers that they're not being spied upon...