Digging in the data mine

13 Nov 1997

The key to effective data analysis is making the right information available to the right people at the right time. Nick Burrell, IBM Software's DB2 brand manager, says: 'With the emergence of the Internet, intranet, and extranet, the ability to make information available to a much larger audience is very appealing to the majority of users, who are quite happy to accept standard reports or graphs, provided they're showing something that is relevant to their role.'

But you still have to make sure that users are happy with the data you're providing them with. According to a recent study carried out for Business Objects ('Decision Support 1997') by Aspect International Consulting, 88% of management admits to using 'gut feel' rather than hard fact up to 75% of the time when making business decisions; and 51% of sales and marketing managers also tend to rely on gut feel for at least half the time. Just under two thirds of management staff admit that they don't receive the right amount of information to make a decision, yet 99% have access to a desktop PC.

'Despite widespread use of computers, managers are largely reliant on others for information, leading to information that is either too much, too little or too late to support decision making,' notes Charles Nicholls, Business Objects' UK marketing director.

Collecting operational data and making it available to business users in a form that allows them to analyse it easily is akin to unearthing the Holy Grail. Not surprisingly, many companies have run into problems through poor planning and unclear expectations.

In recent years, enterprise data has been widely scattered across different hardware platforms, databases, operating systems and business applications. At the same time, the accelerating pace of business has made information a key corporate asset.

'In today's competitive economy, an organisation's success or failure can rest largely on how quickly and readily enterprise data can be transformed into meaningful information for analysis and sound decision making,' says Sterling Software's Sarah Ford.

With the advent of ever more powerful personal computers and intuitive graphical user interfaces, this information has moved a little closer to users. 'Instead of relying on IT as they once did, users are becoming more technically savvy and self sufficient. Increasingly, they are querying databases and creating reports themselves,' says Ford.

Business databases and data warehouses hide strategically valuable knowledge which users cannot understand or analyse with traditional ad hoc query and reporting tools. There are many very capable data-analysis tools on the market designed to help alleviate this problem. However, what is so often overlooked before selecting an analysis tool is the importance of the underlying data management infrastructure in the warehouse.

'For analysis to be truly meaningful, the user must be able to gain an understanding of the business information in the context of its implementation in the warehouse: for example, how the data arrived in the data warehouse and where it came from,' says Sharon Ford, international marketing manager at Prism Solutions.

'Evidence from our customer base strongly suggests that while data analysis tools offer users access to warehouse information, the increasingly complex data responses that they are demanding are making the need for an overall technical and business view of the warehouse ever more critical.'

Simon Tilley, VMark Software's data warehousing marketing manager, agrees. 'It's an obvious, but often neglected, fact that data analysis tools are only as good as the data on which they feed,' he says. Most data analysis involves some pre-analysis stage, which might range from basic extraction routines to complex cleansing and data transformation. Until quite recently, the tools to do this have not been available outside mainframe environments.

'Many organisations are opting for a series of packaged software solutions, picking best-of-breed software for vertical functions, and then becoming aware of the hidden data-integration issues this brings up. Even a simple move to new hardware platforms, where data needs to be migrated, can complicate data analysis.'

Data mining is gaining credence as an appropriate technique for organisations that collect huge quantities of data to detect hidden trends and associations in it. The technology available today allows data-mining analysis to be undertaken fairly readily.

'The key to its effectiveness is to use the appropriate data-mining techniques and to understand the data sufficiently in the first place to make the correct interpretations of the results,' says Burrell.

Data-mining applications need very clean data, the source for which should be the data warehouse. This implies that a data warehouse is a necessary prerequisite for a data mining project.

'The data warehouse provides all the infrastructure to get data from the operational systems into a state ready for the data miner,' says SAS Institute's data warehousing product manager Richard Neale. He estimates that some 70% to 80% of the effort of a data-mining project is spent getting the data cleaned and into a format ready for data mining.

'Imagine the productivity gains to be made by using a data warehouse to reduce this overhead,' he adds.

Clearly, with such a close relationship between the data warehouse and the data-mining tool, very good integration between the two is mandatory.

Data mining is not an exact science, but an iterative process. The data miner needs to modify and re-shape the data and try a number of different data mining techniques.

Geographic information systems are similarly being looked on as effective data-analysis systems nowadays, producing information the mainstream business can genuinely use.

'From our experience, the data miner will need a "playpen" where they can try things out. This will need a lot of disk space, possibly 10 to 20 times the size of the original data,' says Neale. 'There is a natural relationship between data mining and other business intelligence-type applications. The results of data mining may provide insights into the data that populates the warehouse. In turn, these insights may drive new business intelligence applications, or even a redesign of the warehouse.'
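Neale's rule of thumb for the playpen can be turned into a rough sizing calculation. The sketch below is illustrative only: the function name and the 5GB source extract are assumptions made for the example, and the 10x and 20x multipliers are simply the range quoted above, not a vendor formula.

```python
def playpen_disk_range_gb(source_gb, low_factor=10, high_factor=20):
    """Return the (low, high) disk estimate in GB for a data-mining playpen.

    The default factors come from the '10 to 20 times the size of the
    original data' estimate quoted in the article; they are a rule of
    thumb, not a measured requirement.
    """
    if source_gb < 0:
        raise ValueError("source size cannot be negative")
    return source_gb * low_factor, source_gb * high_factor


# Hypothetical example: a 5GB extract taken from the warehouse.
low, high = playpen_disk_range_gb(5)
print(f"Plan for roughly {low}GB to {high}GB of working disk space")
```

The point of the range is budgeting: the working area, not the source extract, is likely to dominate the storage bill for a mining project.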

Who are the users? Tools for the job

Data analysis requirements vary enormously. 'Some of our customers may have a small group of users who manage and analyse the data so that they can distribute the tailor-made information around the enterprise,' says Derek Taylor, vice-president of marketing and sales at Seagate Software. 'Others may want to enable a large number of users to analyse a smaller table of data on an ad hoc basis.'

The user often has a need for both forms of data analysis. 'An Olap tool would be used by a central department, while data analysis may be available more widely across the enterprise,' adds Taylor.

Although basic Olap gives some simple data analysis - such as calculations, variance reporting and traffic lighting - the data-mining industry is founded on the idea that there must be something more that can be done. Today a lot of money is being spent on techniques such as cluster analysis, neural networking and rule induction. 'At the moment, much of this is still in the early stages of acceptance, but as it gains credibility, it is likely that both relational and Olap tools will encompass it,' says Taylor.
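For readers unfamiliar with the terms, variance reporting and traffic lighting amount to comparing actual figures with a plan and colour-coding the gap. The sketch below is a minimal illustration, not the behaviour of any product named here: the class and function names, the sample figures and the 5% and 10% thresholds are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ReportLine:
    name: str
    budget: float
    actual: float


def variance_pct(line):
    """Percentage by which actual differs from budget (negative = shortfall)."""
    return (line.actual - line.budget) * 100.0 / line.budget


def traffic_light(pct, amber_at=5.0, red_at=10.0):
    """Colour-code a variance: the thresholds are illustrative assumptions."""
    shortfall = -pct  # only shortfalls against budget raise a flag
    if shortfall >= red_at:
        return "RED"
    if shortfall >= amber_at:
        return "AMBER"
    return "GREEN"


# Hypothetical sales figures for three regions.
lines = [
    ReportLine("North", budget=100.0, actual=88.0),
    ReportLine("South", budget=200.0, actual=188.0),
    ReportLine("West", budget=150.0, actual=156.0),
]

for line in lines:
    pct = variance_pct(line)
    print(f"{line.name:<6} {pct:+6.1f}%  {traffic_light(pct)}")
```

This is the kind of fixed, rule-based analysis that basic Olap tools cover; the data-mining techniques mentioned above aim to find patterns that nobody wrote a rule for in advance.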

According to SAS Institute's Richard Neale, data-mining tools should be judged on their return on investment and ease of use and deployment, rather than on the purity of technology and models or the accuracy of their predictions.

Scottish Homes: It's a SmallWorld

Established in 1989, Scottish Homes is a government-funded agency set up to improve the quality of housing in Scotland. It creates and monitors housing associations, working with the private sector to encourage the development of housing projects and acting as a public sector landlord responsible for about 30,000 properties.

Scottish Homes has a responsibility both to private investors and to the government. Therefore, it needs to undertake complex analyses to decide where best to spend its money.

Aware that every aspect of housing has a geographical element, Scottish Homes decided to buy SmallWorld's GIS technology. 'All the analysis we undertake has to have a geographical reference - we need to know where a property is,' explains John Taylor, Scottish Homes' IT business manager. 'We did not need a desktop mapping system, but software that would allow us to undertake complex data analysis.'

Since installation in 1992, Scottish Homes has carried out local housing systems analysis (LHSA) to produce an assessment of needs and demands.

'LHSA is fundamental to Scottish Homes' operation,' says Taylor. 'We can understand the housing market and assess information such as owner occupation, travel to work habits and migration patterns.'
