With large volumes of information created daily, global internet retailer Amazon.com needs infrastructure and systems capable of storing and analysing extremely large volumes of data.
Given the company’s growth and increasing analytical sophistication, Amazon anticipated that inherent limitations in its deep-analysis clickstream database system could prevent it from scaling this part of its information warehouse environment.
It decided to replace its Oracle clickstream database with a Netezza Performance Server (NPS) system that now houses more than 25 terabytes of clickstream and other transactional data.
The result is significantly reduced analysis time, allowing routine examination to be performed against 15 months’ worth of data rather than weekly data sets. More than 25 hours of processing has been reduced to a single hour, while working with five times the original volume of data.
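Taken together, the two figures imply a much larger gain than the drop in elapsed time alone suggests. The short Python sketch below works through the arithmetic. It assumes the old and new jobs are otherwise comparable, which the article does not state, and the function name is purely illustrative. Because the original run took "more than" 25 hours, the result is a lower bound.

```python
def throughput_gain(old_hours, new_hours, volume_multiplier):
    """Return how many times more data is processed per hour.

    Assumes the old and new workloads are comparable apart from
    their data volume (an assumption, not stated in the article).
    """
    return volume_multiplier * old_hours / new_hours


# Figures quoted in the article: 25+ hours down to 1 hour, on 5x the data.
gain = throughput_gain(old_hours=25, new_hours=1, volume_multiplier=5)
print(f"Effective throughput gain: at least {gain:.0f}x")
```

On these assumptions, the new system processes at least 125 times as much data per hour as the one it replaced.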
Improved analytical capabilities have already had a significant impact on the business. ‘Our analysts were concerned that a feature on the web site was not functioning properly,’ says Diane Lye, director of data mining and BI content. ‘To verify the hypothesis, we needed to analyse the clickstream data spanning several weeks. Previously it might have been considered too costly even to attempt.’
The performance and speed advantages make it easier to perform new analyses that may have been too cumbersome using the previous system. Amazon can now capture even finer detail associated with each page view – including content and placement details – and is analysing more closely how different web treatments of a single page affect customer behaviour. The analysis has the potential to provide new insights that will allow the business to create a more effective web experience for its customers.
The new data warehouse appliance has also eliminated time-consuming administrative and maintenance tasks. Jeff Parker, data warehouse infrastructure manager, says that only a small portion of one database administrator’s time is needed for the NPS system, compared with the four full-time administrators needed to maintain Amazon’s other data warehouse systems.
‘Administrative workload is minimal with the NPS system,’ he says. ‘On the core data warehouse we find ourselves spending time extending table spaces, creating partitions and rebuilding indexes. With the NPS system we just create the table and it manages the rest for us. We have our largest dataset on the NPS system, and it runs so smoothly, you almost forget that it is there.’






