At EMC World 2011 yesterday, EMC said it plans to start using open source technology in its business intelligence offering.
Greenplum, a data analytics and warehousing company that EMC acquired in July 2010, is set to use Apache Hadoop open source software to provide a co-processing appliance that will, according to EMC, allow enterprises to seamlessly analyse "big data".
Hadoop is an open source technology used mainly by big internet companies, such as Facebook and Yahoo. It provides a framework to support data-intensive distributed applications and is used for analysing and storing massive amounts of data.
"It is estimated that enterprise data growth will increase by 650 per cent over the next five years," said Scott Yara, co-founder of Greenplum and vice president of products, EMC Data Computing Division.
"The key aspect is that a huge amount of that data growth is going to be around unstructured data – approximately 80 per cent of it," he added.
"What we are seeing in the unstructured data world is that the Hadoop platform is becoming an important solution to solving these unstructured data processing problems."
EMC Greenplum HD will be available in Community and Enterprise editions. The Community edition will be 100 per cent open source certified, supporting the HDFS, MapReduce, ZooKeeper, Hive and HBase platforms.
The Enterprise version, meanwhile, will be 100 per cent interface-compatible with the Apache Hadoop stack.
This compatibility is set to allow large organisations to carry out data management as well as load and access data using a native network file system interface, the firm said.
The Greenplum HD Community and Enterprise editions and the EMC Greenplum HD Data Computing Appliance are expected to be available in the third quarter of 2011. Pricing is yet to be confirmed.