MicroStrategy World 2011: Facebook and eBay team up to crack 'big data' problems

Facebook CIO Tim Campos wants the two firms to get their BI teams together to discuss mutual challenges

Executives from Facebook and eBay plan to meet to discuss mutual challenges posed by big data, Computing has learned at MicroStrategy World 2011.

Following a presentation by Kiril Evtimov, senior manager of virtualisation and self service platform at eBay, Facebook CIO Tim Campos approached the auction site executive to discuss the subject of big data.

The Facebook CIO told Evtimov that the social networking site could relate to eBay's big data challenges with regard to being able to "scale up" sufficiently and ensure that the "data pipes" were large enough to accommodate the data.

At the end of their chat, Campos exchanged business cards with Evtimov and suggested that the two internet giants' business intelligence teams should meet to discuss common problems around big data.

Quocirca analyst Clive Longbottom suggested that Facebook had the most to gain from the meeting.

"Facebook has the bigger problem, due to the fact that its data is a mix of text, graphics and other things like geospatial data. This is difficult to deal with at an analytical level, as you cannot directly analyse two sentences in the same way," said Longbottom.

"EBay has the same issues when it comes to comments, questions and feedback, but most of its data is relatively standard, based on rows and columns," he added.

"Taking relatively standard approaches for eBay should work, for example, Hadoop or Oracle Exadata. For Facebook, something more like IBM Watson [an artificial intelligence computer system] may be required, as this would allow inspection and interrogation of data ad hoc across multiple data types and sources and the analysis of it against different sets of criteria."

Longbottom argued that pipe bandwidth would only be a problem for either company if the analytics architecture were wrong.

"A query should be created centrally and the workload farmed out to the data sources required. The main work should be done at LAN speed at the data source, and then partial results should be sent back to the central point where a final analysis would be carried out," he said.

"As a result, there should only be a problem if the partial analysis is too weak, and consequently the partial results are large data volumes."
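The pattern Longbottom describes can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not a description of either company's systems: the function names `partial_aggregate` and `merge_partials`, the two in-memory "sources" and the sample sale prices are all hypothetical. Each source reduces its own rows to a small per-category sum and count, and only those partial results travel to the central point, which combines them into final averages.

```python
from collections import defaultdict


def partial_aggregate(rows):
    """Runs at a data source: reduce local (category, price) rows
    to a (sum, count) pair per category."""
    partial = defaultdict(lambda: [0.0, 0])
    for category, price in rows:
        partial[category][0] += price
        partial[category][1] += 1
    return {category: tuple(acc) for category, acc in partial.items()}


def merge_partials(partials):
    """Runs at the central point: combine the partial results
    from every source into a final average per category."""
    totals = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for category, (subtotal, count) in partial.items():
            totals[category][0] += subtotal
            totals[category][1] += count
    return {category: s / n for category, (s, n) in totals.items()}


# Hypothetical rows held at two separate data sources.
source_a = [("books", 10.0), ("books", 20.0), ("toys", 5.0)]
source_b = [("books", 30.0), ("toys", 15.0)]

# The same query is farmed out to each source; only the small
# partial results are sent back for the final analysis.
partials = [partial_aggregate(rows) for rows in (source_a, source_b)]
averages = merge_partials(partials)
print(averages)
```

In this sketch the volume sent back from each source depends on the number of categories, not the number of rows. If a source returned its raw rows instead, which corresponds to the "too weak" partial analysis Longbottom warns about, the data crossing the network would grow with the size of the source.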

In his presentation, Evtimov told delegates that 1998, three years after eBay was founded, was a "dark time" for analytics. The business was supported by analytics in the form of paper reports that were delivered using custom C++ code run against eBay's transactional databases. The auction site had 10 business intelligence users and the report turnaround was 15 days.

eBay now has a dual active system that is available 24 hours a day, seven days a week, 365 days a year. Evtimov said that the system is always online and handles more than a million queries a day, which equates to more than 100 petabytes of data being processed.

"At eBay, analytics is embedded in our lives. We measure everything. We live and breathe data," said Evtimov.

"You need to force the analytical mindset on your organisation, and ensure that you find a way to deliver business agility value, through analytics, from day one. Nobody will wait a year for you to build a system, and even if what you have isn't at its full potential, you need to draw a big picture so that people understand where you are going with BI," he advised.

"You also need to invest in technologies that scale. You need to think about your technology stack carefully and always think about scalability."