Does your business need data analytics or Big Data analytics?

Hadoop is powerful, but don't make a mountain out of a molehill when it comes to data

Data is being gathered everywhere on the web nowadays. For example, when you enter your personal information on a website to buy a product online, that website collects data from you in the form of your name, email, phone number and address.

If that website is popular enough to sell a product every minute, or even every second, then the data it collects (customers' personal information) is high in volume, velocity and variety. Such data is often termed Big Data.

Other examples are railway and airline booking systems, where tickets are booked online almost every second. As a result, these systems collect data digitally at a very fast pace and in enormous quantity; that is what makes 'Big Data' different from normal data. The difference is often summarised as the 'Five Vs': volume, velocity, variety, veracity and value.

Normal data is collected at a very slow pace over a long period of time, and so is easy to manage in different formats like spreadsheets, MySQL databases, etc. This is not usually the case with Big Data, though, as it is often terabytes in size and so difficult to handle and process using traditional applications/tools.

Hadoop is a widely used open-source framework for storing and processing Big Data. Vast amounts of raw data are stored in HDFS (the Hadoop Distributed File System), a core component of Hadoop, while aggregated/summarised data is often exported to a relational database such as MySQL for analysis.

But does your business really need Hadoop for data analytics?

Many businesses don't really 'need' Hadoop, unless they are actually dealing with Big Data. If your inflow of data is slow then a MySQL database can easily do the job. When you buy a web hosting package for your business website, whether shared hosting or a dedicated server, a MySQL database is usually included; you can typically access it using phpMyAdmin in your hosting control panel.

It is possible to hire a PHP programmer (PHP is a programming language) who can develop scripts to store your data in a MySQL database and then perform data analytics on it, as per your company's requirements. Data analytics is simply the process of examining, sorting and summarising data so that you can get some benefit out of it.

Suppose you are running an e-commerce website to sell your products online. Let us assume that you receive four to five orders every day, on average. Since you are receiving a very low number of daily orders, your inflow of data, in the form of customers' information, will also be at a slow pace, thus not requiring Hadoop; a simple MySQL database will do the job.

Now, if you have been collecting data at this pace for the past two years and you need to perform analytics on it, then it will still be data analytics and not Big Data analytics, as MySQL queries can still work on such a small dataset.
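
To put a rough number on "such small data", here is a back-of-the-envelope sketch in Python. The figure of five orders a day comes from the example above; the 1 KB stored per order is purely an assumption for illustration, and the function name `estimate_orders` is hypothetical.

```python
def estimate_orders(orders_per_day, years, bytes_per_order):
    """Estimate row count and raw storage for a steady stream of orders.

    bytes_per_order is an assumed average size per stored order,
    not a measured value.
    """
    rows = orders_per_day * 365 * years
    total_bytes = rows * bytes_per_order
    return rows, total_bytes


# Five orders a day for two years, assuming roughly 1 KB per order.
rows, total_bytes = estimate_orders(orders_per_day=5, years=2, bytes_per_order=1024)
print(f"{rows} rows, about {total_bytes / 1024**2:.1f} MiB")
```

Under these assumptions the table holds a few thousand rows and a few megabytes, which is many orders of magnitude short of the terabytes associated with Big Data.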

Hadoop is only worth considering when MySQL queries can no longer cope with the tremendous amount of data collected over the years (say five to 10 years), or when the inflow of data is very fast. In that case, you need to switch to Big Data analytics.

Suppose the query is to find the product that receives the maximum number of orders from a particular city. If you need to run this query on terabytes of data, MySQL may not be able to complete it in a reasonable time; in that case, you will need the help of a system built for Big Data analytics, such as Hadoop.
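
On a small dataset, that query is a few lines of SQL. The sketch below is only an illustration: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for MySQL, and the `orders` table, its `product` and `city` columns, and the sample rows are all hypothetical. The same SELECT statement should also work in MySQL, although MySQL client libraries generally use a different placeholder style than `?`.

```python
import sqlite3


def top_product_for_city(conn, city):
    """Return (product, order_count) for the most-ordered product in a city.

    Returns None if the city has no orders. Ties are broken alphabetically
    by product name so the result is deterministic.
    """
    return conn.execute(
        """
        SELECT product, COUNT(*) AS order_count
        FROM orders
        WHERE city = ?
        GROUP BY product
        ORDER BY order_count DESC, product ASC
        LIMIT 1
        """,
        (city,),
    ).fetchone()


# In-memory SQLite database standing in for a MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, product TEXT, city TEXT)"
)

# Made-up sample orders, for illustration only.
conn.executemany(
    "INSERT INTO orders (product, city) VALUES (?, ?)",
    [
        ("kettle", "Leeds"),
        ("toaster", "Leeds"),
        ("kettle", "Leeds"),
        ("toaster", "York"),
        ("kettle", "York"),
        ("toaster", "York"),
    ],
)

print(top_product_for_city(conn, "Leeds"))
```

With a few thousand rows, a query like this returns almost instantly on ordinary hardware; it is only when the table grows to a size that a single database server cannot scan or index comfortably that a distributed system becomes worth the extra complexity.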

If you are in the starting phase of your business, you should first consider a MySQL database for your data analytics needs. As you progress, and the time comes when MySQL is unable to handle your queries or inflow of data, you can decide to switch to a Big Data system.
