Does your business need data analytics or Big Data analytics?


Hadoop is powerful, but don't make a mountain out of a molehill when it comes to data

Data is gathered everywhere on the web nowadays. For example, when you enter your personal information on a website to buy a product online, that website collects data from you in the form of your name, email address, phone number and postal address.

If that website is popular enough to sell a product every minute, or even every second, then the data it collects (the personal information of its customers) arrives in high volume, at high velocity and in great variety. Such data is often termed Big Data.

Other examples are railway and airline booking systems, where tickets are booked online almost every second. As a result, these systems collect data digitally at a very fast pace and in enormous quantity; that is what makes 'Big Data' different from normal data. The major difference is usually summed up as the Five Vs: volume, velocity, variety, veracity and value.

Normal data is collected at a slow pace over a long period of time, and so is easy to manage in formats such as spreadsheets or MySQL databases. This is not usually the case with Big Data, which is often terabytes in size and therefore difficult to handle and process using traditional applications and tools.

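Hadoop is a widely used open-source framework for storing and processing Big Data across clusters of machines; it is not a traditional database management system. Vast amounts of raw data are stored in HDFS (the Hadoop Distributed File System), one of Hadoop's core components, while aggregated or summarised data is often exported to a relational database such as MySQL for analysis.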

But does your business really need Hadoop for data analytics?

Many businesses don't really 'need' Hadoop, unless they are actually dealing with Big Data. If your inflow of data is slow then a MySQL database can easily do the job. When you buy a web hosting package for your business website, whether shared hosting or a dedicated server, a MySQL database is usually included, and you can typically access it through phpMyAdmin in the hosting control panel.

It is possible to hire a programmer who works in PHP (a programming language) to develop scripts that store your data in a MySQL database and then perform data analytics on it, as per your company's requirements. Data analytics simply means examining the data, for example by filtering, sorting and aggregating it, to extract something useful for the business.

Suppose you are running an e-commerce website to sell your products online. Let us assume that you receive four to five orders every day, on average. Since you are receiving a very low number of daily orders, your inflow of data, in the form of customers' information, will also be at a slow pace, thus not requiring Hadoop; a simple MySQL database will do the job.

Now, if you have been collecting data at this pace for the past two years and you need to perform analytics on it, then it will still be data analytics and not Big Data analytics, as MySQL queries can still cope comfortably with such a small amount of data.
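To put a rough number on "such small data", here is a back-of-envelope sketch in Python. The order rate comes from the example above; the figure of about 1 KiB per stored order is purely an assumption for illustration, and real row sizes will vary.

```python
# Back-of-envelope estimate of how much order data the example shop accumulates.
ORDERS_PER_DAY = 5        # upper end of the "four to five orders" in the example
DAYS_PER_YEAR = 365
YEARS = 2
BYTES_PER_ORDER = 1024    # assumption: roughly 1 KiB per stored order row


def estimate_orders(orders_per_day: int, years: int) -> int:
    """Total number of orders collected over the given number of years."""
    return orders_per_day * DAYS_PER_YEAR * years


def estimate_size_mib(order_count: int, bytes_per_order: int = BYTES_PER_ORDER) -> float:
    """Approximate storage needed for the orders, in mebibytes."""
    return order_count * bytes_per_order / (1024 * 1024)


total_orders = estimate_orders(ORDERS_PER_DAY, YEARS)
total_mib = estimate_size_mib(total_orders)
print(f"{total_orders} orders, roughly {total_mib:.1f} MiB")
# prints: 3650 orders, roughly 3.6 MiB
```

Under these assumptions, two years of orders amount to a few thousand rows and a few megabytes, which is many orders of magnitude below the terabyte range where Big Data tools start to earn their keep.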

Hadoop is only required when MySQL queries can no longer cope with the tremendous amount of data collected over the years (say five to 10 years), or when the inflow of data is very fast. In that case, you need to switch to Big Data analytics.

Suppose the query is to find the product that receives the most orders from a particular city. If you need to run this query on terabytes of data, then a single MySQL server may not be able to complete it in a reasonable time; in that case, you will need the help of a system such as Hadoop, which is built for Big Data analytics.
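On a small dataset, that query is a single SQL statement. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a database server; the `orders` table, its columns and the sample rows are all hypothetical and only there for illustration.

```python
import sqlite3


def top_product_for_city(conn, city):
    """Return (product, order_count) for the most-ordered product in a city,
    or None if the city has no orders. Ties are broken alphabetically."""
    return conn.execute(
        """
        SELECT product, COUNT(*) AS order_count
        FROM orders
        WHERE city = ?
        GROUP BY product
        ORDER BY order_count DESC, product ASC
        LIMIT 1
        """,
        (city,),
    ).fetchone()


# In-memory database with a hypothetical orders table and made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, product TEXT NOT NULL, city TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO orders (product, city) VALUES (?, ?)",
    [
        ("kettle", "London"),
        ("toaster", "London"),
        ("kettle", "London"),
        ("toaster", "Leeds"),
        ("kettle", "Leeds"),
        ("toaster", "Leeds"),
    ],
)

print(top_product_for_city(conn, "London"))  # ('kettle', 2)
print(top_product_for_city(conn, "Leeds"))   # ('toaster', 2)
```

The SELECT statement itself uses only standard GROUP BY, ORDER BY and LIMIT clauses, so it should carry over to MySQL largely unchanged, although the parameter placeholder syntax depends on the client library you use.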

If your business is just starting out, you should first consider a MySQL database for your data analytics needs. As you grow, and if the time comes when MySQL can no longer handle your queries or your inflow of data, you can then decide to switch to a Big Data system.
