BIG DATA IS THE FUTURE
- Oct 11, 2018
In this age of information technology, nearly every sphere of our lives is being recorded, and massive quantities of data are generated as a result. According to a report by IBM, more than 2 quintillion bytes of data are generated every single day, and 90% of the world's data was generated in the last two years (Morton, et al., 2014). Facebook alone stores 300 petabytes of data, with 200 million photo uploads and more than 4 billion likes (Traverso, 2013). Twitter handles more than 5,000 tweets every second (ᴉɟɟɐɹ, 2013).

This kind of data arrives too fast and is too varied in form for traditional databases, which are designed to store structured data. In addition, current databases enforce the ACID properties (atomicity, consistency, isolation and durability) on transactions, which can be limiting given how much online activity such as shopping has grown. This limitation calls for a new set of technologies, ideas and mindset that can store different forms of data and process and analyse them at the pace at which current computing trends are evolving (Madden, 2013). This new set of technologies has come about through the Big Data movement.
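To make the ACID guarantee concrete, here is a minimal sketch of atomicity using Python's built-in sqlite3 module. The accounts table, the names and the transfer helper are all hypothetical, invented only for illustration. The point is that either both updates of a transfer are applied or neither is.

```python
import sqlite3

def make_demo_db():
    """Build a tiny in-memory database (hypothetical schema, for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit
            # above is rolled back together with it.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A transfer larger than the source balance fails as a whole and leaves both accounts unchanged. That safety is what ACID buys, and coordinating it on every write is part of why it can become a constraint at very high volumes.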
What is Big Data?
The origin of Big Data cannot be traced to a particular person or organisation, so no single precise definition exists, and every vendor gives its own (Giles, et al., 2013). Big Data may be defined as data that cannot be processed using traditional methodologies or traditional databases. The term “Big Data” can be confusing. As the name implies, the data is massive in size; however, the velocity and variety of the data are equally important in defining Big Data (Jorgensen, et al., 2014).
Evolution of Big Data
The rapid growth of data is a very recent phenomenon. Until two decades ago, less than 25% of data was in digital form, leaving more than 75% stored on paper and other analogue media such as film. There has been an enormous shift since then: today, less than 3% of the data on the planet is in non-digital form (Giles, et al., 2013).
This huge amount of data needed to be stored somewhere. This gave rise to data warehousing, in which data is taken mainly from traditional databases and stored in the warehouse in a clean, refined form. This data is then analysed with the help of data mining (Morton, et al., 2014). Of late, however, organisations have realised that a tremendous amount of data in the form of videos, audio, chats, customer support queries and more is being ignored. This data offers a chance to explore human behaviour, society, public interests and public affairs from a perspective that was never before possible in human history (Oracle, 2014).
Data warehouses cannot store such a diversity of data, because their main source of data is relational databases, which are not meant to hold unstructured data. Querying such data with existing mining methods is also unrealistic. This led to the evolution of a new set of technologies known as Big Data (Madden, 2013).
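The mismatch between a fixed relational schema and mixed-shape data can be sketched in a few lines of Python. The sample records, the orders table and the helper names below are hypothetical and invented for illustration. This is not how a real warehouse or a Big Data store is implemented. It only contrasts a loader that accepts one record shape with one that keeps every record as a JSON document.

```python
import json
import sqlite3

# Hypothetical sample records of mixed shape, invented for illustration.
RECORDS = [
    {"type": "order", "customer": "c1", "total": 19.99},
    {"type": "chat", "customer": "c2", "messages": ["hi", "where is my parcel?"]},
    {"type": "video", "customer": "c1", "duration_s": 42, "tags": ["unboxing"]},
]

def load_fixed_schema(records):
    """Insert only the records that fit a fixed (customer, total) schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT NOT NULL, total REAL NOT NULL)")
    rejected = []
    for rec in records:
        if set(rec) - {"type"} == {"customer", "total"}:
            conn.execute(
                "INSERT INTO orders VALUES (?, ?)", (rec["customer"], rec["total"])
            )
        else:
            rejected.append(rec["type"])  # no column exists for these fields
    stored = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    return stored, rejected

def load_schemaless(records):
    """Keep every record as a JSON document, whatever its fields."""
    return [json.dumps(rec) for rec in records]

def customers_seen(documents):
    """Query the documents by a field that all of them happen to share."""
    return sorted({json.loads(doc)["customer"] for doc in documents})
```

With these sample records, the fixed-schema loader stores one row and turns away the chat and video records, while the schemaless loader keeps all three and can still be queried by customer.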
Characteristics of Big Data
Volume:
Volume is the most obvious characteristic of Big Data. Data storage requirements are expanding at an ever-increasing rate. More than 800 exabytes of data were stored worldwide a decade ago, and this number is expected to soar to 35 zettabytes by the year 2020 (Eaton, et al., 2012).
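To put the two figures on the same scale, the short sketch below converts both to bytes using decimal (SI) units and computes the growth factor. The helper names are hypothetical. The text does not give the exact number of years between the two figures, so the annual growth helper takes that span as a parameter instead of assuming one.

```python
# Decimal (SI) byte units.
EXABYTE = 10**18
ZETTABYTE = 10**21

def growth_factor(start_bytes, end_bytes):
    """How many times larger the end volume is than the start volume."""
    return end_bytes / start_bytes

def annual_growth_rate(start_bytes, end_bytes, years):
    """Compound annual growth rate implied by growing from start to end."""
    return growth_factor(start_bytes, end_bytes) ** (1 / years) - 1

start = 800 * EXABYTE   # "more than 800 exabytes", from the text
end = 35 * ZETTABYTE    # projected figure for 2020, from the text

print(f"growth factor: {growth_factor(start, end):.2f}x")  # growth factor: 43.75x
```

The projected volume is therefore a little under 44 times the earlier figure.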
Variety:
It can be argued that the massive size of Big Data is due to this characteristic. Variety means dealing with data that goes well beyond what traditional databases store. This can include likes and posts on Facebook, videos uploaded or watched on YouTube, images uploaded to Flickr, Google searches, biometric records, audio, and any other form of data that does not fall under structured data (Giles, et al., 2013).
An example illustrates this. On 4 October 2012, the first presidential debate between Barack Obama and Mitt Romney generated more than 10 million tweets in two hours. The tweets peaked during the discussion of health care and insurance (Wu, et al., 2014).
Velocity:
On August 3, 2013, users watching an event in Japan drove Twitter to a peak of 143,199 tweets per second. Users did not experience a glitch even for a second, even though the load was about 25 times higher than normal (ᴉɟɟɐɹ, 2013). This kind of velocity is beyond traditional databases, which must first import the data before queries can be applied to it. For streaming data, however, technologies like Hadoop can be the answer.
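The difference between importing first and processing data as it arrives can be sketched in plain Python. The function below is hypothetical. It is not how Twitter or Hadoop measures load. It simply finds the busiest one-second bucket in a single pass over a stream of timestamps, assumed to arrive in non-decreasing order, without ever holding the whole stream in memory.

```python
def peak_rate(timestamps):
    """Return (second, count) for the busiest one-second bucket.

    `timestamps` is any iterable of numeric timestamps in seconds,
    assumed to arrive in non-decreasing order. Only the current bucket
    and the best bucket so far are kept in memory.
    """
    peak_second, peak_count = None, 0
    current_second, current_count = None, 0
    for ts in timestamps:
        second = int(ts)
        if second != current_second:
            # A new one-second bucket starts; reset the running count.
            current_second, current_count = second, 0
        current_count += 1
        if current_count > peak_count:
            peak_second, peak_count = current_second, current_count
    return peak_second, peak_count
```

Because the function accepts any iterable, it works just as well on a generator that yields events as they happen as on a stored list, which is the property that matters for high-velocity data.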
Is Big Data the End of Traditional Databases?
For now, that does not seem to be the case. Data in a warehouse goes through various refinement procedures before it is stored, so it is of very high quality. Where accuracy is essential, as in reports and publications, Big Data may not be ideal, because its data undergoes very little refinement. Big Data can instead complement existing technologies, so that the two together form an effective data management system for many years to come.
Future Work
The future certainly lies with Big Data, given existing trends in data generation and human interaction with data. Big Data technologies are expected to expand into cloud computing, and Big Data can also find its way into predictive analysis. The merger of SQL and Hadoop can also let each technology complement the other to provide a much better data management system (Morton, et al., 2014).
References:
Adolph, M., 2013. Big Data: Big today, normal tomorrow, s.l.: ITU-T Technology Watch.
Wu, X., Zhu, X. & Ding, W., 2014. Data Mining with Big Data. IEEE Transactions on Knowledge and Data Engineering, 26(1), pp. 97-105.
Eaton, C., Deroos, D., Deutsch, T. & Zikopoulos, P., 2012. Understanding Big Data. New York: McGraw-Hill.
Falsafi, B. & Grot, B., 2014. [Online] Available at: http://ieeexplore.ieee.org.ezproxy.staffs.ac.uk/stamp/stamp.jsp?tp=&arnumber=6871717 [Accessed 25 April 2015].
Giles, J., Corrigan, D., Parasuraman, K. & Deroos, D., 2013. Harness the Power of Big Data. New York: McGraw-Hill.
ᴉɟɟɐɹ, u., 2013. New Tweets per second record, and how!. [Online] Available at: https://blog.twitter.com/2013/new-tweets-per-second-record-and-how [Accessed 02 May 2015].
Jorgensen, A. et al., 2014. Microsoft Big Data Solutions. s.l.: John Wiley & Sons, Incorporated.
Madden, S., 2013. From Databases to Big Data. [Online] Available at: http://cloudcomputing.ieee.org/csdl/mags/ic/2012/03/mic2012030004.pdf [Accessed 5 May 2015].
Morton, J., Runciman, B. & Gordon, K., 2014. Big Data: Opportunities and challenges. s.l.: BCS Learning & Development Limited.
Oracle, 2014. Oracle Database 12c for Data Warehousing and Big Data, s.l.: Oracle.
Traverso, M., 2013. Presto: Interacting with petabytes of data at Facebook. [Online] Available at: https://www.facebook.com/notes/facebook-engineering/presto-interacting-with-petabytes-of-data-at-facebook/10151786197628920 [Accessed 02 May 2015].












