What is Big Data?


The advent of digital technology and its adoption in every sphere of life has resulted in the generation of a tremendous amount of digital data. This data may be generated as part of a system's business requirements and/or as a system by-product such as logs. In either case, the captured data is deemed to have significance in one business use case or another. At the same time, the falling cost of computing devices such as PCs and mobile phones, together with the reach of the internet to every nook and corner of the world, has allowed mass utilization of these systems. This high rate of adoption and mass usage produces a huge amount of data, which can be measured in terabytes (TB), petabytes (PB) and even exabytes (EB) for each of these systems.
The figure below shows how data storage has grown significantly, shifting markedly from analog to digital after 2000 [López11]. These numbers are from 2007, and the analog share has surely shrunk further since then.

Consider social applications such as Facebook and Twitter. These applications are used by millions of users to engage with each other, either socially or professionally.
For instance, according to Wikipedia [WikiFB14], Facebook has almost one and a half billion users, who share around 5 billion content items per day, including status updates, wallpapers, comments, photos and videos. This leads to more than 500 TB of data per day for Facebook to handle. Twitter, on the other hand, has more than 250M users who share 400M tweets per day [WikiTW13]. Handling real-time tweets and mining them for trending topics (hashtags) at such an exceptional rate requires innovation in technology.
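To get a feel for these rates, the daily figures quoted above can be converted into per-second averages. This is a rough back-of-the-envelope sketch, not a measurement: it assumes traffic is spread evenly over the day and that 1 TB means 10^12 bytes, and the helper name `per_second` is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(per_day):
    """Convert a per-day total into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


# Figures quoted in the text above.
tweets_per_day = 400e6                      # 400M tweets per day
facebook_bytes_per_day = 500 * 10**12       # 500 TB per day (assuming 1 TB = 10^12 bytes)

tweets_per_second = per_second(tweets_per_day)
facebook_bytes_per_second = per_second(facebook_bytes_per_day)

print(f"{tweets_per_second:,.0f} tweets/s on average")
print(f"{facebook_bytes_per_second / 10**9:.1f} GB/s on average")
```

Under these assumptions the averages come out at roughly 4,600 tweets and close to 6 GB of new data every second, and real traffic is burstier than an even average suggests.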

A tremendous amount of data is also being generated by sensors attached to almost every measurable thing in the world, be it temperature, wind speed, ocean movements, heartbeats or variations in your body mass index. The penetration of smartphones into our daily lives has made it possible to record everything recordable, such as videos, daily exercise data, location logs, reading habits and buying trends. The Internet of Things (IoT), using M2M technology, is adding to this already crowded space of data producers.
Traditional businesses have also adopted technology to increase their productivity. These companies churn out a burgeoning volume of transactional data, capturing trillions of bytes of information about their customers, suppliers, and operations.
Last but not least, the continuous digitization of information that was so far available only as physical documents or books, along with e-governance initiatives by various countries, is also building databases that are petabytes in size. The US Library of Congress had collected 235 terabytes of data by 2011, and 15 out of 17 sectors in the United States have more data stored per company than the US Library of Congress [WikibonBDS12].
It has been found that 90% of all the data ever created was created in the last two years. Estimates show that the world's data is going to double every two years [EMC11].
So how are technologies going to handle this tremendous explosion of data over the next few years? The traditional technologies used to manage data were never designed to manage and process such enormous volumes. This has resulted in a paradigm shift in data management and processing systems. Relational data models have given way to flat data models. Vertical scaling of infrastructure has fallen short of the capacity needed to process this volume of data, so horizontal scaling has become the new architectural concern. In software design, data parallelism is being preferred over task parallelism. All this has led to a new paradigm, the Big Data paradigm: the efficient processing of extensive datasets by distributing data systems horizontally.
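The idea of data parallelism over horizontally partitioned data can be illustrated with a small sketch. It is only a toy, under stated assumptions: the function names (`partition`, `count_hashtags`, `trending`) and the sample tweets are invented for illustration, and threads on a single machine stand in for what a real Big Data system would do across many machines. The point is the shape of the computation: split the data, run the same operation on every partition, then merge the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, num_partitions):
    """Split records into num_partitions slices (round-robin assignment)."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def count_hashtags(tweets):
    """The single operation applied to every partition: count hashtags."""
    counts = Counter()
    for tweet in tweets:
        for word in tweet.split():
            if word.startswith("#"):
                counts[word.lower()] += 1
    return counts


def trending(tweets, num_partitions=4, top=3):
    """Count hashtags per partition in parallel, then merge the partial counts."""
    parts = partition(tweets, num_partitions)
    # Threads stand in for separate worker nodes in this sketch.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(count_hashtags, parts))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total.most_common(top)


# Hypothetical sample data for illustration only.
sample_tweets = [
    "Learning about #BigData today",
    "#bigdata needs horizontal scaling",
    "Sensors everywhere #IoT",
    "#IoT and #BigData go together",
    "Nothing tagged here",
]

print(trending(sample_tweets, num_partitions=2, top=2))
```

Because each partition is processed independently, adding more partitions and more workers is the horizontal-scaling move: capacity grows by adding machines rather than by buying a bigger one.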
