Sunday, April 20, 2014

Key aspects of a Big Data System

A big data solution must address the five Vs of big data:


  1. Velocity
  2. Variety
  3. Volume
  4. Veracity and
  5. Value

Velocity is the speed at which different types of data enter the enterprise and are then analyzed.

Variety addresses the range of data types involved, much of it unstructured or semi-structured in contrast to traditional structured data: weblogs, radio frequency ID (RFID) readings, meter data, stock-ticker data, tweets, images, and video files on the Internet.

Volume is the amount of data involved. For a data solution to be considered big data, the volume generally has to be at least in the range of 30–50 terabytes (TB). However, a smaller amount of data drawn from multiple sources of different types, both structured and unstructured, can also be classified as a big data problem.
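The volume-or-variety rule above can be sketched as a small check. This is only an illustration: the DataSource type, the looks_like_big_data function, and the choice of 30 TB (the lower end of the range quoted above) as the cutoff are assumptions made for the example, not a standard definition.

```python
from dataclasses import dataclass

TB = 10**12  # bytes in one terabyte (decimal convention)
VOLUME_THRESHOLD_BYTES = 30 * TB  # assumed cutoff: lower end of the 30-50 TB range


@dataclass
class DataSource:
    """Hypothetical description of one data source feeding the system."""
    name: str
    size_bytes: int
    structured: bool


def looks_like_big_data(sources):
    """Return True if the sources qualify by volume or by variety."""
    total = sum(s.size_bytes for s in sources)
    has_structured = any(s.structured for s in sources)
    has_unstructured = any(not s.structured for s in sources)
    # Variety: more than one source, mixing structured and unstructured data.
    mixed = len(sources) > 1 and has_structured and has_unstructured
    return total >= VOLUME_THRESHOLD_BYTES or mixed


if __name__ == "__main__":
    sources = [
        DataSource("orders_db", 2 * 10**9, structured=True),
        DataSource("tweets", 5 * 10**9, structured=False),
    ]
    print(looks_like_big_data(sources))
```

Here a few gigabytes still qualify because the sources mix structured and unstructured data, while a single small structured source would not.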

Veracity refers to the trustworthiness of data. With many sources of data in structured and unstructured form, the quality and accuracy of the data cannot be fully controlled. There can be various errors and complexities such as typos, abbreviations newly coined by digital natives, colloquial speech, etc. Big data systems should allow us to work with this type of data.
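As a rough illustration of working with such noisy text, the sketch below lowercases input, strips punctuation, and expands a few informal abbreviations. The ABBREVIATIONS table and the normalize function are hypothetical and exist only for this example; a real system would need a much larger, domain-specific dictionary and would likely handle typos as well.

```python
import re

# Hypothetical lookup table of informal abbreviations and their expansions.
ABBREVIATIONS = {
    "u": "you",
    "gr8": "great",
    "thx": "thanks",
}


def normalize(text):
    """Lowercase the text, drop punctuation, and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


if __name__ == "__main__":
    print(normalize("Thx, u r GR8!"))
```

Tokens that are not in the table (such as "r" here) pass through unchanged, which is one reason simple dictionary lookups only go so far.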

Value is the final V, and the one we need to consider the most. Unless we are able to create value out of big data, it is of no use. Accurate analysis and algorithms are needed, tailored to the specific needs of the business. The same volume of big data can provide different value to different organizations, depending on the analytics applied to it.
