Wednesday, May 10, 2017

The 5 V's in Big Data

A Big Data solution aims to offer a consistent approach to the constant growth and complexity of data. To do so, the concept considers the 5 V's of Big Data: Volume, Velocity, Variety, Veracity and Value.

Volume: The concept of volume in Big Data is best illustrated by everyday facts: the daily volume of emails exchanged, banking transactions, interactions on social networks, call records and data traffic on telephone lines. All of these serve as starting points for understanding the volume of data present in the world today.

It is estimated that the total volume of data circulating on the Internet is currently 250 Exabytes per year. Every day 2.5 quintillion bytes of data are created, and 90% of all the data in the world today was created in the last 2 years (IBM). It is also important to understand that the concept of volume is relative to time: what is big today may be nothing tomorrow. In the 1990s, a Terabyte was considered Big Data. For 2015, the estimated volume of digital information around the world was approximately 8 Zettabytes, a vastly greater value.
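To put these magnitudes side by side, the following is a small sketch of the unit arithmetic. It assumes decimal (SI) units, where each step up is a factor of 1,000; the names `UNITS` and `to_bytes` are illustrative only.

```python
# Decimal (SI) byte units: each step up is a factor of 1,000 (an assumption;
# binary units such as TiB would give slightly different figures).
UNITS = {"TB": 10**12, "PB": 10**15, "EB": 10**18, "ZB": 10**21}


def to_bytes(amount, unit):
    """Convert an amount expressed in the given unit to bytes."""
    return amount * UNITS[unit]


# 2.5 quintillion bytes per day, written as an integer to avoid rounding.
daily_bytes = 25 * 10**17
daily_in_exabytes = daily_bytes / UNITS["EB"]

# How many times larger is 8 Zettabytes than the 1 Terabyte of the 1990s?
ratio = to_bytes(8, "ZB") // to_bytes(1, "TB")

print(daily_in_exabytes)  # 2.5
print(ratio)              # 8000000000
```

In other words, 2.5 quintillion bytes is 2.5 Exabytes, and 8 Zettabytes is eight billion times the Terabyte that once counted as Big Data.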

Velocity (Speed): Would you cross a street blindfolded if the last information you had was a photograph of the traffic taken 5 minutes ago? Probably not, because a 5-minute-old photo is irrelevant: you need to know the current conditions to cross the street safely (Forbes, 2012). The same logic applies to companies, which need current data about their business, that is, speed. According to Taurion (2014), speed is so important that at some point there must be a tool capable of analyzing data in real time. Currently, data is only analyzed after it is stored, and the time taken for storage itself already disqualifies this type of analysis as 100% real-time.

Information is power, so the speed with which a company gets this information is a competitive advantage. Speed can limit the operation of many businesses. When we use a credit card, for example, if the purchase is not approved within a few seconds we usually think of using another payment method. The operator then loses a business opportunity because of a failure in the speed of transmission and analysis of the buyer's data.

Variety: Volume is just the beginning of the challenges of this new technology: with a huge amount of data also comes a great variety of it. Have you thought about the amount of information scattered across social networks? Facebook, Twitter and others offer a vast and distinct field of information in public every second. We can observe the variety of data in emails, social networks, photographs, audio, telephones and credit cards. Whatever the topic, we can find countless views on it. Companies that can capture this variety, whether of sources or of criteria, add value to the business. Big Data classifies the variety of information as follows:

Structured data: stored in databases and organized in tables;
Semi-structured data: follows heterogeneous patterns, which makes it more difficult to identify;
Unstructured data: a mix of data from diverse sources such as images, audio and online documents.
Of these 3 categories, it is estimated that unstructured data accounts for up to 90% of all data in the world.
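As a rough illustration of the three categories, the sketch below handles one tiny sample of each using only the Python standard library. The sample data and field names are invented for the example.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same shape (invented sample).
structured = "id,amount\n1,10.5\n2,7.25\n"
rows = list(csv.DictReader(io.StringIO(structured)))
total = sum(float(row["amount"]) for row in rows)

# Semi-structured: records share a format (JSON) but not the same fields.
semi_structured = (
    '[{"user": "ana", "tags": ["card"]},'
    ' {"user": "bob", "location": "Lisbon"}]'
)
records = json.loads(semi_structured)
fields = sorted({key for record in records for key in record})

# Unstructured: free text with no schema; only generic operations apply
# until some structure is extracted from it.
unstructured = "Loved the new card, approval took only seconds!"
word_count = len(unstructured.split())

print(total)       # 17.75
print(fields)      # ['location', 'tags', 'user']
print(word_count)  # 8
```

The structured sample can be summed directly by column, the semi-structured one first requires discovering which fields exist, and the unstructured one offers nothing to query until it is processed further.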

Veracity: One in three leaders does not trust the data they receive. To get good results from the Big Data process, it is necessary to obtain data that is true, that is, consistent with reality. The concept of velocity, already described, is closely aligned with the concept of veracity because of the constant need for real-time analysis: the data must reflect the reality of the moment, since past data cannot be considered true for the moment being analyzed. The relevance of the data collected is as important as the first concept. Verifying that the collected data is adequate and relevant to the purpose of the analysis is a key step in obtaining data that adds value to the process.
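The two checks described above, freshness and adequacy for the purpose of the analysis, could be sketched as a simple filter. The 5-minute threshold, the required field names and the function `is_usable` are all assumptions made for illustration, not part of any particular tool.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(minutes=5)               # assumed freshness threshold
REQUIRED_FIELDS = {"customer_id", "amount"}  # hypothetical fields the analysis needs


def is_usable(record, now):
    """Return True if the record is recent enough and has the needed fields."""
    timestamp = record.get("timestamp")
    if timestamp is None:
        return False
    fresh = now - timestamp <= MAX_AGE
    adequate = REQUIRED_FIELDS.issubset(record)
    return fresh and adequate


now = datetime(2017, 5, 10, 12, 0, 0)
records = [
    {"customer_id": 1, "amount": 30.0, "timestamp": datetime(2017, 5, 10, 11, 58)},
    {"customer_id": 2, "amount": 12.0, "timestamp": datetime(2017, 5, 10, 11, 40)},
    {"customer_id": 3, "timestamp": datetime(2017, 5, 10, 11, 59)},
]
usable = [r["customer_id"] for r in records if is_usable(r, now)]
print(usable)  # [1]
```

Only the first record passes: the second is 20 minutes old, and the third lacks the amount the analysis depends on.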


Value: The greater the wealth of data, the more important it is to ask the right questions at the beginning of the analysis process. It is necessary to stay focused on the direction of the business and on the value that collecting and analyzing the data will bring to it. It is not feasible to complete the entire Big Data process without questions that realistically help the business. It is equally important to be aware of the costs involved.
