A Big Data solution aims to offer a
consistent approach to handling the constant growth and complexity of data.
To do so, the concept considers the 5 V's of Big Data: Volume, Velocity, Variety,
Veracity and Value.
Volume: The concept of volume in Big Data is best illustrated
by everyday facts: the daily volume of emails exchanged, banking transactions,
interactions on social networks, call records and data traffic on telephone
lines. All of these serve as starting points for understanding the volume of data
present in the world today.
It is estimated that the total volume of data
circulating on the Internet is currently 250 Exabytes per year. Every day 2.5 quintillion bytes of data are created, and 90%
of all the data in the world today was created in the last 2 years (IBM).
It is also important to understand that the concept of volume is relative to
time: what is large today may be negligible tomorrow. In the 1990s, a Terabyte was considered Big
Data. In 2015, the world will hold approximately 8 Zettabytes of digital
information, roughly 8 billion Terabytes.
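To make the scale of these figures concrete, here is a minimal sketch of the unit arithmetic. It assumes decimal (SI) units, where each step up is a factor of 1000. The variable names are illustrative only.

```python
# Decimal (SI) byte units, assumed here: each step up is a factor of 1000.
TERABYTE = 10**12
EXABYTE = 10**18
ZETTABYTE = 10**21

# How many 1990s-style "Big Data" terabytes fit in 8 zettabytes?
terabytes_in_8_zb = (8 * ZETTABYTE) // TERABYTE

# 2.5 quintillion bytes per day, expressed in exabytes.
daily_bytes = 25 * 10**17
daily_exabytes = daily_bytes / EXABYTE

print(f"8 ZB = {terabytes_in_8_zb:,} TB")
print(f"Daily creation = {daily_exabytes} EB")
```

In other words, 8 Zettabytes is 8 billion times the Terabyte that once counted as Big Data, and 2.5 quintillion bytes is 2.5 Exabytes.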
Velocity (Speed): Would you cross a street blindfolded if the
last information you had was a photograph of the traffic taken 5
minutes ago? Probably not, because a 5-minute-old photograph is irrelevant; you
need to know the current conditions to cross the street safely
(Forbes, 2012). The same logic applies to companies: they need current data on
their business, i.e. speed. According to Taurion (2014), speed is so important
that at some point there must be a tool capable of analyzing data
in real time. Currently, data are only analyzed after they are stored, but the
time taken for storage itself already disqualifies this type of analysis as
100% real-time analysis.
Information is power, so the speed at which a company obtains
this information is a competitive advantage. Speed can limit the
operation of many businesses. When we use a credit card, for example, if we do
not get a purchase approval within a few seconds we usually think of using another
payment method. The card operator then loses a business opportunity because of a
failure in the speed of transmitting and analyzing the buyer's data.
Variety: Volume is just the beginning of the challenges of
this new technology: if we have a huge amount of data, we also have a great
variety of it. Have you thought about the amount of information scattered across social
networks? Facebook, Twitter and others have a vast and distinct range of
information being published every second. We can observe the variety
of data in emails, social networks, photographs, audio, telephones and credit
cards. Whatever the topic, we can get countless views on it. Companies that
can capture this variety, whether of sources or of criteria, add value to the
business. Big Data classifies the variety of information in the following ways:
Structured data: stored in databases and organized in
tables;
Semi-structured data: follows heterogeneous patterns and is
harder to identify because records can differ in shape from one another;
Unstructured data: a mix of data from diverse sources such
as images, audio and online documents.
Of these 3 categories, it is estimated that up to 90% of
all data in the world is in the form of unstructured data.
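The three categories can be illustrated with a small sketch. It uses only the Python standard library, and all sample values are invented for illustration. A CSV table stands in for structured data, JSON records with differing fields for semi-structured data, and raw bytes for unstructured data.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same shape.
table_text = "id,amount\n1,10.50\n2,7.25\n"
rows = list(csv.DictReader(io.StringIO(table_text)))
total_amount = sum(float(row["amount"]) for row in rows)

# Semi-structured: records are self-describing, but fields vary per record.
raw_records = ['{"user": "ana", "likes": 3}', '{"user": "bob", "tags": ["x"]}']
records = [json.loads(text) for text in raw_records]
likes = [record.get("likes", 0) for record in records]  # a field may be missing

# Unstructured: opaque content with no schema (hypothetical image bytes).
blob = bytes([0x89, 0x50, 0x4E, 0x47])
blob_size = len(blob)  # without further processing, only size is known

print(total_amount, likes, blob_size)
```

The structured rows can be summed directly. The semi-structured records need a default for missing fields. The unstructured blob gives no information beyond its size until something interprets it.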
Veracity: One in three leaders does not trust the data
they receive. To reap good results from the Big Data process, it
is necessary to obtain data that are true and match reality. The concept of
velocity, already described, is closely aligned with the concept of veracity because of
the constant need for real-time analysis, that is, for data that are consistent
with the reality of the moment, since past data cannot be considered true
for the moment at which they are analyzed. The relevance of the data collected is
as important as the first concept. Verifying that the data collected are
adequate and relevant to the purpose of the analysis is key to obtaining
data that add value to the process.
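As a minimal sketch of such a verification step, the function below keeps only records that are complete and recent. The field names, the 5-minute freshness window and the helper `is_usable` are hypothetical choices for illustration, not part of any specific Big Data tool.

```python
from datetime import datetime, timedelta

def is_usable(record, now, max_age, required_fields):
    """Return True if the record is complete and recent enough to analyze."""
    # Adequacy: every field the analysis needs must be present.
    if any(record.get(field) is None for field in required_fields):
        return False
    # Timeliness: data older than max_age no longer reflects the present.
    timestamp = record.get("timestamp")
    if timestamp is None:
        return False
    return now - timestamp <= max_age

# Fixed reference time so the example is reproducible.
now = datetime(2015, 1, 1, 12, 0, 0)
records = [
    {"customer": "A", "amount": 120.0, "timestamp": now - timedelta(minutes=1)},
    {"customer": "B", "amount": None, "timestamp": now - timedelta(minutes=1)},
    {"customer": "C", "amount": 75.0, "timestamp": now - timedelta(hours=3)},
]
usable = [
    r for r in records
    if is_usable(r, now, timedelta(minutes=5), ["customer", "amount"])
]
usable_customers = [r["customer"] for r in usable]
print(usable_customers)
```

Only customer A passes: B is missing a required value, and C is too old to describe the present moment.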
Value: The greater the wealth of data, the more important
it is to ask the right questions at the beginning of the analysis process. It is necessary to stay focused on the direction of the
business and on the value that collecting and analyzing the data will bring to
it. It is not feasible to carry out the entire Big Data process without
questions that realistically help the business. Likewise, it
is important to be aware of the costs involved.