Different sources, different definitions
A few years ago, Dan Ariely said:
“Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it.”
I think Dan Ariely’s idea is spot-on. The term “Big Data” is still not clearly defined and means different things depending on the source:
- Wikipedia on Big Data: “(…) data sets are so large or complex that traditional data processing application software is inadequate to deal with them.” The German Wikipedia says “(…) the data sets are too large (…)”.
- It is also a common opinion that data generated by Facebook, Twitter, YouTube, Netflix, Amazon etc., including information about customer behavior (“customers have also viewed”), is Big Data.
- Data that companies collect from various sensors, e.g. on an airplane or a production line, is also called Big Data.
- And finally, finding correlations in data can also be called Big Data.
That is why I separate the terms clearly:
- Big Data Sets (related to the technology and managing the data)
- Social Big Data (human opinions and feelings; mostly unstructured data; has to be translated / transformed into clear information, e.g. good or bad, i.e. 0 or 1)
- Big Data Information (large amounts of mostly clear data, but useless in isolation)
- Big Data Science (digging and researching to find “the dots”)
There is nothing sexy about Big Data
When I first heard about Big Data, I expected something fancy too. But even though Big Data sounds fancy, it isn’t. It isn’t sexy, it isn’t fancy. It’s like “dashboarding”. When I ran BI projects earlier in my career, the business always wanted a dashboard with all KPIs available at a glance, ideally on mobile too. But actually achieving this was hard work and came with plenty of reality-check moments.
Big Data can give you insights you would never get any other way. But don’t expect Big Data to tell you the truth on its own, or to answer a question you don’t even know yourself and bring your business ahead like never before. That is not true either. You need smart guys with a statistical or mathematical background, like controllers: the people who find specifications and give advice to the business. Today they are called data scientists. It also won’t hurt if your data science team understands your business.
Exploring Big Data step by step
Big Data is also not slice and dice. There is usually no fancy user interface and, most of the time, no fancy user experience.
What does the process of exploring your (big) data look like?
- As soon as your relevant data is accessible (Big Data Sets), the data scientists have ideas or assumptions, or are simply looking for correlations. They write scripts and code directly, connect, combine or interpret the data in new ways, and extend it with algorithms. Not everything they find is a real insight. Only one in ten assumptions is true.
- That’s why these insights or assumptions have to be verified by the business.
- Correlation does not equal causality*. Correlation means a dependence or association between two events or incidents, whether or not they are causally related. It could also be just a random phenomenon. Causality, however, means that a change in one fact affects the other(s).
- If the business confirms the insights as real, the data scientists can continue searching, or the insights are integrated into the business.
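The correlation-versus-causality point can be made concrete with a small sketch. The numbers below are invented for illustration only (they are not real sales or bite counts), and the names `pearson`, `temperature`, `ice_cream_sales` and `mosquito_bites` are hypothetical. Both series are driven by temperature, so they correlate strongly with each other even though neither causes the other.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented monthly values, January to December (illustration only).
temperature = [1, 3, 7, 12, 17, 21, 24, 23, 18, 12, 6, 2]

# Small fixed offsets standing in for everything else that varies.
noise_a = [2, -1, 0, 3, -2, 1, 0, -3, 2, 1, -1, 0]
noise_b = [0, 1, -1, 0, 2, -2, 1, 0, -1, 1, 0, -1]

# Both series depend on temperature, not on each other.
ice_cream_sales = [10 * t + a for t, a in zip(temperature, noise_a)]
mosquito_bites = [2 * t + b for t, b in zip(temperature, noise_b)]

r = pearson(ice_cream_sales, mosquito_bites)
print(f"correlation ice cream vs. bites: {r:.2f}")
```

The script reports a strong correlation, yet the way the data was built shows that the shared driver is the season. This is the kind of finding the business has to verify before acting on it.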
In the end, Big Data is trial and error. Big Data means try fast, fail fast, redo fast. Therefore, you need
- smart guys,
- the right data and
- well-running systems.
For the last two things, Datavard is the right partner for you.
*Big Data fun facts
- “Magnum ice cream creates mosquito bites” (both mostly appear in summertime)
- The number of people who drowned by falling into a pool correlates with the number of films Nicolas Cage appeared in.
- Most accidents in secondary schools happen between midnight and 1 a.m. (When a school reports an accident, it mostly knows only the date, not the exact time. So the date is entered with the time 0:00.)
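The last fun fact is a data-quality artifact rather than a real pattern. Below is a minimal sketch of how one might guard against it before an hour-of-day analysis. It assumes that a time of exactly 00:00:00 marks an unknown time, which would need to be checked with whoever enters the data. The sample records and the function name `hour_histogram` are made up for illustration.

```python
from collections import Counter
from datetime import datetime, time

def hour_histogram(timestamps, drop_midnight_default=True):
    """Count events per hour of day.

    If drop_midnight_default is True, timestamps at exactly 00:00:00 are
    treated as "time unknown" (an assumption) and left out of the counts.
    """
    counts = Counter()
    for ts in timestamps:
        if drop_midnight_default and ts.time() == time(0, 0, 0):
            continue
        counts[ts.hour] += 1
    return counts

# Made-up accident reports: three have only a date (time defaulted to 0:00).
reports = [
    datetime(2017, 3, 1, 0, 0),
    datetime(2017, 3, 2, 0, 0),
    datetime(2017, 3, 2, 10, 15),
    datetime(2017, 3, 3, 0, 0),
    datetime(2017, 3, 3, 10, 40),
    datetime(2017, 3, 6, 13, 5),
]

naive = hour_histogram(reports, drop_midnight_default=False)
cleaned = hour_histogram(reports)
print("naive busiest hour:  ", naive.most_common(1)[0][0])
print("cleaned busiest hour:", cleaned.most_common(1)[0][0])
```

Dropping the defaulted records also throws away any accident that really happened at exactly midnight. Whether that trade-off is acceptable is a question for the business, not for the script.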