Big Data could be the driver of the 21st century, but…

In his recent blog post, Tim Oehme mentioned various definitions of “Big Data,” depending on your business perspective. I’d like to focus on the Big Data term itself, as I believe it is quite misleading. When people hear Big Data, they focus on the word “Big” when they should actually turn their attention to the word “Data.”

Volume: when size doesn’t matter

Volume is the first thing that springs to mind when you hear Big Data. And it used to be quite scary, with predictions, such as IDC’s, that an enterprise’s amount of data would double every 18 months, eventually collapsing its IT infrastructure. But don’t be fooled by vanity metrics. While I don’t want to downplay this point, our experience shows that the sheer volume of data can be handled relatively easily. Advances in data storage and processing technologies such as Hadoop and NoSQL databases, falling hardware prices and smart data management all help with this process.
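To get a feel for what a doubling every 18 months would mean, here is a minimal Python sketch that projects a data volume forward. The 100 TB starting volume and the function name are assumptions chosen purely for illustration; only the 18-month doubling period comes from the prediction mentioned above.

```python
def projected_volume_tb(start_tb: float, months: float,
                        doubling_months: float = 18.0) -> float:
    """Return the data volume after `months`, assuming steady exponential growth."""
    return start_tb * 2 ** (months / doubling_months)


if __name__ == "__main__":
    start_tb = 100.0  # hypothetical starting volume, not a figure from the post
    for years in (0, 3, 6, 9):
        volume = projected_volume_tb(start_tb, years * 12)
        print(f"after {years} years: {volume:,.0f} TB")
```

Under that assumption, the volume grows 64-fold in nine years, which explains why the prediction sounded so alarming, even though storage itself has turned out to be the easier part of the problem.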

Velocity: it’s loading, please wait…

Velocity is the speed at which data is created, processed and consumed. And here it gets tricky, because most data is generated by machines or sensors at unbelievable speed. As an example, a Boeing 737 generates 20 TB of data per hour of flight. How on earth do you physically get this data into the data center for processing? It might be that 99 percent of the data is rubbish, but you don’t want to miss that one glitch that foreshadows future trouble.
Interestingly enough, when we talk with our customers, most of them experience performance issues during data ingestion, not in reporting. This is because ETL (extract, transform, load) tools, which cleanse, format and transform the data, were conceived at a time when reporting was expected to run in batches rather than in “real time.” But with the current adoption of in-memory computing platforms like SAP HANA and new use cases, IT teams now face a new challenge: data streaming.
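To put the 20 TB per flight hour figure into perspective, the following Python sketch converts it into a sustained network rate and into an upload time over a fixed link. It assumes decimal terabytes (10^12 bytes) and a hypothetical 1 Gbit/s uplink; neither detail comes from the figure itself, and the function names are made up for this example.

```python
TB = 10**12  # bytes per terabyte; assumes decimal TB, the post does not specify


def sustained_gbit_per_s(terabytes_per_hour: float) -> float:
    """Rate needed to move the data as fast as it is generated."""
    bytes_per_second = terabytes_per_hour * TB / 3600
    return bytes_per_second * 8 / 1e9


def upload_hours(terabytes: float, link_gbit_per_s: float) -> float:
    """Time to move a fixed amount of data over a link of the given speed."""
    seconds = terabytes * TB * 8 / (link_gbit_per_s * 1e9)
    return seconds / 3600


if __name__ == "__main__":
    print(f"streaming rate: {sustained_gbit_per_s(20):.1f} Gbit/s")
    # 1 Gbit/s is a hypothetical uplink speed used only for illustration
    print(f"upload over 1 Gbit/s: {upload_hours(20, 1.0):.1f} hours")
```

Under these assumptions, one flight hour corresponds to a sustained rate of roughly 44 Gbit/s, and uploading that single hour of data afterwards over a 1 Gbit/s link would take roughly 44 hours. Either way, ingestion rather than storage is the bottleneck.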

Variety: blame video bloggers

Variety refers to the different types of data you want to process. Unfortunately, it’s no longer just nicely structured data like invoices, purchase orders or time recordings. Rich media like speech or video (think of surveillance cameras or all of the YouTube data out there) not only account for the lion’s share of the overall volume of stored data, but also pose significant challenges for processing.
You can be sure that at this moment, somewhere in the world, someone is racking their brains about how to capture and analyze data generated by a video blogger who’s reviewing their company’s product. Or someone may want machines to recognize road signs for driverless cars or read product logos. As an example of the latter, SAP Chairman Hasso Plattner showed a real-time analysis of advertising banners at a hockey game, based on SAP Leonardo, during his presentation at SAPPHIRE NOW. An algorithm scanned the video footage in real time and calculated how long and how prominently each brand was displayed during the game. This solution could revolutionize the marketing industry and introduce new pricing models. Because this tool allows companies to ascertain to the second how often and how long their logos appear during a game, they would know precisely how much to pay for advertising. Most importantly, it would help them calculate their advertising ROI more precisely.
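The aggregation step behind such an analysis can be sketched in a few lines of Python. This is not how SAP Leonardo works internally, which the demo did not reveal; it simply assumes that some logo-detection model has already produced one record per frame and brand, in a hypothetical (frame number, brand, share of the picture) format, with at most one record per brand per frame. The brand names and the frame rate are made up.

```python
from collections import defaultdict


def exposure_by_brand(detections, fps):
    """Summarize how long and how prominently each brand was on screen.

    detections: iterable of (frame_number, brand, area_fraction) tuples,
                at most one tuple per brand per frame (assumed format).
    fps:        frames per second of the video.
    """
    frame_count = defaultdict(int)
    area_sum = defaultdict(float)
    for _frame, brand, area_fraction in detections:
        frame_count[brand] += 1
        area_sum[brand] += area_fraction
    return {
        brand: {
            "seconds": frame_count[brand] / fps,
            "avg_area": area_sum[brand] / frame_count[brand],
        }
        for brand in frame_count
    }


if __name__ == "__main__":
    fps = 25  # hypothetical frame rate
    # Hypothetical detections: BrandA fills 10% of the picture for 50 frames,
    # BrandB fills 2% of the picture for 25 frames.
    detections = [(f, "BrandA", 0.10) for f in range(50)]
    detections += [(f, "BrandB", 0.02) for f in range(25)]
    for brand, stats in exposure_by_brand(detections, fps).items():
        print(brand, stats)
```

Multiplying the seconds on screen by an agreed rate, possibly weighted by the average share of the picture, is one conceivable basis for the kind of pricing model described above.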

Viability: where the 4 Vs meet the 4 Ms

To me, the biggest challenge with Big Data is far outside of the technical arena. According to Gartner, 90 percent of data lakes will become useless because of an unclear use case. Even if an enterprise has all the data it needs and can process it at the right speed, it doesn’t mean that the processed data will generate the expected business outcome. This is the moment of truth: when the 4 Vs (volume, variety, velocity and viability) meet the 4 Ms (make me more money). Ultimately, digital transformation is more about business transformation than anything digital. Acceptance, skills, change readiness, resources and organizational readiness are by far the most critical factors for identifying, assessing and monetizing a use case.

Here are seven success factors from recent and not-so-recent experiences with Datavard customers:

1. Get business and IT into the same boat and build a common base of knowledge.

The better you are at educating the entire organization on current technological innovations, the faster you will become educated on business priorities and, hence, create the right business outcomes. This also lays the foundation for acceptance (see No. 6) as well as cross-departmental and interdisciplinary teamwork.

2. Best of breed is back and extends beyond commercial components.

The current radical change in your IT stack is unprecedented and forces you to make smart architectural decisions. Established technologies and applications are being challenged both holistically (e.g. by Hadoop or Kafka) and for single use cases (niche offerings from the cloud such as SAP’s Hybris, Concur or Ariba). It’s your job, as the IT leader, to evaluate each part of your architecture in terms of its business impact, future-readiness, cost, skill requirements and make-or-buy decision. We have dedicated a part of our business development team to keeping a close eye on the open source community. Many de facto industry standards such as Kafka, Hadoop and Spark offer a significant value proposition and are backed by an impressive developer community.

3. Prioritize data from an ever-growing array of technological innovations.

That array includes virtual assistants, natural language processing and virtual/augmented reality, as well as data systems and infrastructure layers with increased virtualization. It poses a massive challenge for IT leaders, who have to evaluate these innovations and correctly separate “tech gadgets” from enterprise-ready technologies with the power to create or support business outcomes.

4. Establish a smart overall objective.

This is very eloquently outlined by Andreas Geiger of Hella in his recent blog post on digital transformation. I also want to add that IT leaders are still willingly accepting a significant amount of pressure, burden and responsibility without any realistic chance of success. This is all rooted in planning for unrealistic business outcomes (see No. 1), budgets (see No. 5) and resources. You also need to put in place a system that continuously provides insights into real-life adoption of the use case and the business outcome, and that smartly manages deviations.

5. Be clear that the nature of the game is your organization’s ability to experiment.

Listen, learn and revise iteratively rather than predicting outcomes beforehand. The plain truth, hard as it is to accept, is that:
a. You can’t predict the outcome of analyzing massive data sets without actually doing it, and
b. The waterfall model is useless in highly uncertain environments with rapidly changing expectations and circumstances.
I can highly recommend Eric Ries’ The Lean Startup, which is one of the most inspiring books that I’ve read on this topic.

6. Value past Change Management experience and approaches.

Very often I observe people trying to solve new problems with new tools. This is not necessarily the best way. The barriers in all the digital initiatives that I’ve been involved in were the same ones that I experienced in the 1990s, when business process reengineering during an SAP R/3 implementation was the BIG thing. One of the best, timeless Change Management principles is that Success = Quality times Acceptance: an excellent solution that nobody accepts still fails. This is not new. The Change Management tools and methods to overcome fears and concerns, bridge skill gaps and reach a state of organizational readiness are in place and well known.

7. Make or buy.

The higher the degree of innovation and competitive differentiation you are looking for, the more you depend on your team’s ability to build software IP based on existing open-source or commercial platforms such as SAP, Google, Microsoft or WSO2.

To sum up, Big Data could be the big driver of the 21st century, but… it’s not there yet.
