Frank, you are a BI Expert responsible for Business Development at Datavard. This week you joined Cloudera Sessions in London, focused on Big Data use cases. What are your key takeaways?
Big Data is coming down to earth. We saw a lot of really awesome use cases, like cancer recognition, automated driving, real-time process optimization in manufacturing, customer 360° analysis or fraud protection. My biggest takeaway? That it doesn’t make sense to separate storage from computation. With Cloudera Hadoop, we are able to have all the data in one place, but at the same time we can analyze, predict and simulate data scenarios on the same platform. More and more customers build a kind of data lake where they store all the data, even SAP transactional data, and combine it with unstructured IoT data, social media or user behavior data.
You’ve attended some keynote speeches – what Big Data use cases did you find the most interesting?
From my point of view, there were several great use cases. Cloudera as a platform is increasingly being adopted by healthcare and insurance companies. In healthcare, this enables people to work on new treatments and new methods for analyzing problems. It will allow scientists to get more precise data on healing processes and drug side effects. For me, this is something we all benefit from as humans. But we also see insurance and other industries following the digitalization trend, where it leads to more individual benefits for customers. For example, based on car sensor data, insurance companies can build driver profiles and calculate a KPI for how closely the driver sticks to the rules, how fast they take curves and how much risk they take while overtaking others. Based on all of that, they can offer training and recommendations, and even build a gamified version of the insurance policy. The safer the customer drives, the cheaper the insurance gets. The driver earns points and awards, or even gets money back if they win the title of “the best driver” in the country.
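To make the driver-profile idea more concrete, here is a minimal Python sketch of how such a KPI might be computed from sensor samples. Everything in it is an assumption for illustration: the `TripSample` fields, the cornering threshold, the penalty weights and the discount formula are hypothetical and do not come from any insurer or from Cloudera.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TripSample:
    """One hypothetical sensor reading taken during a trip."""
    speed_kmh: float
    speed_limit_kmh: float
    lateral_accel_ms2: float  # sideways acceleration while cornering
    risky_overtake: bool      # assumed flag set by an upstream detector


# Assumed threshold above which cornering counts as "too fast".
MAX_LATERAL_ACCEL_MS2 = 3.0


def driver_score(samples: List[TripSample]) -> float:
    """Return a 0-100 safety score; higher means safer driving."""
    if not samples:
        raise ValueError("at least one sample is required")
    n = len(samples)
    # Share of samples in which each kind of risky behavior occurred.
    speeding = sum(s.speed_kmh > s.speed_limit_kmh for s in samples) / n
    cornering = sum(s.lateral_accel_ms2 > MAX_LATERAL_ACCEL_MS2 for s in samples) / n
    overtaking = sum(s.risky_overtake for s in samples) / n
    # Assumed weights; a real model would be calibrated on claims data.
    penalty = 0.4 * speeding + 0.3 * cornering + 0.3 * overtaking
    return round(100 * (1 - penalty), 1)


def premium(base_premium: float, score: float, max_discount: float = 0.2) -> float:
    """Scale the discount linearly with the score, up to max_discount."""
    return round(base_premium * (1 - max_discount * score / 100), 2)


trip = [
    TripSample(48, 50, 1.2, False),
    TripSample(62, 50, 1.0, False),   # speeding
    TripSample(45, 50, 3.8, False),   # hard cornering
    TripSample(50, 50, 0.9, False),
]
score = driver_score(trip)
print(score, premium(1000.0, score))
```

The linear discount mirrors the "the safer the customer drives, the cheaper the insurance gets" rule; points, awards or a leaderboard could be layered on top of the same score.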
What can organizations do to adopt a smart approach to working with Big Data?
All industries and organizations can benefit from Big Data. We recommend starting with our Innovation Workshops, where we jointly define use cases, digital processes and business models. Afterwards, we build a first prototype with modern, state-of-the-art technologies. This enables internal resources to make use of the new technologies and, at the same time, proves the idea and the business model.
Big Data is already used for predicting customer behavior and improving customer experience. But how can IT benefit from it as well? What’s your idea of Big Data for IT?
IT faces the challenge of keeping up with business expectations: the business requires IT to deliver Customer 360° or predictive marketing spend analysis, often on short notice. If you are not used to Hadoop, “R”, Python, graph databases or other new technologies, your options are limited. You will struggle with the new requirements and either deliver them “old school” using the expertise you already have, or try out the new technologies and most probably fail to leverage their full potential effectively.
But if you start with some IT-internal Big Data projects NOW, you can gain first experience. Fail early and often to build the best possible solution. For example, analyze all SAP log entries to find patterns and optimize the system, which benefits all users and IT. You could write the logs directly to Hadoop and analyze them with “R”, then feed the results back into SAP to visualize them with SAP Lumira. After some iterations, you will be able to actively support marketing, sales, supply chain or other business departments with Big Data ideas, technologies and methodologies.
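As a rough illustration of the log analysis idea, the following Python sketch aggregates response times per transaction and lists the slow ones. It uses Python rather than “R” and runs on an in-memory sample instead of Hadoop. The pipe-delimited layout (`timestamp|user|transaction|response_ms`), the sample rows and the 1000 ms threshold are assumptions made up for this example, not an actual SAP log format.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical log extract: timestamp|user|transaction|response_ms
SAMPLE_LOG = """\
2017-03-01T09:00:01|ALICE|VA01|420
2017-03-01T09:00:05|BOB|ME21N|1500
2017-03-01T09:01:12|ALICE|VA01|380
2017-03-01T09:02:40|CAROL|SE16|1200
2017-03-01T09:03:02|BOB|ME21N|2100
"""


def slow_transactions(log_text, threshold_ms=1000.0):
    """Return (transaction, call count, mean response ms) for slow transactions.

    A transaction counts as slow when its mean response time exceeds
    threshold_ms. Results are sorted with the slowest first.
    """
    durations = defaultdict(list)
    for row in csv.reader(io.StringIO(log_text), delimiter="|"):
        if len(row) != 4:
            continue  # skip blank lines and rows with an unexpected field count
        _timestamp, _user, transaction, response_ms = row
        durations[transaction].append(float(response_ms))
    report = [
        (transaction, len(values), mean(values))
        for transaction, values in durations.items()
        if mean(values) > threshold_ms
    ]
    return sorted(report, key=lambda entry: entry[2], reverse=True)


for transaction, calls, avg_ms in slow_transactions(SAMPLE_LOG):
    print(f"{transaction}: {calls} calls, {avg_ms:.0f} ms on average")
```

A first iteration like this is deliberately simple. Once the pattern proves useful, the same aggregation could be moved to the cluster and its results passed to a visualization tool.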
Datavard recently announced a partnership with Cloudera, the market leader in Hadoop distribution. What do you think are the main benefits of this partnership for the customers?
Cloudera is one of the best Hadoop providers on the market. In terms of security and performance, it is even the most mature and enterprise-ready one. With these capabilities, Cloudera and Datavard address the SAP market with two main scenarios. The first is data offloading: moving warm and cold data to either Hive or Impala. The second is integration: connecting your data lake (with sensor data, social media streams, documents, pictures and so on) to your SAP system to run predictions on the joined data set, or moving data from SAP to Hadoop in a transparent way to join it at the Hadoop level. Both ways are possible; depending on your skills and strategy, one or the other could be better. Therefore, Datavard designed an SAP-based middleware that lets you apply your SAP knowledge, such as software distribution, authorizations, data management or monitoring, natively on Hadoop.
Hadoop remains a hot topic in the SAP world, but many are concerned about the Hadoop skills gap. Do you think it’s still a challenge?
This is definitely a challenge. We see a lot of customers recruiting new Big Data teams. But many of them fail to industrialize the Big Data applications they create, because they lack established governance, development rules and scalability. Others try to answer the Big Data challenges with their good old in-house resources, technologies and tools.
Neither is ideal. We recommend a mixture of both worlds: use your enterprise-proven technologies as the core and enhance them with the new technologies where needed. Define clear handover rules and quality gates between the data science team and the ongoing operations team.
Where are you headed next? Which IT topics do you want to explore in 2017?
While Big Data is now recognized as predictive analytics for huge amounts of data, the next logical step is machine learning and artificial intelligence. At the same time, cloud is reaching the Big Data world as well. Cloudera is a fully secured platform: from data movement to data storage, everything is secured and encrypted. This will serve as the basis for delivering to customers a cloud-based user behavior analytics application for SAP and Hadoop. It will answer how users are using the data in your SAP/Hadoop environment: which reports and which data are accessed regularly or only for the month-end closing, and how users combine data to create new insights. On top of that, it will benchmark this information against the peers of your industry or size.