Data-driven: is it just a marketing buzzword?
I was amused to learn that almost everybody in the IT industry now claims to “enable the Data-Driven Enterprise!” But when I ask our customers “are you data-driven?”, I usually get the answer “nah, not really… and what does that mean anyway!?” However, when I ask them “well then, how do you make your decisions if not based on facts and data…?”, they quickly agree that of course they make their decisions based on sound analysis. After all, they have their dashboards, their Business Warehouse and plenty of experience in combining all the bits and pieces of data they have… right? But the good old business warehouse as we know it is close to retirement, and the fast pace of technological change doesn’t make it any easier to make sense of it all.
The classical data warehouse no longer has all the answers
The concept of the Business Warehouse dates back to the 90s. It was designed to overcome the limitations of classical ERP systems and databases. The BW system collects data from several sources, aggregates it and “hammers” it into the shape required for reporting purposes. The typical questions back in the day were “what has happened?” and “why did that happen?”. Now, 20+ years later, the questions go above and beyond that, and some modernization is in order to provide users with valid information and answers.
The classical data warehouse / business warehouse has reached its limits and simply cannot keep up with today’s requirements.
Business warehouse is too slow, too clumsy and too limited
The classical Business Warehouse has lost much of its original appeal. It is now seen as clumsy and bulky. Users criticize its usability, its lack of clarity, and its limited ability to deal with a multitude of heterogeneous data sources. Another problem is the complexity of the related development processes. In a classical Business Warehouse, it takes quite a while until any new development is live, in production, and filled with data and meaning. (Everybody who has ever built a data model in, for example, an SAP BW, filled it with data from transactional systems, put some queries on top, and moved it from a development system to production will know what I’m talking about!)
At the same time, users are more technically versatile than they were 20 years ago. Back in the day, Excel skills may have been considered the pinnacle of business analytics. Nowadays, business users are asking for self-service BI, among other features. To some degree, this is of course a consequence of the complicated and lengthy development processes. The resulting frustration simply leads business users to demand: “hey! Just let me build it myself!”
Organizations have reacted to these challenges in multiple ways. Typically, the reaction is automation, to achieve improvements and acceleration. The reasoning behind that is fairly simple: if data management and data acquisition processes are automated and executed more frequently, then new developments will naturally take less time, and business users will get the results they need faster. Of course, TCO considerations also rely heavily on automation, because a higher degree of automation allows systems and landscapes to be operated on a smaller budget.
Today, reporting is no longer about showing historical data. It has to offer forecasting features backed by data from multiple sources.
From a data perspective, there are new requirements as to what to do with the data. While business warehouses in the past were mostly used for classical reporting and dashboarding (“what has happened and why?”), nowadays the requirements include predictive and prescriptive analytics (“what will happen, and what should I do about it?”). Data warehouses can keep up with these challenges to some degree, but it is always a struggle. Other required technologies are search & discovery, natural language processing, and text analysis. Ultimately, new technologies and tapping into new data sources are key to implementing such scenarios successfully in a truly data-driven enterprise.
Data lake complements ERP and BW: welcome to your new, data-driven playground
The “data lake” is a logical complement to the classical world of ERP and transactional data, and to the world of BW and classical reporting. While the term “data lake” may be over-used today after the hype of the last few years, it is still a viable concept for collecting structured and unstructured data from various sources. This over-use is one of the reasons why I personally prefer the term “Data Management Platform” over the good old data lake.
A truly data-driven enterprise consists of three pillars:
- A classical process-driven area with transactional data, mostly based on business processes and ERP. This part of the enterprise can be called the “process factory”.
- An “information factory”, which is based on the data warehouse and reporting.
- A “data lab”, which takes available data to the next level, i.e. data from processes and reporting as well as from external data sources. Such a data lab will be the domain and playground of data scientists, with the appropriate tools and technologies of course.
Clean Master Data is the Foundation of the Data-Driven Enterprise
Running advanced analytics on a data management platform requires standardized day-to-day data management processes. Such processes range from data acquisition, classical ETL, and simple lifting and shifting of data to more complex areas such as data science and the integration of an enterprise-wide data catalog. The catalog serves as a dictionary of what data is available where, and is coupled with access management. In the age of exponential data growth and GDPR, finding the right data and providing appropriate access to it is more important than ever before.
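To make the idea of a data catalog coupled with access management a bit more tangible, here is a minimal sketch in Python. It is purely illustrative: the class names, dataset names, and roles are all made up for this example and do not describe any particular product. The catalog records which dataset lives in which system, and a search only returns entries the requesting role is allowed to see.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One dataset in the (hypothetical) enterprise-wide catalog."""
    name: str
    system: str               # where the data lives, e.g. "ERP" or "BW"
    description: str
    allowed_roles: frozenset  # roles permitted to discover this dataset


class DataCatalog:
    """A dictionary of what data is available where, with a simple role check."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.name] = entry

    def search(self, keyword, role):
        """Return sorted names of datasets matching keyword that role may see."""
        keyword = keyword.lower()
        return sorted(
            e.name
            for e in self._entries.values()
            if role in e.allowed_roles
            and (keyword in e.name.lower() or keyword in e.description.lower())
        )


# Illustrative entries only; names and roles are assumptions.
catalog = DataCatalog()
catalog.register(CatalogEntry(
    "sales_orders", "ERP", "Transactional sales order items",
    frozenset({"analyst", "data_scientist"})))
catalog.register(CatalogEntry(
    "customer_master", "ERP", "Customer master data incl. personal data",
    frozenset({"data_steward"})))
catalog.register(CatalogEntry(
    "sales_cube", "BW", "Aggregated sales figures for reporting",
    frozenset({"analyst"})))

print(catalog.search("sales", "analyst"))
print(catalog.search("customer", "analyst"))
```

In this sketch an analyst searching for “sales” finds both the ERP table and the BW cube, while the customer master data, which contains personal data, stays invisible to that role. A real catalog would of course add lineage, ownership, and auditing on top.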
Not surprisingly, leading studies and institutions such as BARC have consistently listed Master Data Management, Data Quality, Data Discovery, Data Governance, and the overall challenge of establishing a data-driven culture as the most important trends over the last couple of years. The topic of self-service BI only follows after these priorities…
Data-driven enterprise calls for a new skill set and cross-functional teams
The data-driven enterprise requires rules not only for data governance, but also for the collaboration of experts from different areas. Such experts include business analysts, architects, data scientists, and end users. Often, the majority of end users will be data producers. Such users work on the transactional side of data, ranging from data entry to day-to-day business processes. Power users and architects will be working in the information part of the data-driven enterprise, for example building data pipelines and data models in a data warehouse or even on the data lake. Finally, data scientists and data engineers will be working on the data management platform (call it “Data Lake” if you will…). What matters most is a bi-directional feedback loop between data producers and data scientists.
How to support all three pillars of a data-driven enterprise
To be truly successful, any enterprise which considers itself “data-driven” will need not only such experts and processes, but also a sound foundation on which to operate and store the data. At Datavard, we support enterprises in the transformation to become “data-driven”, and in optimizing data management processes above and beyond this transformation. We do this with 20 years of experience and the set of tools and technologies which we have built over these years.
Datavard technologies help on all three pillars of the data-driven enterprise:
- For the process factory: the Data Transformation Suite for all kinds of data transformation, for example to adapt data to changes in business processes or the organizational structure of the enterprise (e.g. in the context of mergers and acquisitions or any S/4HANA or B/4 move as part of the modernization efforts of our customers).
- For the information factory: our OutBoard suite for SAP facilitates dealing with data of different levels of relevance (ancient data which is considered “cold”, warm data, and hot data which users require at the speed of light), e.g. to archive data while keeping it accessible.
- For the “data lab”: we build a bridge between different technologies and implement business processes and data management processes (through Datavard Glue). For example, identifying the relevant information from the process factory, exposing it to a data management platform, and cataloguing that data in a data catalogue are vital pieces of the puzzle to truly turn data science into tangible benefits for day-to-day business.
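The distinction between hot, warm, and cold data mentioned for the information factory can be illustrated with a small Python sketch. Please note that this is only an assumption-laden example: the function name and the age thresholds are invented for illustration and say nothing about how OutBoard actually classifies data. The idea is simply that data which was accessed recently is “hot”, while data that has not been touched for years is “cold” and becomes a candidate for archiving.

```python
from datetime import date

# Hypothetical thresholds; in practice these depend on the business context.
HOT_MAX_AGE_DAYS = 90
WARM_MAX_AGE_DAYS = 730


def classify_temperature(last_accessed, today):
    """Classify data as hot, warm, or cold by days since its last access."""
    age_days = (today - last_accessed).days
    if age_days <= HOT_MAX_AGE_DAYS:
        return "hot"
    if age_days <= WARM_MAX_AGE_DAYS:
        return "warm"
    return "cold"


today = date(2024, 1, 1)
for last_accessed in (date(2023, 12, 1), date(2023, 1, 1), date(2020, 1, 1)):
    print(last_accessed, classify_temperature(last_accessed, today))
```

Cold data identified this way could then be moved to cheaper storage while remaining accessible for the occasional query.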