Self-service business intelligence and data lakes are among the top priorities of CIOs and CDOs when it comes to creating new business value for their organizations. The trend is to make data available and transparent so that business users can get answers without having to involve IT. This is a real opportunity, but in many organizations the business data is stored in various SAP systems (SAP ERP, S/4HANA, SAP CRM, SAP SRM, etc.).
Data in traditional SAP archive solutions does not contribute to better business decisions
SAP systems have been around for decades, unlike most on-premises (Hadoop) or cloud-based (Google, Azure, AWS) data lakes, and large portions of their historical data are often archived. Traditional SAP archiving solutions store data in file-based storage in a compressed format, which makes it difficult to incorporate this data into corporate data lakes, let alone to run real-time analytics or machine learning algorithms on it and create real business value from it.
“Stop being a prisoner of your past. Become the architect of your future.”
OutBoard ERP Archiving can migrate or archive aged SAP data into corporate data lakes. More than 40 of the Fortune 500 companies already rely on Datavard solutions to bridge SAP with big data lakes, so that historical and recent SAP data are stored within a single corporate data lake. What is the business value of a data lake without SAP data, and what is the value of SAP data without historical SAP archives? Business data in SAP S/4HANA is often archived after as little as 2 years because of the rising costs of SAP HANA. This makes access to the historical SAP archive key to self-service business intelligence.
The power of the data lake and why SAP data is key
According to AWS, a data lake is a centralized repository that allows you to store all your structured and unstructured data at any scale. You can store data as-is, without having to structure it first, and run different types of analytics, from dashboards and visualizations to big data processing, real-time analytics, and machine learning, to guide better business decisions. According to Microsoft Azure, data lakes are also a cost-effective solution for running big data workloads. You can choose between on-demand clusters or a pay-per-job model when data is processed. Data lakes scale up or down based on business needs, and storage and compute scale independently, enabling more economic flexibility. Google says that a data lake is not just storage, and it is not the same as a data warehouse. Data lakes provide a scalable and secure platform that allows enterprises to ingest any data from any system at any speed.
As mentioned by Goetz Lessmann, there is a strong trend towards integrating SAP with data lakes, as business reporting without data coming from SAP systems is nearly impossible. The vast majority of SAP customers that run S/4HANA or are migrating to it need to significantly reduce their HANA footprint, and closed business documents are archived after as little as 2 years. Considering this, it is difficult to imagine data lakes and big data analytics without data from historical SAP archives (business data that is 3 to 10 years old).
Typical architecture with SAP historical data integrated into the data lake
Here comes the solution: data lakes enabled with a full set of SAP data, covering both recent hot data and historical SAP data. Structured data from SAP is combined with structured and unstructured data coming from other data sources (IoT, social media, non-SAP enterprise software, 3rd party or custom applications) and is enabled for big data processing and self-service business intelligence, to create additional business value and provide information for the right business decisions.
More and more companies are looking to make all enterprise data available in a data lake, whatever the technology. OutBoard ERP Archiving is a holistic archiving solution that moves data between the SAP database and external storage according to the usage or age of the data, regardless of the storage vendor (e.g. cloud-based or on-premises data lakes). OutBoard ERP Archiving is the only available solution that makes archived data available for further data analytics in the cloud data lake, because historical data can be provided in a transparent format on several data lake platforms, such as Hadoop Hive, Impala, AWS Redshift, Azure Data Lake Service, Azure Databricks, Google BigQuery, Snowflake, etc. Active data remains in the database during daily operations, while cold or old data is archived. Archived data can still be used for reports. In the data lake, all SAP data, including historical data, is enabled and extended with non-SAP data (e.g. customer attributes), which helps to guide better business decisions.
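The split between active and archived data, and the way both are brought together again for reporting, can be illustrated with a small sketch. This is not OutBoard ERP Archiving code and does not use any SAP or Datavard API. The names BusinessDocument, split_hot_cold and build_report_rows are hypothetical, and the rule "closed for more than 2 years means cold" is an assumption based on the 2-year figure mentioned above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class BusinessDocument:
    """Hypothetical, simplified stand-in for an SAP business document."""
    doc_id: str
    customer_id: str
    amount: float
    closed_on: Optional[date]  # None means the document is still open


def years_before(day: date, years: int) -> date:
    """Return the same calendar day `years` earlier (Feb 29 falls back to Feb 28)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


def split_hot_cold(
    docs: List[BusinessDocument], today: date, retention_years: int = 2
) -> Tuple[List[BusinessDocument], List[BusinessDocument]]:
    """Assumed rule: documents closed before the cutoff are cold, all others stay hot."""
    cutoff = years_before(today, retention_years)
    hot, cold = [], []
    for doc in docs:
        is_cold = doc.closed_on is not None and doc.closed_on < cutoff
        (cold if is_cold else hot).append(doc)
    return hot, cold


def build_report_rows(
    hot: List[BusinessDocument],
    cold: List[BusinessDocument],
    customer_segments: Dict[str, str],
) -> List[dict]:
    """Combine hot and archived documents and extend them with a non-SAP attribute."""
    rows = []
    for source, docs in (("database", hot), ("archive", cold)):
        for doc in docs:
            rows.append({
                "doc_id": doc.doc_id,
                "amount": doc.amount,
                "source": source,
                # Customer segment is an illustrative non-SAP attribute.
                "segment": customer_segments.get(doc.customer_id, "unknown"),
            })
    return rows


if __name__ == "__main__":
    documents = [
        BusinessDocument("A", "C1", 100.0, date(2015, 3, 1)),
        BusinessDocument("B", "C2", 250.0, date(2023, 6, 1)),
        BusinessDocument("C", "C1", 75.0, None),
    ]
    hot_docs, cold_docs = split_hot_cold(documents, today=date(2024, 1, 1))
    for row in build_report_rows(hot_docs, cold_docs, {"C1": "retail"}):
        print(row)
```

In this sketch a report sees every document, whether it still sits in the database or has been archived, which is the point of keeping historical data in a transparent format in the data lake.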