Self-service business intelligence and data lakes are among the top priorities of CIOs and CDOs when it comes to creating new business value for their organizations. The trend is to make data available and transparent so that business users can get answers without having to involve IT. This is a real opportunity, but in many organizations the business data is stored in various SAP systems (SAP ERP, S/4HANA, SAP CRM, SAP SRM, etc.).

Data in traditional SAP archive solutions does not contribute to better business decisions

SAP systems have been around for decades, unlike most on-premises (Hadoop) or cloud-based (Google, Azure, AWS) data lakes, and large portions of their historical data are often archived. Traditional SAP archiving solutions store data in file-based storage in a compressed format, which makes it difficult to incorporate this data into corporate data lakes, let alone run real-time analytics or machine learning algorithms on it and create real business value from it.

“Stop being a prisoner of your past. Become the architect of your future.”

Robin Sharma

OutBoard ERP Archiving can migrate or archive aged SAP data into corporate data lakes. More than 40 of the Fortune 500 companies already rely on Datavard solutions to bridge SAP with big data lakes, so that historical and recent SAP data are stored within a single corporate data lake. What is the business value of a data lake without SAP data, and what is the value of SAP data without historical SAP archives? Business data in SAP S/4HANA is often archived after as little as 2 years because of the rising costs of SAP HANA. This makes it essential to provide the historical SAP archive for self-service business intelligence.

Archived data accessible in the data lake for further consumption via Power BI, Tableau, etc.
SAP connected with data lake (SAP HANA and SAP historical archive)
Archived data remains accessible from SAP transactions via ArchiveLink. Depending on the data lake technology, access can be faster than with traditional archive solutions.

The power of the data lake and why SAP data is key

According to AWS, a data lake is a centralized repository that allows you to store all your structured and unstructured data at any scale. This guides better business decisions, as you can store data as-is without having to structure it first. You can run different types of analytics, from dashboards and visualizations to big data processing, real-time analytics, and machine learning. According to Microsoft Azure, data lakes are also a cost-effective solution for running big data workloads. You can choose between on-demand clusters and a pay-per-job model when data is processed. Data lakes scale up or down based on business needs, and scale storage and compute independently, enabling more economic flexibility. Google says that the data lake is not just storage, and that it is not the same as a data warehouse. Data lakes provide a scalable and secure platform that allows enterprises to ingest any data from any system at any speed.

As mentioned by Goetz Lessmann, there is a strong trend towards integrating SAP with data lakes, as business reporting without data coming from SAP systems is nearly impossible. The vast majority of SAP customers on S/4HANA or migrating towards S/4HANA need to significantly reduce their HANA footprint, so closed business documents are archived after as little as 2 years. Considering this, it is difficult to imagine data lakes and big data analytics without data from historical SAP archives (business data that is 3-10 years old).

Typical architecture with SAP historical data integrated into the data lake

Here comes the solution: data lakes enabled with a full set of SAP data, both recent hot data and historical SAP data. Structured data from SAP is combined with structured and unstructured data coming from other data sources (IoT, social media, non-SAP enterprise software, 3rd party or custom applications) and is enabled for big data processing and self-service business intelligence, to create additional business value and provide information for the right business decisions.

SAP connected with data lake (SAP HANA and SAP historical archive)

More and more companies are looking to enable all enterprise data in any data lake technology. OutBoard ERP Archiving is a holistic archiving solution that moves data between the SAP database and external storage according to its usage or age, regardless of the storage vendor (e.g. cloud-based or on-premises data lakes). OutBoard ERP Archiving is the only available solution that makes archived data available for further data analytics in the cloud data lake, because historical data can be provided in a transparent format for several data lake technologies, such as Hadoop Hive, Impala, AWS Redshift, Azure Data Lake Service, Azure Databricks, Google BigQuery, Snowflake, etc. Active data remains in the database during daily operations, while cold or old data is archived. Archived data can still be used for reports. In the data lake, all SAP data, including historical data, is enabled and extended with non-SAP data (e.g. customer attributes), which helps to guide better business decisions.
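To make the idea of age-based tiering more concrete, here is a minimal sketch in Python. It is not how OutBoard ERP Archiving is implemented, and it does not use any SAP or Datavard API. The record shape (Document), the function names, the 2-year retention default, and the customer attribute lookup are all assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from datetime import date


# Hypothetical record shape; the field names are illustrative,
# not real SAP table fields.
@dataclass(frozen=True)
class Document:
    doc_id: str
    customer_id: str
    closed_on: date
    amount: float


def years_before(day, years):
    """Return the same calendar day `years` earlier (29 Feb falls back to 28 Feb)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


def split_by_age(docs, today, retention_years=2):
    """Split documents into (active, archive) around the retention cutoff.

    Assumption: documents closed before the cutoff count as cold data.
    """
    cutoff = years_before(today, retention_years)
    active = [d for d in docs if d.closed_on >= cutoff]
    archive = [d for d in docs if d.closed_on < cutoff]
    return active, archive


def enrich(docs, customer_attributes):
    """Extend each document with non-SAP attributes keyed by customer ID."""
    return [
        {
            "doc_id": d.doc_id,
            "closed_on": d.closed_on.isoformat(),
            "amount": d.amount,
            **customer_attributes.get(d.customer_id, {}),
        }
        for d in docs
    ]


if __name__ == "__main__":
    docs = [
        Document("A", "C1", date(2019, 5, 1), 100.0),
        Document("B", "C2", date(2023, 3, 1), 50.0),
    ]
    active, archive = split_by_age(docs, today=date(2024, 1, 15))
    # In the data lake, archived and active documents are analyzed together.
    for row in enrich(archive + active, {"C1": {"segment": "retail"}}):
        print(row)
```

The point of the sketch is the last step: once cold documents sit next to hot ones in the lake, both can be joined with non-SAP attributes in a single pass.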
