Recent trends in the SAP world point to the importance of gathering structured and unstructured data from various sources, such as SAP systems, file systems, or social media, in one place: the data lake. This enables further analytics, diagnostics, and predictions, which in turn make business decisions easier. Cloud-based storage options from Google, Microsoft, and Amazon, as well as on-premise Hadoop data lakes, are suitable platforms that can be integrated with SAP ECC, SAP S/4HANA, SAP BW, SAP BW/4HANA, and SAP HANA Native using Datavard’s solutions OutBoard DataTiering, OutBoard ERP Archiving, Glue, or DataFridge. The typical architecture with SAP historical data integrated into the data lake, as well as a demo of archiving financial documents, is outlined in the blog post written by Jan Meszaros.

In this blog post, I will elaborate on two important aspects of moving historical SAP archives to a cloud-based or on-premise data lake. The first is the architecture, with two possible scenarios for integrating SAP with a data lake. The second is security when connecting SAP with data lakes.

Integration scenarios and their key benefits 

Depending on the complexity of the SAP ECC or S/4HANA landscape, we recommend two options for integrating OutBoard ERP Archiving, a holistic archiving solution that moves data between the SAP database and external storage. Both options work regardless of the storage vendor (e.g. cloud-based or on-premise data lakes), and data can be moved according to its usage or age within a customer’s landscape.

The first option is a centralized architecture, where OutBoard ERP Archiving is installed on an SAP system that is not itself subject to archiving. The systems that are subject to archiving, the so-called “client systems”, communicate with OutBoard ERP Archiving on the central system via the ArchiveLink interface within the internal corporate network. A centralized architecture is recommended for organizations with complex SAP landscapes containing multiple SAP production systems. As the infographic below shows, a key advantage of a centralized deployment is that the archive service is installed only on the central system, while archiving of aged data or migration of the historical archive is still possible from all connected clients.

Centralized architecture – Archive Service runs on a separate SAP instance, and SAP systems that are archived are connected

The second option is a decentralized architecture, where OutBoard ERP Archiving is installed on each client system. This type of deployment is recommended for organizations with only one production system line. The biggest advantage is that the archiving client and the archiving service run on the same SAP system, which avoids any potential bottleneck in the network connection between a central system and the client. An additional dedicated SAP system for hosting OutBoard ERP Archiving is not necessary.

Decentralized architecture – Archive Service runs on each SAP system to be archived.
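
To make the difference between the two deployments concrete, here is a minimal Python sketch of which system hosts the archive service for each archived system. It is an illustration only, not part of any Datavard product: the `Landscape` class, the function names, and the system IDs (`PR1`, `PR2`, `ARC`) are hypothetical, and the "more than one production system" threshold is simply a reading of the recommendations above.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Landscape:
    """Hypothetical description of an SAP landscape (illustrative only)."""
    production_systems: Tuple[str, ...]   # systems whose data is archived
    central_system: Optional[str] = None  # dedicated host that is not itself archived


def archive_service_hosts(landscape: Landscape) -> Dict[str, str]:
    """Map each archived system to the system that hosts its archive service."""
    if landscape.central_system is not None:
        # Centralized: every client system talks to the one central service.
        return {sid: landscape.central_system for sid in landscape.production_systems}
    # Decentralized: archiving client and archive service share the same system.
    return {sid: sid for sid in landscape.production_systems}


def recommend_architecture(production_system_count: int) -> str:
    """Rule of thumb from the text: several production systems -> centralized."""
    if production_system_count < 1:
        raise ValueError("at least one production system is required")
    return "centralized" if production_system_count > 1 else "decentralized"


if __name__ == "__main__":
    central = Landscape(production_systems=("PR1", "PR2"), central_system="ARC")
    local = Landscape(production_systems=("PR1",))
    print(archive_service_hosts(central))
    print(archive_service_hosts(local))
    print(recommend_architecture(len(central.production_systems)))
```

In the centralized case every client maps to the same host, which is why the network path between clients and the central system matters; in the decentralized case each system maps to itself and no extra host is needed.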

A security concept matters

Security of the archived data and of the communication interfaces is crucial, especially when connecting to cloud data lakes. Both the centralized and the decentralized architecture are built on secure communication between the archiving client and the archiving service, enabled by the ArchiveLink signature concept and Secure Network Communications (SNC). Since outbound communication towards the storage media often leaves the internal corporate network, it must always be protected. Secure communication between Outboard DataTiering and the cloud solution is achieved by using a secure protocol (e.g. HTTPS, TCP/IP with SSL, or secured NFS), depending on the API used by the storage connector, together with a platform-specific authentication and authorization concept (e.g. Kerberos, Shared Access Signature (SAS) tokens, or Active Directory) and user permission management.
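
As a rough illustration of the "always protected" rule for outbound traffic, the following Python sketch rejects storage endpoints that do not use HTTPS and builds a TLS context with certificate and hostname verification enabled. It uses only the Python standard library and does not reflect how the Datavard storage connectors are implemented; the function names, the `ALLOWED_SCHEMES` set, and the example URL are assumptions made for this example.

```python
import ssl
from urllib.parse import urlparse

# Assumption for this sketch: the storage API is reached over HTTPS only.
ALLOWED_SCHEMES = {"https"}


def require_secure_endpoint(url: str) -> str:
    """Return the URL unchanged if it uses an allowed secure scheme, else raise."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"insecure or unsupported scheme: {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError("endpoint URL has no host name")
    return url


def strict_tls_context() -> ssl.SSLContext:
    """Create a client TLS context that verifies certificates and host names."""
    context = ssl.create_default_context()  # CERT_REQUIRED and check_hostname by default
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions
    return context


if __name__ == "__main__":
    endpoint = require_secure_endpoint("https://storage.example.com/archive")
    context = strict_tls_context()
    print(endpoint, context.check_hostname)
```

Authentication and authorization (Kerberos, SAS tokens, Active Directory) sit on top of this transport layer and are platform specific, so they are deliberately left out of the sketch.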

OutBoard ERP Archiving is the only available solution that enables secure storage of archive data in a cloud data lake and makes that data available for further analytics.
