With Datavard Glue we have a strong solution for tight, native integration of SAP systems and data lakes running on big data platforms. Many companies decide to run their data lake in the cloud, which addresses scalability well while minimizing the work of running, patching, and updating Hadoop components. While many cloud providers offer Hadoop or similar technologies, many SAP customers prefer Microsoft Azure.
A less obvious question is which product from Microsoft’s Azure offering to choose for the SAP integration. There are several options when you plan to integrate your SAP landscape with Microsoft Azure:
- Plain storage: using Azure as additional space for your SAP landscape. This is a good choice for offloading cold and aged SAP data to improve SAP system operation and TCO. A reporting solution can also be connected directly to this storage space. Available storage options are Azure Blob Storage and Azure Data Lake Storage.
- Database as a service: companies rent a database in the cloud and use it as a middle layer for further processing or application development. A variety of databases is available, ranging from PostgreSQL to Microsoft SQL Server.
- Analytics platform: a platform for data processing with several options, including Hortonworks Hadoop (e.g. an HDInsight cluster), the Databricks Spark fork, or Microsoft SQL Data Warehouse.
All of these options have pros and cons that need to be evaluated against a concrete use case. For example, to create a reporting dashboard for your purchase orders, you might choose a different option than for user behavior analysis. What matters is using a flexible solution that keeps your options open and the range of future integration scenarios as wide as possible. With Datavard Glue and its modular storage management you can do exactly this.
For example, to connect an SAP ERP/BW system with HDInsight and Azure Data Lake Storage (ADLS), the Storage Management Layer of Datavard Glue needs to be configured on the SAP system with the correct logon and connection data for the cloud solution:
- HDInsight: set up the connection to the underlying storage and the Hive database login.
- ADLS: an Azure user with access and security certificates is required.
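To make the two configuration items more concrete, the connection data could be collected in a checklist like the one sketched below before it is entered on the SAP system. This is only an illustration in Python: the actual configuration is done in Datavard Glue's Storage Management on the SAP side, and every class and field name here (`HiveConnection`, `AdlsConnection`, `missing_fields`, and so on) is a hypothetical placeholder, not a Datavard Glue or Azure parameter name.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class HiveConnection:
    """Hypothetical checklist for the HDInsight side (names are assumptions)."""
    host: str             # cluster endpoint
    port: int             # Hive service port
    database: str         # target Hive database
    user: str             # Hive login
    storage_account: str  # storage underlying the cluster


@dataclass
class AdlsConnection:
    """Hypothetical checklist for the ADLS side (names are assumptions)."""
    account_name: str      # storage account
    azure_user: str        # Azure user with access to the storage
    certificate_path: str  # location of the security certificate


def missing_fields(conn) -> List[str]:
    """Return the names of settings that are still empty or unset."""
    return [f.name for f in fields(conn) if not getattr(conn, f.name)]


hive = HiveConnection(
    host="example-cluster.invalid",  # placeholder value
    port=443,
    database="sap_data",
    user="",                         # not filled in yet
    storage_account="examplestore",
)
print(missing_fields(hive))
```

Running the check before touching the SAP system shows at a glance which logon or connection detail still has to be requested from the Azure administrators.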
Once the configuration is complete, you can set up and execute data flows that extract SAP data and apply steps such as transformations, contextualization, lookups, enrichment, cleansing, and masking.
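What such a data flow does to each record can be sketched in a few lines of Python. This is a conceptual illustration only, not how Datavard Glue implements its flows; the field names (`po_number`, `vendor_name`, `country`) and the helper functions are hypothetical, and hashing is just one possible masking technique.

```python
import hashlib


def cleanse(row):
    """Cleansing: strip stray whitespace from all string values."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}


def enrich(row, country_names):
    """Lookup/enrichment: add a readable country name for the country code."""
    out = dict(row)
    out["country_name"] = country_names.get(row.get("country"), "unknown")
    return out


def mask(row, sensitive=("vendor_name",)):
    """Masking: replace sensitive values with a shortened SHA-256 digest."""
    out = dict(row)
    for name in sensitive:
        if out.get(name) is not None:
            digest = hashlib.sha256(str(out[name]).encode("utf-8")).hexdigest()
            out[name] = digest[:12]
    return out


def run_flow(rows, country_names):
    """Apply cleansing, enrichment, and masking to each extracted row."""
    for row in rows:
        yield mask(enrich(cleanse(row), country_names))


# Hypothetical purchase-order rows standing in for extracted SAP data.
rows = [{"po_number": " 4500000001 ", "vendor_name": "ACME GmbH ", "country": "DE"}]
result = list(run_flow(rows, {"DE": "Germany"}))
print(result)
```

The order of the steps matters: cleansing runs first so that the lookup and the masking hash both operate on normalized values.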