Tapping into SAP ERP: Rapidly Building ETL Processes with Data Contextualization

translate SAP fields to friendly fields

SLT and BODS do not always make things easier

In this blog post, I want to show how easy it can be to tap into ERP data from SAP, i.e. find the relevant data, read, filter, enrich and contextualize it, and finally store the data natively on HIVE for further data processing.

You may appreciate this run-through if you are performing the same tasks with SAP out of the box solutions, such as a combination of SLT and BODS. In their own right these SAP solutions are extremely powerful and especially BODS has a wide scope. However, for integration between SAP’s business solutions where you need to unravel the mysteries of SAP’s ERP data model on top of performing delta capture and run ETL processes, we feel that the solution with SLT and BODS tends to be overcomplicated.

An alternative solution

In order to enrich a data lake (e.g. HIVE tables on Hadoop) using Datavard Glue, you need to perform a one-time setup of Datavard’s storage management component. This is similar as creating a SAP RFC-connection: you let Glue know what storage technology is where.

Next, you can use Glue’s BPL (Business Process Library) to identify relevant SAP tables. Or, if you know which data you want, you can simply go for the tables directly. The beauty of this solution is that you will find plenty of content in the BPL above and beyond information on SAP tables. For example, you can automatically translate SAP fields to friendly fields, and using the BPL Business Functions you can implement lookups for data contextualization without the need for any coding.

The solution is very light-weight, and since it is implemented in SAP’s own ABAP technology integrates natively. When it comes to agility and security, this has major advantages. I will discuss the topic of security in the next blog post of this series. However, when it comes to agility, you can literally create a data model for your business data on Hadoop (e.g. for financial documents), set up a data flow, and run that data flow within minutes.

The individual steps for this are:

  1. Optional: use the Glue BPL (Business Process Library) to identify the relevant table you want to tap into.
  2. My recommendation would be (for transactional data) to use the line item table (e.g. BSEG) as a template and create this table on Hadoop using the Glue BPL “friendly fields” option.
  3. Add fields from the header table you may need (e.g. the posting date, user, document type, etc. from BKPF).
  4. Create an extractor with a 1:1 field mapping between SAP and Hadoop. For the fields from the header table, you can easily use the BPL functions in the extractor to populate the additional fields.
  5. Create & execute a variant for your extractor.

These steps don’t take much time or effort. The trick is to leverage the Glue BPL, along with the SAP DDIC, to unlock the SAP data. This screenshot shows an example for how a HIVE table with SAP Finance data can look on Hadoop using the BPL “friendly fields” function:

hive table on hadoop datavard glue

To populate this table, we simply use a field mapping and populate additional fields using BPL Business Functions, e.g. “POSTINGDATE_OF_FIDOC”. These can be easily identified through the search help for business functions.

datavard glue

In a nutshell, these functions work like lookups, but with the additional benefit that anybody who can use a tool like MS Excel with calculation functions in cells can use the business functions. This way, you can implement lookups into other tables – without knowing the table names and without knowing how to code in ABAP!