Security Considerations for ERP Data Usage on Hadoop

In a previous post, I discussed how to integrate SAP natively with Big Data platforms, using Datavard Glue as the middleware and workbench to identify data, build ETL processes, and consume the data on, for example, data lakes.

Datavard Glue is natively integrated into SAP’s business applications. It is built on the ABAP stack, so the integration comes naturally, and Glue can use both the SAP authorization concept and the SAP Transport Management System (TMS).

  • Authorizations: administrators define roles and profiles so that only the “right” users can access data, which makes Glue safe to use.
  • SAP TMS: by using the SAP Transport Management System for Big Data objects such as tables on Hadoop and data extractors, Datavard Glue lets users work with SAP as it was meant to be used and follows SAP best practices for software logistics. Development takes place on an SAP development system and testing on an SAP QA system; only after a successful test are Datavard Glue scenarios, data models, and ETL data flows transported to SAP production.

This has some serious advantages over traditional ETL tools.

With a classic ETL tool, users tap directly into the SAP system database. This may sound easy and straightforward, but it opens a Pandora’s box when it comes to security, for example in determining data access rights. It is advisable to limit the ETL tool’s access to a defined list of tables, and to a subset of the data in these tables.

Because the security features in Datavard Glue are advanced, yet simple to implement, I like to compare classic ETL tools and Glue using the analogy of protecting a home. With a classic ETL tool, somebody simply drills a tunnel into your basement, perhaps enters your house while you are not there, walks around your living room, checks your music collection, and picks some things from your fridge. If, however, access happens through Datavard Glue, this person cannot simply break into the house. They have to ring the doorbell, the owner decides whether to let them in, and even then they may only get access to the fridge and the music collection.

On a technical level, Datavard Glue can restrict access to individual tables, to SAP modules, to SAP packages (formerly known as development classes), and even to a subset of the data they contain.

The security concept of Datavard Glue supports several levels of restrictions:

  • Users with access to data need to have valid SAP logons
  • Access to Glue ETL functionality can be restricted to transaction codes
  • Data access can be restricted to SAP modules
  • Data access can be restricted to SAP packages (development classes)
  • Data access can be restricted to individual SAP tables
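The layering above can be pictured as a chain of checks that must all pass. The following is a minimal Python sketch of that idea only; it is not Glue's implementation, which runs in ABAP and relies on SAP authorization objects. The names `GlueRole`, `TableInfo`, `may_access`, and the transaction code `ZGLUE_ETL` are hypothetical, and the table, package, and module values are illustrative.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TableInfo:
    """Where a table sits in the repository hierarchy (illustrative values)."""
    name: str
    package: str
    module: str


@dataclass
class GlueRole:
    """Hypothetical role model.

    Transactions must be granted explicitly. For modules, packages and
    tables, an empty set means "no restriction at this level".
    """
    transactions: set = field(default_factory=set)
    modules: set = field(default_factory=set)
    packages: set = field(default_factory=set)
    tables: set = field(default_factory=set)


def _allowed(value, allowed_values):
    # An empty restriction set does not restrict anything.
    return not allowed_values or value in allowed_values


def may_access(has_valid_logon, role, tcode, table):
    """Return True only if every restriction level lets the request through."""
    if not has_valid_logon:            # level 1: valid SAP logon
        return False
    if tcode not in role.transactions:  # level 2: transaction code
        return False
    return (
        _allowed(table.module, role.modules)        # level 3: SAP module
        and _allowed(table.package, role.packages)  # level 4: SAP package
        and _allowed(table.name, role.tables)       # level 5: single table
    )


# A role that may run one (hypothetical) Glue transaction on SD tables only.
sd_role = GlueRole(transactions={"ZGLUE_ETL"}, modules={"SD"})
vbak = TableInfo(name="VBAK", package="VA", module="SD")
bseg = TableInfo(name="BSEG", package="FBAS", module="FI")

print(may_access(True, sd_role, "ZGLUE_ETL", vbak))  # True
print(may_access(True, sd_role, "ZGLUE_ETL", bseg))  # False
```

The point of the sketch is that the levels combine with a logical AND: a user with a valid logon and the right transaction code is still refused when the requested table lies outside the permitted module, package, or table list.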

To achieve this, Datavard Glue uses some standard SAP authorization objects, similar to what the SAP data browser (SE16) does. On top of this, Glue provides Datavard-specific authorization objects that can be assigned to user roles and profiles. This way, administrators can restrict the access of Glue users, e.g. to a range of tables belonging to an SAP module. And if they cannot remember these tables by heart, they can rely on the table assignment within the SAP system.
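To illustrate why the table assignment helps, here is a small Python sketch that expands a module-level restriction, or a table name range, into a concrete table list. It is an illustration under stated assumptions, not Glue functionality: `TABLE_ASSIGNMENT` is a hypothetical, hand-written stand-in for the assignment the SAP system maintains, and the function names are made up for this example.

```python
from fnmatch import fnmatchcase

# Hypothetical stand-in for the table-to-module assignment held in the
# SAP system; the entries are illustrative only.
TABLE_ASSIGNMENT = {
    "VBAK": "SD",
    "VBAP": "SD",
    "VBUK": "SD",
    "BKPF": "FI",
    "BSEG": "FI",
    "MARA": "MM",
}


def tables_for_module(module, assignment=TABLE_ASSIGNMENT):
    """List every table assigned to the given module."""
    return sorted(t for t, m in assignment.items() if m == module)


def tables_matching(pattern, assignment=TABLE_ASSIGNMENT):
    """List every table whose name matches a wildcard pattern such as 'VB*'."""
    return sorted(t for t in assignment if fnmatchcase(t, pattern))


print(tables_for_module("SD"))  # ['VBAK', 'VBAP', 'VBUK']
print(tables_matching("B*"))    # ['BKPF', 'BSEG']
```

With a lookup like this, a role can name a module or a name range, and the individual tables follow from the assignment rather than from anyone's memory.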