According to the Datavard survey, the average SAP system grows by 30-40% per year. On top of this constant data growth and exponentially rising costs, SAP systems often suffer from slow performance. System growth prevents SAP BW customers from migrating their SAP systems to HANA, while existing SAP HANA customers are unable to extend their current SAP HANA boxes.
To overcome the challenge of data growth, SAP's recommended data management strategy is to offload data using Nearline Storage (NLS). Traditionally, SAP recommended archiving data into SAP IQ. But there is a new player on the market, bringing a new temperature-based concept and new trends for offloading data: Hadoop is the new storage for warm and cold SAP data. Hadoop brings the perfect balance between low storage costs, easy extensibility and fast performance.
Which Hadoop Distribution Is the Best?
There are multiple storage options with Hadoop. In fact, with Datavard OutBoard Nearline Storage you can choose between different Hadoop storage options depending on the value and age of the data. The basic option is HDFS, which is good for cold data that is rarely accessed. Other options such as Impala, Hive or Spark are better suited to frequently accessed (WARM) data. Based on several implementations of Nearline Storage on Hadoop, the recommended option is Hadoop with Impala or Hive. To compare the costs and speed of different NLS options, see the article by Goetz Lessmann on benchmarking NLS solutions.
Option #1: Temperature-based Nearline Storage on Hadoop (recommended)
Option #2: Temperature-based Nearline Storage with IQ and Hadoop
Now let’s see how to set up Nearline Storage archiving into Hadoop in three easy steps:
STEP #1: Connect Hadoop to the SAP system using an HTTP RFC connection
In this example, we are adding a Cloudera Hadoop cluster. Go to transaction SM59 and add a new RFC connection of type “G – HTTP Connection to External Serv.”
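Before entering host, port and path prefix in SM59, it can help to know what the target URL on the Hadoop side looks like. The sketch below assumes the cluster exposes the standard WebHDFS REST API, which this article does not state explicitly. The host name, port, path and user are hypothetical placeholders, and the helper `webhdfs_url` is purely illustrative and not part of OutBoard or SAP.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op="LISTSTATUS", user=None):
    """Build a WebHDFS REST URL for the given HDFS path and operation.

    Assumption: the cluster exposes WebHDFS under /webhdfs/v1 over plain HTTP.
    """
    query = {"op": op}
    if user:
        # user.name is the WebHDFS parameter for simple (non-Kerberos) authentication.
        query["user.name"] = user
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{urlencode(query)}"


# Hypothetical values; replace them with the ones entered in SM59.
url = webhdfs_url("hadoop.example.com", 9870, "/user/sap/nls", user="sapnls")
print(url)
```

Opening such a URL from a browser or with curl on the SAP application server is a quick way to check that the host and port are reachable before testing the RFC connection itself.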
STEP #2: Define a new Archive storage in Storage Management
In OutBoard for Analytics Storage Management, we defined a new storage type for Hadoop. We also defined temperature-based data tiering: HOT data stays in SAP BW (HANA), archived data less than 5 years old sits in Hadoop Impala, archived data less than 15 years old sits in Hadoop HDFS, and anything older is deleted automatically. OutBoard is delivered as an add-on and extends the standard functionality of the SAP NLS interface.
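The tiering rule from this example can be sketched as a small decision function. This is only an illustration of the logic under the thresholds named above (5 and 15 years). It is not how OutBoard implements ageing profiles, and the function and constant names are hypothetical.

```python
from datetime import date

# Thresholds taken from the example ageing profile above (in full years).
IMPALA_MAX_AGE_YEARS = 5
HDFS_MAX_AGE_YEARS = 15


def age_in_years(record_date, today):
    """Return the number of full years between record_date and today."""
    years = today.year - record_date.year
    # Subtract one year if the anniversary has not been reached yet this year.
    if (today.month, today.day) < (record_date.month, record_date.day):
        years -= 1
    return years


def storage_tier(record_date, today, archived=True):
    """Pick the storage tier for a record based on its age (illustrative only)."""
    if not archived:
        return "SAP BW (HANA)"  # HOT data stays in the BW system
    age = age_in_years(record_date, today)
    if age < IMPALA_MAX_AGE_YEARS:
        return "Hadoop Impala"  # WARM data, queried frequently
    if age < HDFS_MAX_AGE_YEARS:
        return "Hadoop HDFS"    # COLD data, rarely accessed
    return "delete"             # past retention, removed automatically


today = date(2018, 1, 1)
for d in (date(2015, 6, 1), date(2010, 1, 1), date(2000, 1, 1)):
    print(d, "->", storage_tier(d, today))
```

Keeping the thresholds as named constants mirrors the idea of an ageing profile: the rule is defined once and then applied to every InfoProvider it is assigned to.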
STEP #3: Assign the ageing profile to the InfoProvider and you are ready to archive into Hadoop
Once the temperature-based ageing profile is defined, all we need to do is assign it to the InfoProvider. Then you are ready to archive into Hadoop NLS: go to RSA1, create a Data Archiving Process and archive.
Benefits of NLS archiving into Hadoop
To overcome the challenge of SAP system growth, SAP recommends offloading data using Nearline Storage. Traditionally, SAP suggested archiving data into SAP IQ. With the new temperature-based concept and the new trends in data offloading, however, Hadoop is the new storage for warm and cold SAP data. Hadoop brings the perfect balance between low storage costs, easy extensibility and fast performance.
The Datavard OutBoard for Analytics add-on is the only Hadoop-certified NLS solution (certified by Cloudera).