Businesses experiencing data growth face a rise in operational costs. To mitigate it, it is necessary to take a closer look at your data. How valuable is it and how often do you actually use it? We explain what you should focus on when considering a near-line storage solution for archiving.
Which Data on my BW system is actually cold?
For a successful near-line storage (NLS) project (see Implementing NLS for SAP BW), the key part of the process takes place before the actual implementation. We are talking about the system-usage analysis, which helps us to determine which InfoProviders (or more often which time-periods of the InfoProviders) are used almost never or rarely and could be safely moved into the NLS.
There is a difference between OLD and COLD: The case for OPERATIONS INTELLIGENCE
When we talk about archiving of InfoCubes or DSOs, there are different criteria for defining ‘usage’ for these types of objects, since both serve different purposes. We call this insight ‘operations Intelligence’.
InfoCube is the central object on which reports and analysis are based in SAP BW. It describes a self-contained data set within a business area, for which we can define queries. Therefore, the single most important KPI for an InfoCube, from the archiving potential point of view, is its usage in queries, i.e. the less queried (the colder), the better for archiving.
But what’s the added value of usage-analysis compared to simply archiving everything older than specified time-period, let’s say, 2 years? Let’s consider the following real-case example:
Fig. 1 shows us the comparison between the distribution of size and respective query usage (measured by no. of query executions) of the year partitions in all of the InfoCubes on a customer system. As we can see, although the majority of queries were run on the most recent data, there is still plenty of reports reading data older than 2 years. We could aim for data older than e.g. 4 years instead, but even then, the usage is quite significant, whereas with even older data we use most of the reasonable archiving potential.
The benefit of focusing on actual usage per InfoProvider instead of bulky archiving of presumably aged data is perhaps more evident when we compare the archiving potential of both approaches.
Fig. 2 shows archiving potential against the respective usage (in cumulative query executions). Archiving using operations intelligence allowed us to focus on rarely or infrequently accessed InfoCubes. As a result, we have not only achieved more space saving potential by moving larger amount of data into NLS, but also kept the overall reporting performance very high, since the reporting on NLS data is responsible only for 1% of total number of query executions on the system.
DSO (DataStore object) or WDSO (write-optimized DataStore) serves as a storage location for consolidated and cleansed transaction data or master data. BW customers normally use DSO/WDSO for further loading of transformed and aggregated data into InfoCubes on which the vast majority of actual reporting is being performed either directly or via MultiProviders. Only occasionally are DSOs used directly in reporting. Another important aspect of DSO usage are Lookups, which are used mainly in start- or end-routines in transformations of DTP (Data Transfer Process) and their purpose is basically to merge data from several InfoProviders, usually DSOs, during load processes higher up the data-stream.
To wrap up, the usage analysis of DSO/WDSO objects should focus on these vital KPIs:
1. DTP source usage
2. Lookup usage
3. Query usage (to a lesser extent)
During our NLS implementation projects, we tend to use an aggregated value of all the three KPIs, which streamline identification of so called ‘low-hanging fruit’ (high gain/low effort), i.e. DSOs with the lowest aggregated value (rarely used or not used at all).
Fig. 3 shows an example of output from our Datavard HeatMapTM software, which our customers use to gain fast and accurate insights to act upon, recognizing the actual usage of InfoProviders on their systems and thus utilizing their current or planned BW NLS solution to the maximum.
Fig. 3 List of DSOs sorted by aggregated value, in descending order
Best candidates for NLS – archiving are therefore InfoCubes least accessed in queries and DSOs with the lowest aggregated value of the three aforementioned KPIs. These objects possess the lowest value for either direct reporting or enablement of reporting, which is of course the ultimate purpose of any productive BW system.