Datavard has recently migrated the world's biggest SAP BW system (160 TB) to the SAP HANA database with near-zero downtime. In the following lines, I would like to share my experience from this extraordinary journey with you.
The largest food company in the world hits the road to HANA
Our customer had a clear program objective – migrate their entire SAP landscape to the HANA database while minimizing the system outage for business users. We took care of their four SAP BW systems, 300 TB in total, with the biggest one storing over 160 TB of data.
Together with our partner SAP, we implemented a co-innovative approach to upgrade the BW system. It is based on the Datavard Lean System Copy service for BW and the business transformation methodology – you can find more about this here.
In a nutshell, the idea is to run a tool-based migration of the productive data to a new, empty system running on the desired release and database. Test cycles (typically one to three) happen on separate hardware (a dedicated sandbox) with production-like data to avoid surprises during the actual productive system upgrade. The tool-based migration makes it possible to migrate data selectively (i.e. keep old, unused data on the old system) as well as to minimize the business outage window. The development and quality systems can be created in various ways; most commonly, a test cycle execution builds the future quality system, and the Lean System Copy service builds the future development system. The standard DMO was not an option because of its limitations regarding business outage minimization.
The challenges of upgrading a 160 TB BW 7.3 system to 7.4 on HANA in under 48 hours
Back to our example. During the program, we upgraded four BW production systems to HANA, each larger than 70 TB, each with strict requirements to minimize business impact (the outage was initially planned at less than 48 hours), and one of them with its data center being moved across the ocean. Fortunately, the biggest one (160 TB) stayed within the same data center, so network throughput was not a big challenge.
The landscape of the BW system was built as a 4-tier, with development, quality, preproduction and production systems, including data outboarded to NLS. (How we processed data archived to NLS with our OutBoard solution will be described in another blog post.) Due to the hardware costs of a sandbox system, we could not follow our best-practice methodology (i.e. use the production system data for our test upgrade runs). Therefore, we followed the standard development path – starting with the development system, continuing with quality, up to production – which made the project trickier, especially regarding the business outage estimation.
Another challenge was the hardware. At the time, HANA hardware for such a big BW system was simply not available, and we wanted to ensure that the box would be scalable enough to operate the system in the years to come as well.
Running the near-zero downtime migration in numbers
Everybody who knows today's hardware and network speed capabilities understands that we could not migrate all 160 TB in one shot within a 48-hour outage. Having experience with near-zero downtime approaches from our business transformation scenarios, it was relatively easy to come up with a solution – you need to process part of the data during uptime (the time before the system outage). Typically, this is done using DB triggers. However, this was not possible in our case, due to the storage requirements and the performance impact on the production system.
Therefore, we used our Datavard Insights solution: its data-usage component gave us a picture of which data is static (i.e. not changed, or changed rarely) and which is dynamic (changed often). We adjusted the software to alert us whenever the static part was changed, so that we could reprocess it. This way, we processed most of the data before the migration with minimum impact on the business. The main restrictions during the uptime phases were a transport freeze (only exceptional, very high-priority transports could be moved to production), an archiving freeze (no new archiving to NLS was allowed) and preferably no cube compression (which was not available at all times); all other functionality remained fully available.
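The static/dynamic split and the change-alerting idea can be sketched roughly as follows. This is a minimal illustration of the concept, not the actual Datavard Insights implementation; the table names, the change-frequency threshold and all function names are hypothetical:

```python
from dataclasses import dataclass

# Sketch of the idea: tables that change rarely ("static") are migrated during
# uptime; if a supposedly static table changes after it has been migrated,
# it is flagged so the delta can be reprocessed before or during cutover.

@dataclass
class TableStats:
    name: str
    changes_per_day: float    # observed change frequency from monitoring
    migrated: bool = False
    needs_reprocess: bool = False

STATIC_THRESHOLD = 0.1  # hypothetical cutoff: fewer changes/day counts as static

def classify(tables):
    """Split tables into static (uptime migration) and dynamic (downtime)."""
    static = [t for t in tables if t.changes_per_day < STATIC_THRESHOLD]
    dynamic = [t for t in tables if t.changes_per_day >= STATIC_THRESHOLD]
    return static, dynamic

def record_change(table):
    """Alerting hook: a change to an already-migrated table triggers reprocessing."""
    if table.migrated:
        table.needs_reprocess = True

tables = [TableStats("SALES_2009", 0.0), TableStats("SALES_CURRENT", 120.0)]
static, dynamic = classify(tables)
for t in static:
    t.migrated = True          # processed in the uptime phase
record_change(static[0])       # an unexpected change to "static" data arrives
to_redo = [t.name for t in static if t.needs_reprocess]
```

In this sketch, only the flagged tables and the dynamic set need to be touched again during the downtime window, which is what kept our outage within bounds.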
The cutover started three weeks prior to the system outage with the migration of the static data (the uptime phase), where we used approximately 25% of the system's power, mainly during the quiet hours. The dynamic data and the changed static data were processed within 42 hours (the downtime phase). The overall user outage window was up to 88 hours (including our downtime phase). During this time, we managed to test the complete migration (more than 40,000 test scripts executed in less than 4 hours) as well as switch all interfaces, enabling and validating all loads from all the source systems.
The whole process was closely monitored on the HANA database as well as on the BW application level. This allowed us to minimize the impact on users during the uptime phase and maximize performance by "juicing" the hardware to its limits. We also used our tools to stress-test the HANA database and tune it, along with application parameters, to be ready for productive usage.
The migration would not have been possible without our in-house tools
The successful migration was possible thanks to our know-how from previous projects as well as Datavard software portfolio:
- Lean System Copy service as the foundation for the HANA upgrade. It is a tool-based service for building an empty system copy (typically of the production system) with no master or transactional data. Such a system copy serves as the foundation for the future development system as well as a template for the future production system. The empty system can also be upgraded very quickly, and only once.
- OutBoard DataTiering – a nearline storage archiving solution achieving a 90% data compression rate
- Datavard Insights for identifying which data is static and which is dynamic, as well as real-time monitoring
- Datavard Validate for automated testing, which ran 40,000 test scripts in less than 4 hours
If you have further questions or feedback, feel free to contact us. We will be happy to support you on your way to SAP HANA.