Predictive analytics – peeking into the future is easier than you think
In this article, I want to show how simple it can be to combine SAP data with some very basic predictive analytics through Datavard Glue. Why choose this to get started? Simple. A common misconception is that predictive analytics requires vast in-memory computing power, is very difficult and costly, and can only be achieved with certain products and vendors.
The brutal truth is that, nowadays at least, NONE of this is true! There are beautiful, powerful solutions available which are open source, free, safe, and easy to use. Let’s call such a solution a PRL, i.e. a “Public, Ready-to-use Library” (a term I just made up). Using Glue, it is actually easy to integrate such PRL solutions. In a way, this article is the first of probably many that strive to debunk some myths around the complexity of Big Data. And it is probably time to do so: it is 2019 after all, and what was considered “Big” Data just a few years ago has simply become “Data” by now.
How will your BBQ sets sell next year? Let’s ask SAP ERP
Let’s take a look at a simple scenario: a sales forecast using ERP data. I’m not talking about SAP BPC or other dedicated planning and forecasting solutions here. After all, the goal is to de-mystify things somewhat and to build on components all of our Datavard customers actively use. In SAP ERP, the system keeps track of what was sold to whom in the past. Let’s say some sales are seasonal, like BBQ products: barbecues and open-air parties don’t seem to be very popular in the northern hemisphere in, say, February. (Obviously, even in summer, there is a dependency on the weather; let’s ignore this for now and focus purely on seasonality.) Based on this thinking, it should be fairly simple to take the SAP ERP sales statistics data and forecast next year’s revenue from past data, trends, and seasonality.
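To make the seasonality idea concrete, here is a minimal sketch of how such a pattern could be inspected with pandas. The daily revenue series is synthetic (a yearly sine wave plus noise), standing in for extracted ERP data – the numbers are made up for illustration only:

```python
import numpy as np
import pandas as pd

# Synthetic daily revenue from 2016 to 2018: a yearly cycle that peaks
# in summer, plus some random noise. This stands in for real ERP data.
dates = pd.date_range("2016-01-01", "2018-12-31", freq="D")
rng = np.random.default_rng(42)
seasonal = 50 * np.sin(2 * np.pi * (dates.dayofyear - 80) / 365.25)
sales = pd.Series(100 + seasonal + rng.normal(0, 5, len(dates)), index=dates)

# Average revenue per calendar month makes the seasonal pattern obvious:
# a summer peak and a winter low, just as expected for BBQ products.
monthly_profile = sales.groupby(sales.index.month).mean().round(1)
print(monthly_profile)
```

On real data, the same two-line grouping would reveal whether a product line is actually seasonal before any forecasting is attempted.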
The following picture outlines a basic architecture for such a scenario:
Using Datavard Glue, it is very simple to extract SAP ERP data (including from S/4HANA!) and store it on a platform which allows you to use any kind of PRL. A good example is, of course, Microsoft Azure.
In this example, we’re using Python with the pandas and statsmodels libraries – a PRL combination that can produce very accurate predictions. While very different from SAP’s own ABAP, Python has a large community, is taught at universities, and scales incredibly well. One challenge remains, of course: identifying the relevant SAP data and bringing it from SAP to the target platform – and, after the “magic” prediction has been triggered, consuming the results back in SAP.
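How the extracted data might land on the target side can be sketched as follows. The semicolon-separated layout and the column names `CALDAY` and `REVENUE` are purely hypothetical stand-ins, not a fixed Glue output format:

```python
import io

import pandas as pd

# A tiny stand-in for a file as an extraction tool might land it on the
# target platform: one row per posting date with that day's revenue.
raw = io.StringIO(
    "CALDAY;REVENUE\n"
    "20160104;1250.50\n"
    "20160105;1310.00\n"
    "20160106;1298.75\n"
)

# Parse the date key and turn the file into a time-indexed series,
# the shape most forecasting libraries expect as input.
df = pd.read_csv(raw, sep=";", dtype={"CALDAY": str})
df["CALDAY"] = pd.to_datetime(df["CALDAY"], format="%Y%m%d")
series = df.set_index("CALDAY")["REVENUE"].sort_index()
print(series)
```

Once the data is in this shape, the choice of extraction tool no longer matters to the prediction code – it only sees a plain time series.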
The secret sauce: 30 lines of code
This picture shows sample data from the SAP info structure S001 in LIS – the Logistics Information System – as the blue line. It comprises PoS (point-of-sale) retail sales data from 2016 to 2018, and you can see both yearly and weekly dynamics (Mondays in February appear to be the least popular days for BBQ):
The red graph shows the prediction for the new year 2019, and is based purely on what is called auto-regression, supervised machine learning, or simply “magic” to business users. Joking aside, the amazing thing is that this graph is computed without any manual training or tuning. The Python statsmodels library “simply” does it, and this is a gold mine for data scientists and, indirectly, also for business users. Especially surprising is that to productize this solution, only 30 lines of Python code are required (next to Datavard Glue for automation, traceability, authorization checks, etc.)!
Datavard Glue as the special ingredient for predictive analytics (to spice things up)
This reference architecture for what we at Datavard call “Low-Key Prediction” is based on Datavard Glue as the ETL and automation solution, plus Python and the statsmodels library: Glue provides the SAP data to the Python library and consumes its results back in SAP. Best of all, given the right skills and ingenuity, this can be implemented in mere days – without spending big bucks on AI/ML. For agile development, stepwise refinement, and discovery scenarios, this is pure gold, of course.
The truly amazing thing about this architecture is that, on a lean budget and with short development cycles, real results can be produced, prototyped, and productized within large companies.
For the curious ones – an ebook is available!
In my previous blog posts I highlighted the technical possibilities of integrating business data from SAP with Big Data platforms, e.g. Hadoop and Azure. Some of this background was taken up and bundled into a small e-book, which you can download here. The feedback on the recent blog posts was overwhelming – thanks to everybody for taking such a deep interest in what we are doing at Datavard, with our software, at our customers, and in our lab!