Demand forecasting is a hot topic in the retail industry. Supermarkets in particular face both an economic and an ethical problem: each forecasting mistake translates into lost revenue and, most importantly, food waste. Globally, $120B worth of food waste could be avoided by optimising inventory levels alone.
We have been working with data from one of the biggest supermarket chains in Portugal and South America to improve the statistical algorithms used for stock prediction.
Demand forecasting gives businesses the ability to use historical data on markets to help plan for future trends. Without accurate demand forecasting, it is close to impossible to have the right amount of stock on hand at any given time.
In a sense, demand forecasting is attempting to replicate human knowledge of consumers once found in a local store. Long ago, retailers could rely on the instinct and intuition of shopkeepers. They knew their customers by name, but, more importantly, they also knew buying preferences, seasonal trends, product affinities and likely future purchases.
Too much merchandise in the warehouse means more capital tied up in inventory, while too little leads to out-of-stocks and pushes customers to seek solutions from your competitors.
The dataset from a top Portuguese supermarket contains sales data for 12 months, ending in January 2018. There are 1175 different items, 98 store locations in 15 regions and 4 possible assortment types. The stockout rate is 12.1%, meaning that in any given week there is a 12.1% chance that a store will run out of a given product.
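To make the stockout rate concrete, here is a minimal sketch of how such a figure could be computed. The record layout (`store`, `item`, `week`, `units_sold`, `stockout`) is our assumption for illustration, not the actual schema of the dataset, and the rows are toy values.

```python
from collections import namedtuple

# Hypothetical record layout: one row per (store, item, week) observation.
Record = namedtuple("Record", ["store", "item", "week", "units_sold", "stockout"])


def stockout_rate(records):
    """Share of (store, item, week) observations flagged as a stockout."""
    records = list(records)
    if not records:
        raise ValueError("no records given")
    return sum(1 for r in records if r.stockout) / len(records)


# Toy data, not taken from the real dataset.
toy_records = [
    Record("S1", "milk", 1, 40, False),
    Record("S1", "milk", 2, 12, True),
    Record("S2", "milk", 1, 35, False),
    Record("S2", "milk", 2, 38, False),
]

print(f"{stockout_rate(toy_records):.1%}")  # 25.0%
```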
In the graph below, we visualise the sales values for a single item, in two stores, marking the weeks when a stockout occurred.
Next, we want to see how the different data points correlate with each other. In the correlation matrix below, we can observe the impact of the Assortment on the Total sales: the better an item is positioned on the shelf, the more sales it produces.
We have used a few off-the-shelf methods as benchmarks against our own model.
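As a rough illustration of what a correlation matrix contains, the sketch below computes pairwise Pearson correlations between columns. Encoding the assortment type as an ordinal number is an assumption on our part, and the column names and values are toy data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Nested dict holding the correlation of every column with every other."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Toy columns: assortment encoded as an ordinal level (assumption).
toy_columns = {
    "assortment": [1, 2, 3, 4],
    "total_sales": [10, 20, 30, 40],
}
matrix = correlation_matrix(toy_columns)
print(round(matrix["assortment"]["total_sales"], 3))
```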
1. ARIMA models are, in theory, the most general class of models for forecasting a time series which can be made to be “stationary”.
The ARIMA forecasting equation for a stationary time series is a linear (i.e., regression-type) equation in which the predictors consist of lags of the dependent variable and/or lags of the forecast errors. That is:
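In its standard textbook form, the forecast for a stationary series y, using p lags of the series and q lags of the forecast errors e, can be written as shown here. This is a reconstruction of the general equation rather than the exact notation we used. The minus signs on the moving-average coefficients follow the Box-Jenkins convention; some software uses plus signs instead.

```latex
\hat{y}_t = \mu + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p}
            - \theta_1 e_{t-1} - \dots - \theta_q e_{t-q}
```

Here \mu is a constant, the \phi_i are the autoregressive coefficients and the \theta_j are the moving-average coefficients.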
2. Facebook Prophet is an algorithm for time series forecasting based on an additive model. Trends are fit with yearly, weekly and daily seasonality, and the model also accounts for holidays. It is designed for time series with seasonal effects, which makes it a natural candidate for our problem. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.
3. Amazon DeepAR is a supervised learning algorithm for forecasting scalar (that is, one-dimensional) time series using recurrent neural networks (RNN). Classical forecasting methods, such as Autoregressive Integrated Moving Average (ARIMA) or Exponential Smoothing (ETS), fit a single model to each individual time series, and then use that model to extrapolate the time series into the future. In many applications, however, you encounter many similar time series across a set of cross-sectional units. Examples of such time series groupings are demand for different products, server loads, and requests for web pages. In this case, it can be beneficial to train a single model jointly over all of these time series. DeepAR takes this approach, outperforming the standard ARIMA and ETS methods when your dataset contains hundreds of related time series. The trained model can also be used for generating forecasts for new time series that are similar to the ones it has been trained on.
Our model is built in Python using TensorFlow. We use a fully connected neural network with Long Short-Term Memory (LSTM) cells to predict our baseline.
We considered two ways of splitting the data to train our model:
1. Split the whole sales dataset by period: we use everything before a set date (2017-09-10) as training data and everything afterwards as test data. Given the limited amount of data, this approach does not capture special events such as the winter holidays, which fall after the cut-off date and are therefore never seen during training.
2. Take a percentage of all items for the training set (80%), leaving the rest (20%) for the test set. With this approach, we cannot be sure that the distribution of items is balanced between the training set and the test set.
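The two splits can be sketched as follows. This is a simplified illustration, not our production code: the row layout (`item`, `week_start`, `units_sold`) is hypothetical, we assume that "items" refers to products rather than individual rows, and we assume that the cut-off date itself belongs to the test set.

```python
import random
from datetime import date


def split_by_date(rows, cutoff):
    """Rows dated before `cutoff` go to training, rows on or after it to test."""
    train = [r for r in rows if r["week_start"] < cutoff]
    test = [r for r in rows if r["week_start"] >= cutoff]
    return train, test


def split_by_item(rows, train_fraction=0.8, seed=0):
    """Assign whole items at random to either the training or the test set."""
    items = sorted({r["item"] for r in rows})
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(items)
    n_train = round(len(items) * train_fraction)
    train_items = set(items[:n_train])
    train = [r for r in rows if r["item"] in train_items]
    test = [r for r in rows if r["item"] not in train_items]
    return train, test


# Toy rows: 5 items observed in 3 weeks each (values are made up).
rows = [
    {"item": f"item{i}", "week_start": date(2017, 9, d), "units_sold": 10 * i + d}
    for i in range(5)
    for d in (3, 10, 17)
]

train_d, test_d = split_by_date(rows, date(2017, 9, 10))
print(len(train_d), len(test_d))  # 5 10

train_i, test_i = split_by_item(rows, train_fraction=0.8)
print(len(train_i), len(test_i))  # 12 3
```

Splitting by item keeps every week of a product on one side of the split, which is why the balance of item types between the two sets is left to chance.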
In both scenarios, we take data from Week T alone and we predict sales for Week T+1. We assume that we do not know whether a promotion is coming in the following week (a scenario which is unlikely in the real world).
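The one-step-ahead setup can be sketched as below. For simplicity we assume that the only feature taken from Week T is the sales value itself; the function name and the toy series are hypothetical.

```python
def make_week_pairs(weekly_sales):
    """Turn one store-item series into (value at week T, target at week T+1) pairs."""
    return [
        (weekly_sales[t], weekly_sales[t + 1])
        for t in range(len(weekly_sales) - 1)
    ]


# Toy weekly sales for a single item in a single store.
series = [12, 15, 9, 20]
pairs = make_week_pairs(series)
print(pairs)  # [(12, 15), (15, 9), (9, 20)]
```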
We measured the model error using the Symmetric Mean Absolute Percentage Error (SMAPE) between the actual sales values and either the supermarket’s baseline or our prediction. In both split scenarios, our model fits the total sold items better than the supermarket’s example baseline, as shown in the table below.
| | Random split | Split by date |
|---|---|---|
| Neurolabs SMAPE | 11.1% | 12.6% |
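For reference, here is a minimal sketch of how SMAPE can be computed. Several variants of the metric exist; this one assumes the common definition that divides by the mean of the absolute actual and predicted values, and treats a pair of zeros as contributing no error. It is not necessarily the exact variant behind the figures above.

```python
def smape(actual, predicted):
    """Symmetric MAPE in percent, using (|a| + |p|) / 2 as the denominator."""
    if len(actual) != len(predicted):
        raise ValueError("inputs must have the same length")
    if not actual:
        raise ValueError("inputs must be non-empty")
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2
        # Convention (an assumption): a pair of zeros contributes no error.
        if denom != 0:
            total += abs(p - a) / denom
    return 100 * total / len(actual)


# Toy values: one perfect prediction and one that is three times too high.
print(smape([100, 100], [100, 300]))  # 50.0
```

With this definition the score ranges from 0% for a perfect forecast to 200% in the worst case.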