Time series forecasting allows you to make confident decisions on time series data by predicting future values based on historical values. Time series data is data that contains a value over time, for instance revenue by month or call volume by week. In SAP Analytics Cloud you can easily add a forecast to a time series chart, line chart or planning version.
This blog explains how time series forecasting in SAP Analytics Cloud works and answers some frequently asked questions. This is the first in a series of blogs that will explain what is behind the Augmented BI features in SAP Analytics Cloud.
How does Time Series Forecasting in SAP Analytics Cloud work?
To add a forecast to a chart you simply need to select the forecasting option on the chart as shown. The options available differ based on the chart type and these differences are described when applicable below.
With a time series chart, line chart or planning grid, a user can choose between three techniques (Automatic Forecast, Triple Exponential Smoothing and Linear Regression) to aid in their decision-making process.
As Linear Regression and Triple Exponential Smoothing are standard, well-understood techniques, we focus on the details of Automatic Forecast here.
The Automatic Forecasting in SAP Analytics Cloud performs advanced statistical analysis to generate forecasts by analyzing trends, fluctuations and seasonality. The forecasting function uses SAP’s proprietary time series technology to analyze historical time series data.
What is the technology behind SAP Analytics Cloud's Automatic Forecast?
The algorithm works by analyzing the historical data to identify existing patterns and then using those patterns to project future values. The data is analyzed for several different components.
- The first component is an underlying trend: is the data trending up or down over time, and is that trend linear or polynomial? A linear trend increases or decreases along a line, whereas a polynomial trend follows a curve.
- The second component is cycles in the data: does the data repeat every 10 days, every 3 months, at Christmas every year or at the end of each fiscal quarter? More than one cycle can be identified in the data.
- Finally, when the trend and the cycles are accounted for, the remaining data is analyzed to see if a pattern can be found.
The algorithm will not always find all three components in the data; quite often only one or two of the components are detected, and these are used to generate forecasts. The algorithm uses the patterns found in the data to predict the values for future periods.
A more technical description of the algorithm is that the signal is decomposed into additive components as follows:
Signal = Trend + Cycles + Fluctuation + Residual
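To make the additive decomposition concrete, here is a minimal Python sketch. It is not SAP's proprietary algorithm: it assumes a linear trend and a single cycle of known length, and it folds the Fluctuation component into the residual. The function name `decompose_additive` and the sample data are illustrative assumptions.

```python
def decompose_additive(signal, period):
    """Split a series into linear trend, one repeating cycle, and residual.

    Illustrative only: assumes a linear trend and a known cycle length.
    """
    n = len(signal)
    # Least-squares linear trend on the time index 0..n-1.
    t_mean = (n - 1) / 2
    y_mean = sum(signal) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(signal))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    trend = [intercept + slope * t for t in range(n)]

    # Cycle: average detrended value at each position within the period.
    detrended = [y - tr for y, tr in zip(signal, trend)]
    position_means = []
    for k in range(period):
        values = detrended[k::period]
        position_means.append(sum(values) / len(values))
    cycle = [position_means[t % period] for t in range(n)]

    # Whatever is left after removing trend and cycle.
    residual = [y - tr - c for y, tr, c in zip(signal, trend, cycle)]
    return trend, cycle, residual


# Made-up quarterly-style data: upward trend plus a pattern repeating every 4 periods.
series = [10 + 2 * i + [3, -1, -2, 0][i % 4] for i in range(24)]
trend, cycle, residual = decompose_additive(series, period=4)
```

By construction the three returned lists add back up to the original series, which is the additive property the formula expresses.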
The tool automatically evaluates various combinations of trends, cycles and fluctuations, and selects the combination that gives the best forecast accuracy. The overall process is represented in the following diagram:
How do I determine the quality of the forecast?
For you to confidently base a decision on a forecast you need to understand the forecast quality. The quality of the forecast is provided in two easily understood ways.
The forecast quality is expressed as a simple 0 to 5 rating, where 5 is a very good forecast. The quality can be seen by clicking on the forecast link at the top of the chart. In Figure 3 the details of the forecast are shown, and the 5/5 rating can be seen. A natural language explanation of the quality can also be seen by selecting the > to the right of the rating; in this example the quality is rated as very high.
The 0 to 5 rating is based on a standard statistical quality measure of a forecast known as Mean Absolute Percentage Error or MAPE. The MAPE is expressed as a value between 0 and 1 with a high-quality forecast having a MAPE close to 0. The mapping of MAPE to our rating is as follows:
In addition to the model quality, a confidence interval is provided. It gives the upper and lower limits around the forecasted value and is shown as a shaded area.
The forecasted value is a point estimate, the best guess for a given period, for instance Revenue for January. The confidence interval, which extends roughly two standard errors on either side of the forecasted value, provides an upper and lower expectation for the forecasted value. By default, the 95% confidence interval is used, allowing the user to be 95% confident that the actual value will lie within the interval.
A narrow confidence interval indicates a smaller possible range of values around the predicted value and hence a more confident forecast.
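The two quality measures in this section can be sketched in a few lines of Python. This is a textbook illustration, not SAP's internal computation: the function names are hypothetical, the 1.96 multiplier assumes normally distributed forecast errors, and periods with an actual value of zero are skipped because the percentage error is undefined there.

```python
def mape(actuals, forecasts):
    """Mean Absolute Percentage Error as a fraction (0.1 means 10%)."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return sum(abs((a - f) / a) for a, f in pairs) / len(pairs)


def interval_95(point_forecast, standard_error):
    """Approximate 95% interval, assuming normally distributed errors."""
    half_width = 1.96 * standard_error
    return point_forecast - half_width, point_forecast + half_width


# Made-up numbers for illustration.
error = mape([100, 200], [90, 220])   # each forecast is off by 10%
lower, upper = interval_95(1000, 50)  # point forecast 1000, standard error 50
```

A smaller standard error gives a narrower interval around the same point forecast, which is why a narrow shaded area on the chart signals a more confident forecast.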
What options do I have to customize the Forecast?
You can specify several options when generating a forecast.
There are three algorithms available for Time Series Forecasting in SAP Analytics Cloud: Automatic Forecast, Linear Regression and Triple Exponential Smoothing, as shown in Figure 1.
- Automatic Forecast is a process that evaluates several algorithms and models and uses the combined model that performs best, as described above.
- Linear Regression forces the model to be a linear regression on time.
- Triple Exponential Smoothing forces the model to be a Triple Exponential Smoothing model. The parameters that control the algorithm are automatically tuned to minimize the error.
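As an illustration of the simplest of these options, the sketch below fits a linear regression on time and extends the fitted line into future periods. It is a generic least-squares example under the assumption of equally spaced periods, not SAP's implementation, and `linear_forecast` is a hypothetical name. Triple Exponential Smoothing is omitted because it also needs seasonal state and tuned smoothing parameters.

```python
def linear_forecast(history, periods):
    """Fit y = intercept + slope * t by least squares and project ahead."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two historical values")
    t_mean = (n - 1) / 2
    y_mean = sum(history) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(history))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    # Future periods continue the time index after the last observation.
    return [intercept + slope * t for t in range(n, n + periods)]


# A perfectly linear history continues along the same line.
forecast = linear_forecast([1, 2, 3, 4], periods=2)
```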
There are two additional forecasting options available: Forecast Periods and Additional Forecast Inputs.
Forecast periods allows you to specify the number of periods for which you wish to generate a forecast. To provide a useful forecast, we limit the number of forecasts available based on the actual data. The number of forecasts offered is calculated as shown.
As an example, with monthly sales figures for 48 months, the number of forecast periods offered is: (48 – 8) ÷ 4 = 10
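The worked example suggests the rule (historical periods – 8) ÷ 4. A small Python sketch of that rule follows. The rule itself is inferred from the example above, and the rounding down and the floor at zero for short histories are assumptions rather than documented behavior. The function name is hypothetical.

```python
def max_forecast_periods(historical_periods):
    """Forecast periods offered, inferred from the (48 - 8) / 4 = 10 example.

    Rounding down and never returning a negative count are assumptions.
    """
    return max(0, (historical_periods - 8) // 4)


offered = max_forecast_periods(48)
```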
Additional Forecast Inputs can be used by the Automatic Forecast algorithm to improve the accuracy of the forecast. They are calculated measures, measure input controls or additional measures from your data model that you want to consider when creating a forecast. The option of using additional inputs is only available for Time Series Charts. Values for the selected measures must be available both for the historical periods and for the future periods you wish to forecast. For example, if I am forecasting my Total expected Revenue and I want to include No of Customer Meetings as shown in Figure 4, I must have historical values for No of Customer Meetings as well as future values for the number of forecast periods selected. Additional Forecast Inputs may improve the accuracy of the forecast, but this is not always the case: if the additional inputs do not improve the quality of the model, they will not be used.
What data sources are supported?
The data source support varies depending on the feature usage. The following table summarizes this:
For a given tenant, the system configuration setting "Live Data Models: Enable Smart Grouping and predictive forecasting in Time Series" controls whether forecasting is enabled for Time Series Charts and Line Charts on live data connections.