ANALYSIS OF THE EFFECTIVENESS OF SELECTED DEMAND FORECASTING MODELS

Objective – Two prediction methods were proposed in the article, using sales data. Models were identified and estimated, forecasts were determined, their reliability was verified, and then the values obtained for each method were compared.
Methodology – The article presents models belonging to two different categories: the regression function, which is a classic example of a cause-and-effect model, and the ARIMA model for time-series analysis.
Results – The results obtained for both models agreed satisfactorily with the empirical data, but the regression model is much easier to estimate and does not require complex transformations or calculations, nor the use of specialized software. In the analyzed case, demand forecasting based on the linear regression model is sufficient and reflects the nature of the studied phenomenon.
Keywords: forecasting, ARIMA model, linear regression model, demand.
JEL classification: C2, C22


Introduction
Demand forecasting in an enterprise is usually an important issue, affecting every area of its functioning. It not only balances the demand for goods with supply, but also facilitates decision-making in many aspects of the supply chain, supporting producers, suppliers and sellers. There are many types of forecasts. For the purpose of this article, the classic cause-and-effect method was used, i.e. linear regression, as well as the ARIMA model for time series study. Based on actual data provided by the considered enterprise, concerning the sale of the company's flagship product, two models were identified and estimated, the obtained results were verified and their reliability was assessed. Finally, the obtained forecasts were compared. At its own request, the company remains anonymous.

1. Research procedure
Forecasts should be constructed on the basis of dependencies. Defining them must be preceded by an analysis of the collected empirical data. First, a visual inspection is carried out with the use of a line graph (in the case of a one-dimensional time series) and a box plot to identify suspect observations. These charts for the sales process under consideration are shown in Figure 1 and Figure 2. The line graph (Fig. 1) indicates a clear trend and an increase in the value of sales over time, while the box plot (Fig. 2) does not show the existence of outliers. This is confirmed by the Grubbs test, for which the value of the empirical statistic turned out to be lower than the table value at the significance level α = 0.05. Therefore, there is no need to interfere with the empirical data. Because sales value appears to be strongly related to time, the next step was to check the strength of this relationship and confirm its direction. For this purpose, the correlation between the variables was calculated; the results are presented in Table 1 and Figure 3.
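The two checks described above can be sketched in a few lines. The snippet below is only an illustration under stated assumptions: the `days` and `sales` values are hypothetical (the article's data are not reproduced here), and the function names are invented for this example. It computes the Grubbs statistic, which would then be compared with the tabulated critical value for the given sample size and α = 0.05, and the Pearson correlation coefficient between sales and time.

```python
import math


def grubbs_statistic(values):
    """Largest absolute deviation from the mean, in sample standard deviations."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) * (v - mean) for v in values) / (n - 1))
    return max(abs(v - mean) for v in values) / sd


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) * (a - mean_x) for a in x)
    syy = sum((b - mean_y) * (b - mean_y) for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical daily sales, NOT the data analyzed in the article.
days = list(range(1, 11))
sales = [91, 92, 91, 93, 93, 94, 93, 95, 95, 96]

G = grubbs_statistic(sales)   # compare with the tabulated critical value
r = pearson_r(days, sales)    # strength and direction of the relationship
print(f"Grubbs statistic G = {G:.3f}, correlation r = {r:.3f}")
```

A value of G below the critical value means that no observation is flagged as an outlier, and a positive r close to 1 confirms a strong increasing relationship between sales and time.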
In the analyzed case, the simplest regression model, i.e. simple linear regression (formula 2), was used. The dependent variable $y$ is the forecast feature, i.e. the expected demand for the studied goods, while the independent variable $x$ is time:

$\hat{y} = b_1 x + b_0$ (2)

The structural parameters $b_0$ and $b_1$ were estimated using the Statistica computer program. The obtained results are presented in Table 2. The standard error of estimation is 6.04, which means that the predicted sales values differ from the empirical values by about 6 items on average. The coefficient of determination $R^2$, which measures the quality of the model's fit to the empirical data, is 99%, which means a very good fit. This coefficient indicates what part of the variability of the dependent variable is explained by the model. Thus, the variability of sales was explained in 98%. According to the above results, the relation between the quantity of sold items and time can be described by the equation (formula 3):

$\hat{y} = 0.49797\,x + 90.44528 \pm 6.04$ (3)

which means that the daily increase in the number of sold items is about 0.5. The next step is the verification of the model. According to the results in Table 3, the linearity of the regression model is statistically significant (the test probability p is below the significance level), and the estimated regression coefficients are also significant.
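The quantities discussed above (the structural parameters, the standard error of estimation and the coefficient of determination) follow from the ordinary least squares formulas. The sketch below is a minimal illustration, not the Statistica procedure used in the article; the function name `fit_line` and the sample data are assumptions made for this example.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = b1 * x + b0.

    Returns (b0, b1, standard_error, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    b1 = sxy / sxx                      # slope: change in y per unit of x
    b0 = mean_y - b1 * mean_x           # intercept
    fitted = [b0 + b1 * a for a in x]
    sse = sum((b - f) ** 2 for b, f in zip(y, fitted))   # residual sum of squares
    sst = sum((b - mean_y) ** 2 for b in y)              # total sum of squares
    standard_error = math.sqrt(sse / (n - 2))
    r_squared = 1.0 - sse / sst
    return b0, b1, standard_error, r_squared


# Hypothetical daily sales, NOT the data analyzed in the article.
days = list(range(1, 11))
sales = [91, 92, 91, 93, 93, 94, 93, 95, 95, 96]

b0, b1, se, r2 = fit_line(days, sales)
print(f"y = {b1:.5f} * x + {b0:.5f} +/- {se:.2f}, R^2 = {r2:.3f}")
```

The slope `b1` plays the role of the daily increase in sold items, and `se` corresponds to the average deviation of the fitted values from the observed ones.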
Another step is to study the distribution of residuals. In a properly constructed model, the residuals should be random and have a normal distribution. The following histogram (Fig. 4) and the normality plot of residuals (Fig. 5) show that this distribution deviates from the normal distribution, which is confirmed by the Shapiro-Wilk test, for which the value of the W statistic turned out to be statistically significant at the significance level α = 0.05. The lack of normality of the distribution of residuals results from daily fluctuations in sales, which oscillate around the mean value, and the linear regression function cannot accurately reflect the existing variability, as illustrated by the graph of forecast and empirical data (Fig. 6). Therefore, the ARIMA model was proposed in order to compare the effectiveness of the predictions.
The use of ARMA models is limited to stationary series. In cases where the analyzed series is not stationary but stationarity is achievable, the ARIMA model can be used (Bielińska, 2007). The additional letter 'I' in the name indicates that the studied time series was differenced in order to obtain a stationary form. The parameter d indicates how many times this operation should be performed. Estimating the ARIMA model requires an appropriate procedure, named after its authors the Box and Jenkins methodology, which is based on the following stages: identification, estimation, and forecasting.

Accordingly, the first step of the analysis is to study the stationarity of the series. The course of the time series (Fig. 1) already rules out stationarity because of the trend, indicating a need to bring the series to a stationary form. The autocorrelation function ACF (Fig. 7) and the partial autocorrelation function PACF (Fig. 8) are also helpful in the study of stationarity. The autocorrelation graph reveals a strong correlation of the current observation with the previous one, which indicates the need for differencing with a lag of 1. Such a procedure will not only eliminate the trend, but will also bring the series towards stationarity. The results of the variable transformation are presented in Figure 9.

The analysis of the autocorrelation and partial autocorrelation functions is also helpful in estimating the parameters of the ARIMA model. Since the value of the time series is correlated with its previous value, as shown in the ACF graph, the analyzed process is autoregressive. The order of the autoregressive process is indicated by the PACF, which for the AR(p) model takes values equal to zero for lags greater than p (more precisely, the partial autocorrelation coefficients for lags greater than p are not statistically significantly different from zero). Therefore, the surveyed series is a series with ordinary autoregression of at most the second order.
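The identification step described above, i.e. differencing with a lag of 1 and inspecting the autocorrelation function, can be sketched as follows. This is an illustration only: the series is synthetic (a linear trend with an alternating fluctuation), the function names are invented for this example, and the PACF is not computed here.

```python
def difference(series, lag=1):
    """Return the differenced series: series[t] - series[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def acf(series, max_lag):
    """Sample autocorrelation coefficients for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((v - mean) * (v - mean) for v in series)
    return [
        sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


# Synthetic series, NOT the article's data: upward trend plus a fluctuation.
sales = [90 + 0.5 * t + (1 if t % 2 else -1) for t in range(40)]

acf_raw = acf(sales, 5)        # high values at low lags reflect the trend
d_sales = difference(sales)    # first difference with a lag of 1
acf_diff = acf(d_sales, 5)     # autocorrelation after removing the trend

print("lag-1 ACF before differencing:", round(acf_raw[1], 3))
print("lag-1 ACF after differencing: ", round(acf_diff[1], 3))
```

In this synthetic case the lag-1 autocorrelation is strongly positive before differencing, because of the trend, and changes sign afterwards, which shows how differencing alters the dependence structure that the ARIMA orders are then chosen from.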
An analogous procedure should also be carried out for the differenced variable D(-1), because removing autocorrelation of a higher order often reveals correlations of a lower order and, for example, a previously invisible seasonal relation.
The ACF after differencing is shown in Figure 10. The correctness of the ACF graph for the differenced variable allows moving on to the next stage, i.e. estimation. Several models are proposed, as shown in Table 3, which is a common procedure. In most cases, several different possibilities are proposed in order to make a final selection of the best ones (based on selected criteria such as the significance of model parameters, forecast error or information criteria).

Table 3. Summary of estimation results
Model: (1,1,1) | Model: (0,1,2) | Model: (0,1,2) | Model: (0,1,1)

Only two of the above models have all estimated parameters statistically significant. However, the analysis of residuals in both models showed that for the ARIMA (0,1,2) model the correlogram still indicates significant function values, suggesting that the residuals are not random and that there are dependencies the model does not explain. In the case of the ARIMA (0,1,1) model, such relations have not been revealed (Fig. 11 and Fig. 12), which allows the residuals to be considered a white noise process (the residuals are not correlated).

At the end of the study, the two proposed models were compared with empirical test observations (Table 4), which were not used to construct either of them. It turns out that the forecasts do not differ significantly and the predicted values are characterized by a small relative forecast error. The results obtained for both models agreed satisfactorily with the empirical data, but the regression model is much easier to estimate and does not require complex transformations or calculations, nor the use of specialized software. In the analyzed case, demand forecasting based on the linear regression model is sufficient and reflects the nature of the studied phenomenon.

Linear regression models and ARIMA models are among the short-term forecasting methods, but such predictions must be closely monitored and verified. Decisions cannot be made on their basis alone; their task is only to support management processes and the forming of judgments about the future values of the forecasted phenomena.
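The comparison of forecasts with held-out test observations can be illustrated with a relative forecast error. The sketch below is an assumption-laden example: the regression forecasts use the coefficients of formula (3), but the test days, the actual sales and the ARIMA forecasts are hypothetical placeholders, not the values from Table 4, and the mean relative error is only one possible error measure.

```python
def regression_forecast(day):
    """Point forecast from the fitted regression line, formula (3)."""
    return 0.49797 * day + 90.44528


def mean_relative_error(actual, predicted):
    """Mean absolute relative forecast error, in percent."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical test observations and ARIMA forecasts, NOT the values from Table 4.
test_days = [101, 102, 103]
actual_sales = [142, 140, 143]
arima_forecasts = [141.0, 141.5, 142.0]

regression_forecasts = [regression_forecast(d) for d in test_days]
reg_error = mean_relative_error(actual_sales, regression_forecasts)
arima_error = mean_relative_error(actual_sales, arima_forecasts)

print(f"regression: {reg_error:.2f}%  ARIMA: {arima_error:.2f}%")
```

When both errors are small and close to each other, as reported in the article for the real test observations, the simpler model can reasonably be preferred.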

CONCLUSIONS
Demand forecasts are an effective tool for supporting the planning process in a company. Their competent and reasonable use can support managers in shaping the supply chain, deciding on necessary orders and scheduling production dates. It also allows detection of and quick response to changes in the market, which is often a key factor that determines the future of the whole company. There are many methods to describe upcoming phenomena, characterized by different degrees of complexity and estimation difficulty. They often require appropriate mathematical software.
The article presents models belonging to two different categories. They are the regression function, which is a classic example of a cause-and-effect model, and the ARIMA model for time series analysis. The results obtained for both methods proved to be satisfactorily reliable, but the construction of a regression model is much simpler and does not require any additional assumptions. Therefore, it is worthwhile to try out simple and equally effective tools before using advanced techniques, as it turns out that thanks to them we can ensure not only optimization of results in the company, but also corrective action where this is necessary.

Fig. 6. Chart of empirical and forecasted data in the regression model. Source: the author's own study.

Table 1. Correlation matrix between variables
Fig. 3. Scatter plot of sales versus time. Source: the author's own study.

Table 4. Comparison between regression and ARIMA model