My approach to the case combined regression runs with decomposition forecasting. The data supplied had non-aligned months, missing observations, and variable billing issuance dates. Utility usage is expressed in units of consumption, so inflation does not need to be taken into account in the regression modeling. The technique breaks down into the following steps:
1) Cleaning the data
2) Modeling for Natural Gas Usage
3) Forecasting of Usage in June, July & August
4) Modeling for Electricity Usage
5) Forecasting of Usage in June, July & August
6) Classical Decomposition Forecasting
7) Comparing Regression Forecasting & Decomposit ...
b) Aligning the Data
The observation periods were not perfectly aligned. The gas consumption observation for the month reflects the period from December 18, 1990 through January 11, 1991, while the electricity observation reflects the period from December 12, 1990 through January 11, 1991.

The values of variables V3, V11 and V12 were adjusted to align the data on one date. The data were moved forward to the 1st of the next month using the following formulas, where the subscript is the observation number:

$$\text{V3NG}_1 = \frac{V3_1\,(V2_1 - 17)}{V2_1} + \frac{17\,V3_2}{V2_2}$$

$$\text{V11NG}_1 = \frac{V11_1\,(V2_1 - 17)}{V2_1} + \frac{17\,V11_2}{V2_2}$$

$$\text{V12NG}_1 = \frac{V12_1\,(V2_1 - 17)}{V2_1} + \frac{17\,V12_2}{V2_2}$$

These formulas give the first values of the model, aligned on the 1st of the month. All 81 values were aligned in the same way for natural gas and electric usage. The only difference in the electric usage formulas is that 17 is replaced with 12.

2) Modeling for Natural Gas Usage
A correlation analysis for natural gas was run with MEAN(V5, 4) as the dependent variable and V2, V13, V3NG, V11NG, V12NG, MEAN(V4, 4), MEAN(V6, 4) and V10 as the independent variables. Of these, MEAN(V6, 4) and V10 were not significant. A regression model was then run, which gave an R square value of 0.978, but three variables were not significant because their p-values were greater than 0.05. The regression was rerun until all the independent variables were significant. In the end V2, V13, V3NG, V11NG, V12NG and MEAN(V6, 4) were significant, with an R square value of 0.845.

Case Wise Diagnostic
Outliers were removed with the help of the regression case wise diagnostic. A threshold of two standard deviations was selected in the statistics options, so that the regression output lists the cases lying outside two standard deviations. The case wise diagnostic was run eight times and 19 values were deleted from the data.
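The repeated case wise diagnostic described above can be sketched in Python. This is only an illustration under stated assumptions: it uses a single predictor instead of the multi-variable model in the case, the function names `fit_line` and `casewise_clean` are hypothetical, and the sample numbers are made up rather than taken from the case data. It standardizes each residual by the standard error of the estimate and drops cases beyond two standard deviations, refitting until no case is flagged.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def casewise_clean(xs, ys, threshold=2.0, max_rounds=20):
    """Refit repeatedly, dropping cases with |standardized residual| > threshold.

    Returns (kept, removed) as lists of indices into the original data.
    """
    kept = list(range(len(xs)))
    removed = []
    for _ in range(max_rounds):
        if len(kept) < 4:  # too few cases left to fit and judge residuals
            break
        kx = [xs[i] for i in kept]
        ky = [ys[i] for i in kept]
        a, b = fit_line(kx, ky)
        residuals = [y - (a + b * x) for x, y in zip(kx, ky)]
        # Standard error of the estimate: sqrt(SSE / (n - 2)) with one predictor.
        see = math.sqrt(sum(r * r for r in residuals) / (len(kept) - 2))
        if see == 0:  # perfect fit, nothing to standardize against
            break
        flagged = [i for i, r in zip(kept, residuals) if abs(r / see) > threshold]
        if not flagged:
            break
        removed.extend(flagged)
        kept = [i for i in kept if i not in flagged]
    return kept, removed


# Made-up illustrative data: y is roughly 2x + 1, with one gross outlier at index 5.
xs = list(range(10))
noise = [0.1, -0.2, 0.15, -0.05, 0.1, 0.0, -0.1, 0.2, -0.15, 0.05]
ys = [2 * x + 1 + e for x, e in zip(xs, noise)]
ys[5] = 100.0

kept, removed = casewise_clean(xs, ys)
print("removed cases:", removed)
```

Each round refits the model on the remaining cases, which is why the diagnostic may need several passes: removing a large outlier shrinks the standard error and can expose smaller ones.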
MEAN(V6, 4) was also removed from the independent variables, which gave an R square value of 0.976.

Eliminated 53616735374970255565385854313456731

Equation of Model: Y = 150.054 - 2.626 V3NG + 0.195 V11NG + 0.310 V12NG + 32.884 V13

Model Summary (b)
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .988(a)   .976       .974                15.616
a) Predictors: (Constant), V13, V3 NG, V12 NG, V11 NG
b) Dependent Variable: MEAN (V5, 4)

ANOVA (b)
Model          Sum of Squares   df   Mean Square   F         Sig.
1 Regression   512399.659       4    128099.915    525.288   .000(a)
  Residual     12437.180        51   243.866
  Total        524836.839       55
a) Predictors: (Constant), V13, V3 NG, V12 NG, V11 NG
b) Dependent Variable: MEAN (V5, 4)

Coefficients (a)
                 Unstandardized Coefficients   Standardized Coefficients
Model            B          Std. Error         Beta                        t        Sig.
1 (Constant)     150.054    36.273                                         4.137    .000
  V3 NG          -2.626     .573               -.394                       -4.581   .000
  V11 NG         .195       .020               .758                        9.993    .000
  V12 NG         .310       .044               .249                        6.966    .000
  V13            32.884     5.016              .143                        6.556    .000
a) Dependent Variable: MEAN (V5, 4)

F-Test for Overall Significance:
H0: β1 = β2 = β3 = β4 = 0
Ha: One or more of the parameters is not equal to zero
Test Statistic: F = MSR / MSE
Reject...
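As a rough cross-check of the regression output above, the summary statistics can be recomputed from the ANOVA sums of squares and degrees of freedom. The sketch below uses only the figures reported in the ANOVA table; the function name `anova_summary` is hypothetical, and the formulas are the standard textbook definitions rather than anything specific to the software used in the case.

```python
import math


def anova_summary(ss_regression, ss_residual, df_regression, df_residual):
    """Derive F, R square, adjusted R square and the standard error of the
    estimate from the regression and residual sums of squares."""
    msr = ss_regression / df_regression   # mean square regression
    mse = ss_residual / df_residual       # mean square error
    ss_total = ss_regression + ss_residual
    df_total = df_regression + df_residual
    return {
        "F": msr / mse,
        "r_square": ss_regression / ss_total,
        "adj_r_square": 1 - mse / (ss_total / df_total),
        "std_error": math.sqrt(mse),
    }


# Figures taken from the ANOVA table for the natural gas model.
summary = anova_summary(
    ss_regression=512399.659,
    ss_residual=12437.180,
    df_regression=4,
    df_residual=51,
)

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The recomputed values should agree with the Model Summary and ANOVA tables to rounding, which is a quick way to confirm that the flattened table columns were read correctly.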