-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path2-Manual_First_Part_Regression.Rmd
115 lines (69 loc) · 3.56 KB
/
2-Manual_First_Part_Regression.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
---
title: "Manual_First_Part_Regression"
author: "Elias Mayer"
date: "19 8 2021"
output:
word_document: default
html_document: default
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
2 Rmarkdown file to execute, code is split between workbooks to enable easier seperated execution.
## Time Series Regression Modeling
```{r Regression set up}
library(Mcomp)
library(tsibble)
library(dplyr)
library(ggplot2)
library(forecast)
```
The data will be fitted to a suitable regression model. This model will then used to create a Forecast which will be compared to an ets model and one arima model, performance wise. We will use the information gathered in the exploratory analysis from part one to define seasonality, trend and period of the model.
In the first step we fit a linear regression model to the data. Predictor variables of the regression model can be used through quarterly dummy variables (to capture seasonality) and a dummy variable for the trend. The summary of the model shows that the second season has no significant predictive value in the model. This can be seen in the P Value. The residuals show that information from the underlying data has not been captured well. The clear bow shape resembles the given trend of the data.
Through the usage of subjective knots which determine changes in the historical data we can fit the model more closely to the data. This method can lead to over fitting. Due to the fact that we work with employment, economical events correlated data, setting knots on peaks is somewhat acceptable. The identified data knots are the years 1980 ( ~ stop increasing movement) and 1990 ( ~ start decreasing). An examination of the residuals that trend was captured to a degree by the model. The ACF plot shows no significant values. We set the model and use it for comparison with the following exponential smoothing and ARIMA model.
Another method in creating regression models is subjectively placing
```{r regression_first_steps}
#prepare Mcomp data set
data_part1 <- M3[[1355]]
data_part1$period
y_train <- data_part1$x
y_test <- data_part1$xx
#create a combined version
ts_bind <- ts.union(y_train, y_test)
ts_bind %>% autoplot() + theme_minimal()
#fit linear reg. model
fit.train_simp <- tslm(y_train ~ trend + season)
summary(fit.train_simp)
#forecast the model
fcast_t <- forecast::forecast(fit.train_simp, h=8)
autoplot(ts_bind) +autolayer(fitted(fcast_t)) +
ggtitle("Forecasts of employment using regression + trend + season") +
ylab("emplyoment") + theme_minimal()
fcast_t %>% checkresiduals()
#without season
h <- 8
t <- time(y_train)
t.break1 <- 1980
t.break2 <- 1990
tb1 <- ts(pmax(0, t - t.break1), start = 1977 )
tb2 <- ts(pmax(0, t - t.break2), start = 1977 )
fit.pw <- tslm(y_train ~ t + tb1 + tb2 + season, lambda="auto")
t.new <- t[length(t)] + seq(h)
tb1.new <- tb1[length(tb1)] + seq(h)
tb2.new <- tb2[length(tb2)] + seq(h)
newdata <- cbind(t=t.new, tb1=tb1.new, tb2=tb2.new) %>%
as.data.frame()
autoplot(ts_bind) +
autolayer(fitted(fit.pw), series = "Piecewise") +
ylab("employment") +
ggtitle("Nonlinear Regression - Forecasts for data set N1355 + trend") +
guides(colour = guide_legend(title = " ")) + theme_minimal()
fit.pw %>% checkresiduals()
summary(fit.pw)
# All forecasts horrible corrected are the worst
```
```{r regression_second_Forecasting, echo=FALSE}
#visualize reg. model
fcasts.pw <- forecast::forecast(fit.pw, newdata = newdata)
ts_bind %>% autoplot() + autolayer(fitted(fcasts.pw), series="Piecewise") + autolayer(fcasts.pw, series="Piecewise") + theme_minimal()
```