\newcommand{\backshift}{\operatorname{B}} \newcommand{\var}{\operatorname{Var}} \newcommand{\expect}{\operatorname{E}} \newcommand{\corr}{\operatorname{Corr}} \newcommand{\cov}{\operatorname{Cov}} \newcommand{\ssr}{\operatorname{SSR}} \newcommand{\se}{\operatorname{SE}}
\newcommand{\mat}[1]{\bm{#1}} \newcommand{\eval}[2]{\left. #1 \right\vert_{#2}}
Problem 1.1
Suppose that simple exponential smoothing is being used to forecast the process $y_t = \mu + \epsilon_t$, where the $\epsilon_t$ are white noise with mean $0$ and variance $\sigma^2$. At the start of period $t^*$, the mean of the process experiences a transient; that is, it shifts to a new level $\mu + \delta$, but reverts to its original level $\mu$ at the start of the next period $t^* + 1$. The mean remains at this level for all subsequent time periods.
Part (a)
\fbox{\begin{minipage}{.8\textwidth} Find the expected value of the simple exponential smoother $\tilde{y}_T$.
\end{minipage}}
We have a time series
$$ y_t = \mu + \epsilon_t $$
except at $t = t^*$, which is distributed
$$ y_{t^*} = \mu + \delta + \epsilon_{t^*}, $$
where the error terms $\epsilon_t$ are zero mean white noise with variance $\sigma^2$.
The expectation of the smoothed time series $\tilde{y}_T$ is given by
\begin{align*}
\expect(\tilde{y}_T) &= (1-\theta)\sum_{t=0}^{\infty} \theta^t \expect(y_{T-t})\\
&= (1-\theta)\left( \sum_{t=0}^{T-t^*-1} \theta^t \mu + \theta^{T-t^*}(\mu + \delta) + \sum_{t=T-t^*+1}^{\infty} \theta^t \mu \right)\\
&= (1-\theta)\left( \sum_{t=0}^{\infty} \theta^t \mu + \theta^{T-t^*} \delta \right)\\
&= \mu + (1-\theta)\, \theta^{T-t^*} \delta.
\end{align*}
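As a quick numerical sanity check of this formula, the following R sketch simulates many realizations of the process with a one-period transient and compares the average smoothed value at time $T$ against $\mu + (1-\theta)\theta^{T-t^*}\delta$. All numeric values in the sketch are illustrative and not taken from the problem statement.

# sanity check of E(ytilde_T) = mu + (1 - theta) * theta^(TT - tstar) * delta
# all numeric values below are illustrative, not taken from the problem
set.seed(1)
mu <- 5; delta <- 3; theta <- 0.7; sigma <- 1
TT <- 60; tstar <- 57; nsim <- 20000
smoothed_T <- replicate(nsim, {
  y <- mu + rnorm(TT, sd = sigma)
  y[tstar] <- y[tstar] + delta               # transient at t = tstar only
  s <- y[1]                                  # initialize the smoother
  for (t in 2:TT) s <- (1 - theta) * y[t] + theta * s
  s
})
mean(smoothed_T)                               # simulated expectation
mu + (1 - theta) * theta^(TT - tstar) * delta  # formula from above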
Part (b)
\fbox{\begin{minipage}{.8\textwidth} For the given smoothing constant $\theta$, determine the number of periods that it will take following the impulse for the expected value of $\tilde{y}_T$ to return to within $\frac{1}{10}\delta$ of the original level $\mu$. \end{minipage}} We wish to find $k$ such that $\expect(\tilde{y}_k)$ is within $\frac{1}{10}\delta$ of $\mu$,
$$ \left\lvert \expect(\tilde{y}_k) - \mu \right\rvert \leq \left\lvert \frac{1}{10} \delta \right\rvert. $$
Plugging in the definition of the expectation and simplifying,
$$ \left\lvert (1-\theta)\,\theta^{k-t^*} \delta \right\rvert \leq \left\lvert \frac{1}{10} \delta \right\rvert. $$
Since pulling positive numbers (or symbols that stand for positive numbers) out of the absolute value does not change the expression, we may rewrite the above as
$$ (1-\theta)\,\theta^{k-t^*} \lvert\delta\rvert \leq \frac{1}{10} \lvert\delta\rvert. $$
Dividing by $(1-\theta)\lvert\delta\rvert$ on both sides,
$$ \theta^{k-t^*} \leq \frac{1}{10(1-\theta)}, $$
which may be rewritten as
$$ \theta^{k} \leq \frac{\theta^{t^*}}{10(1-\theta)}. $$
Taking the logarithm of both sides,
\begin{align*}
k \log \theta &\leq \log \left(\frac{\theta^{t^*}}{10(1-\theta)}\right)\\
&= t^* \log \theta - \log 10 - \log(1-\theta).
\end{align*}
Finally, we isolate $k$ by dividing by $\log \theta$ on both sides. However, note that $\log \theta$ is negative, and so we must flip the inequality,
$$ k \geq t^* + \frac{\log 10 + \log(1-\theta)}{-\log \theta}. $$
Letting $n = k - t^*$ denote the number of periods after the impulse,
$$ n \geq \frac{\log 10 + \log(1-\theta)}{-\log \theta}. $$
We wish to take the \emph{smallest} integer $n$ that satisfies this inequality. In other words, $n$ periods after the impulse at $t^*$, $\tilde{y}$ has an expectation that is within the specified distance of $\mu$.
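Since the numeric value of $\theta$ used above is not reproduced here, the following R helper evaluates the bound for an illustrative value; the function name and the example $\theta = 0.5$ are assumptions for illustration only.

# smallest integer n = k - tstar with (1 - theta) * theta^n <= 1/10
periods_to_recover <- function(theta) {
  max(0, ceiling((log(10) + log(1 - theta)) / (-log(theta))))
}
periods_to_recover(0.5)  # returns 3 for the illustrative value theta = 0.5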
Problem 1.2
Let $\{Y_t\}$ be an AR(1) process with $\lvert\varphi\rvert < 1$. That is, $Y_t = \varphi Y_{t-1} + e_t$, where the $e_t$ are white noise with mean $0$ and variance $\sigma^2$. Also note that the $e_t$'s are independent of $Y_{t-1}, Y_{t-2}, \ldots$.
Part (a)
\fbox{\begin{minipage}{.8\textwidth} Find the autocorrelation function for $W_t = Y_t - Y_{t-1}$ in terms of $\varphi$ and $k$. \end{minipage}}
Observe that
$$ Y_t = \varphi Y_{t-1} + e_t, $$
and thus
$$ W_t = Y_t - Y_{t-1} = (\varphi - 1) Y_{t-1} + e_t. $$
The autocovariance function for $\{W_t\}$, denoted by $\gamma_{\{W_t\}}(k)$, is defined as
$$ \gamma_{\{W_t\}}(k) = \cov(W_t,W_{t-k}). $$
Assuming $k > 0$ (we solve directly for the variance in the case $k = 0$) and replacing $W_t$ and $W_{t-k}$ with their respective definitions yields
\begin{align*}
\gamma_{\{W_t\}}(k) &= \cov((\varphi - 1) Y_{t-1} + e_t,\,(\varphi - 1) Y_{t-k-1} + e_{t-k})\\
&= (\varphi - 1)^2 \cov(Y_{t-1},Y_{t-k-1}) + (\varphi - 1)\cov(Y_{t-1},e_{t-k}),
\end{align*}
since $e_t$ is uncorrelated with both $Y_{t-k-1}$ and $e_{t-k}$ for $k > 0$. The remaining cross term does not vanish: writing $Y_{t-1} = \sum_{j \geq 0} \varphi^j e_{t-1-j}$ gives $\cov(Y_{t-1},e_{t-k}) = \varphi^{k-1}\sigma^2$.
Observe that $\cov(Y_{t-1},Y_{t-k-1}) = \gamma_{\{Y_t\}}(k)$. Since $\{Y_t\}$ is a stationary AR(1) process,
$$ \gamma_{\{Y_t\}}(k) = \varphi^k \frac{\sigma^2}{1-\varphi^2}. $$
Thus, for $k > 0$,
$$ \gamma_{\{W_t\}}(k) = (\varphi - 1)^2 \varphi^k \frac{\sigma^2}{1-\varphi^2} + (\varphi - 1)\varphi^{k-1}\sigma^2 = -\frac{(1-\varphi)\,\varphi^{k-1}\sigma^2}{1+\varphi}. $$
The variance of $W_t$ is given by
\begin{align*}
\cov(W_t,W_t) &= \cov((\varphi - 1) Y_{t-1} + e_t,\, (\varphi - 1) Y_{t-1} + e_t)\\
&= (\varphi - 1)^2 \cov(Y_{t-1},Y_{t-1}) + \cov(e_t,e_t)\\
&= (\varphi - 1)^2 \frac{\sigma^2}{1-\varphi^2} + \sigma^2\\
&= \sigma^2\left(1 + \frac{(\varphi - 1)^2}{1-\varphi^2}\right) = \frac{2\sigma^2}{1+\varphi},
\end{align*}
where the cross terms vanish because $e_t$ is independent of $Y_{t-1}$.
Thus, for $k > 0$, the autocorrelation function is given by
$$ \corr(W_t,W_{t-k}) = \frac{\gamma_{\{W_t\}}(k)}{\var(W_t)} = \frac{-\frac{(1-\varphi)\,\varphi^{k-1}\sigma^2}{1+\varphi}}{\frac{2\sigma^2}{1+\varphi}}, $$
which simplifies to
$$ \corr(W_t,W_{t-k}) = -\frac{(1-\varphi)\,\varphi^{k-1}}{2}. $$
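A quick simulation check of this autocorrelation function; the AR coefficient and series length below are illustrative choices, not part of the problem.

# compare sample ACF of W_t = Y_t - Y_{t-1} with -(1 - phi) * phi^(k - 1) / 2
set.seed(1)
phi <- 0.6                                  # illustrative AR(1) coefficient
y <- arima.sim(model = list(ar = phi), n = 1e5)
w <- diff(y)
acf(w, lag.max = 3, plot = FALSE)$acf[2:4]  # sample ACF at lags 1-3
-(1 - phi) * phi^(0:2) / 2                  # theoretical ACF at lags 1-3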
Part (b)
In part (a), we found that
$$ \var(W_t) = \sigma^2\left(1 + \frac{(\varphi - 1)^2}{1-\varphi^2}\right) = \frac{2\sigma^2}{1+\varphi}. $$
Problem 1.3
Suppose $Y_t = X_t + e_t$, where the $e_t$ are normal white noise with mean $0$ and variance $\sigma_e^2$. The process $\{X_t\}$ is a stationary AR(1) defined by $X_t = \varphi X_{t-1} + Z_t$, where $\{Z_t\}$ is a zero mean normal white noise process with variance $\sigma_Z^2$. As usual, in the AR(1) process, assume that $Z_t$ is independent of $X_{t-1}, X_{t-2}, \ldots$. Assume additionally that $\expect(e_t Z_s) = 0$ for all $t$ and $s$.
Part (a)
\fbox{Show that $\{Y_t\}$ is stationary and find its autocovariance function, $\gamma_k$.}
To be stationary, $\{Y_t\}$ must have a constant mean and an autocovariance that is strictly a function of the lag.
The mean is given by
$$ \expect(Y_t) = \expect(X_t) + \expect(e_t). $$
Since $\{X_t\}$ is AR(1) with mean $0$, we see that $\expect(Y_t) = 0$, i.e., the mean is a constant zero.
The variance is given by
$$ \var(Y_t) = \var(X_t) + \sigma_e^2. $$
Since $\{X_t\}$ is AR(1), its variance is $\sigma_Z^2/(1-\varphi^2)$, thus
$$ \var(Y_t) = \frac{\sigma_Z^2}{1-\varphi^2} + \sigma_e^2. $$
The autocovariance of $\{Y_t\}$ is given by
$$ \gamma_k = \cov(Y_t,Y_{t-k}) = \expect(Y_t Y_{t-k}) - \expect(Y_t)\expect(Y_{t-k}). $$
Since $\{Y_t\}$ has a constant expectation of zero, this simplifies to
$$ \gamma_k = \cov(Y_t,Y_{t-k}) = \expect(Y_t Y_{t-k}). $$
Observe that $Y_t = X_t + e_t = \varphi X_{t-1} + Z_t + e_t$ and that, for $k > 0$, $Y_{t-k}$ is uncorrelated with both $Z_t$ and $e_t$.
The expectation of $Y_t Y_{t-k}$ for $k > 0$ is given by
\begin{align*}
\expect(Y_t Y_{t-k}) &= \varphi \expect(Y_{t-k} X_{t-1}) + \expect(Y_{t-k} Z_t) + \expect(Y_{t-k} e_t)\\
&= \varphi \expect(Y_{t-k} X_{t-1}) + \expect(Y_{t-k}) \expect(Z_t) + \expect(Y_{t-k}) \expect(e_t)\\
&= \varphi \expect(Y_{t-k} X_{t-1})\\
&= \varphi \expect((X_{t-k} + e_{t-k}) X_{t-1})\\
&= \varphi \expect(X_{t-1} X_{t-k} + e_{t-k} X_{t-1})\\
&= \varphi \left(\expect(X_{t-1} X_{t-k}) + \expect(e_{t-k} X_{t-1})\right)\\
&= \varphi \expect(X_{t-1} X_{t-k}),
\end{align*}
where the last step uses $\expect(e_{t-k} X_{t-1}) = 0$.
Since $\{X_t\}$ is AR(1), observe that its autocovariance function satisfies $\gamma_{\{X_t\}}(k) = \varphi \expect(X_{t-1} X_{t-k})$ for $k > 0$, which gives the recursion
\begin{equation}
\gamma_{\{X_t\}}(k) = \begin{cases} \frac{\sigma_Z^2}{1-\varphi^2} & k = 0\\ \varphi\, \gamma_{\{X_t\}}(k-1) & k > 0, \end{cases}
\end{equation}
with closed-form solution $\gamma_{\{X_t\}}(k) = \varphi^k \frac{\sigma_Z^2}{1-\varphi^2}$.
Thus, the autocovariance function for $\{Y_t\}$ is given by
\begin{equation}
\gamma_k = \begin{cases} \frac{\sigma_Z^2}{1-\varphi^2} + \sigma_e^2 & k = 0\\ \gamma_{\{X_t\}}(k) & k > 0. \end{cases}
\end{equation}
Since its autocovariance function is strictly a function of the lag and its mean is a constant zero, $\{Y_t\}$ is stationary. Note that it is not just weakly stationary, but strongly stationary, given the normally distributed random errors.
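The following R sketch (with illustrative parameter values) compares the sample autocovariances of a simulated $Y_t = X_t + e_t$ against the piecewise formula above.

# check gamma_0 = sigmaZ^2 / (1 - phi^2) + sigmaE^2 and gamma_k = phi^k * sigmaZ^2 / (1 - phi^2)
set.seed(1)
phi <- 0.7; sigmaZ <- 1; sigmaE <- 0.5; n <- 1e5  # illustrative values
x <- arima.sim(model = list(ar = phi), n = n, sd = sigmaZ)
y <- x + rnorm(n, sd = sigmaE)
acf(y, lag.max = 3, type = "covariance", plot = FALSE)$acf[1:4]           # sample gamma_0, ..., gamma_3
c(sigmaZ^2 / (1 - phi^2) + sigmaE^2, phi^(1:3) * sigmaZ^2 / (1 - phi^2))  # theoretical values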
Part (b)
\fbox{\begin{minipage}{.8\textwidth} Show that the process $\{U_t\}$, where $U_t = Y_t - \varphi Y_{t-1}$, has nonzero autocorrelation only at lag 1 (excluding lag 0, of course!). \end{minipage}}
The autocovariance is given by
\begin{align*}
\gamma_{\{U_t\}}(k) &= \cov(U_t,U_{t-k})\\
&= \cov(Y_t - \varphi Y_{t-1},\, Y_{t-k} - \varphi Y_{t-k-1}).
\end{align*}
Observe that $Y_t = X_t + e_t$. Since $X_t - \varphi X_{t-1} = Z_t$, we see that
$$ U_t = Y_t - \varphi Y_{t-1} = (X_t - \varphi X_{t-1}) + e_t - \varphi e_{t-1} = e_t + Z_t - \varphi e_{t-1} $$
and
$$ U_{t-k} = e_{t-k} + Z_{t-k} - \varphi e_{t-k-1}. $$
Thus,
$$ \gamma_{\{U_t\}}(k) = \cov(e_t + Z_t - \varphi e_{t-1},\, e_{t-k} + Z_{t-k} - \varphi e_{t-k-1}). $$
If $k > 1$, then $\gamma_{\{U_t\}}(k) = 0$, since the two arguments have no terms in common. If $k = 1$, then
\begin{align*}
\gamma_{\{U_t\}}(1) &= \cov(e_t + Z_t - \varphi e_{t-1},\, e_{t-1} + Z_{t-1} - \varphi e_{t-2})\\
&= \cov(-\varphi e_{t-1}, e_{t-1})\\
&= -\varphi \var(e_{t-1})\\
&= -\varphi \sigma_e^2,
\end{align*}
which is the only lag (other than lag 0) with nonzero autocovariance, and hence nonzero autocorrelation.
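A corresponding empirical check, again with illustrative parameter values: the sample ACF of $U_t$ should be essentially zero at every lag beyond 1.

# the ACF of U_t = Y_t - phi * Y_{t-1} should be (essentially) zero beyond lag 1
set.seed(1)
phi <- 0.7; sigmaZ <- 1; sigmaE <- 0.5; n <- 1e5       # illustrative values
x <- arima.sim(model = list(ar = phi), n = n, sd = sigmaZ)
y <- x + rnorm(n, sd = sigmaE)
u <- y[-1] - phi * y[-n]                               # U_t for t = 2, ..., n
acf(u, lag.max = 5, plot = FALSE)$acf[2:6]             # only lag 1 stands out
-phi * sigmaE^2 / (sigmaZ^2 + (1 + phi^2) * sigmaE^2)  # theoretical lag-1 correlation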
Problem 1.4
Suppose that $\{e_t\}$ is a zero mean white noise process with variance $\sigma_e^2$. Consider: \begin{enumerate} \item[(i)] $y_t = 0.80\, y_{t-1} - 0.15\, y_{t-2} + e_t - 0.30\, e_{t-1}$ \item[(ii)] $y_t = y_{t-1} - 0.50\, y_{t-2} + e_t - 1.2\, e_{t-1}$. \end{enumerate}
Part (a)
\fbox{Identify each model as an ARMA(p, q) process; that is, specify $p$ and $q$.}
\begin{enumerate} \item We rewrite equation (i),
$$ y_t = 0.80 \backshift y_t - 0.15 \backshift^2 y_t + e_t - 0.30 \backshift e_t. $$
Now, we rewrite it into the form $\varphi(\backshift) y_t = \theta(\backshift) e_t$ and factor the AR polynomial:
\begin{align*}
(1 - 0.8 \backshift + 0.15 \backshift^2) y_t &= (1 - 0.3 \backshift) e_t\\
(1 - 0.5 \backshift)(1 - 0.3 \backshift) y_t &= (1 - 0.3 \backshift) e_t\\
(1 - 0.5 \backshift) y_t &= e_t.
\end{align*}
We see that $(1 - 0.5\backshift) y_t = e_t$: the AR and MA polynomials share the common factor $(1 - 0.3\backshift)$, which cancels, so the original model is over-parameterized. Thus
$$ y_t = 0.5\, y_{t-1} + e_t, $$
where $\{e_t\}$ is the original zero mean white noise process with variance $\sigma_e^2$ and $\{y_t\}$ is AR(1), i.e., an ARMA(1, 0) process.
\item We rewrite equation (ii),
$$ y_t = \backshift y_t - 0.50 \backshift^2 y_t + e_t - 1.2 \backshift e_t. $$
Now, we rewrite it into the form $\varphi(\backshift) y_t = \theta(\backshift) e_t$ and factor the AR polynomial:
\begin{align*}
(1 - \backshift + 0.5 \backshift^2) y_t &= (1 - 1.2 \backshift) e_t\\
0.5 (\backshift - 1 + i)(\backshift - 1 - i) y_t &= (1 - 1.2 \backshift) e_t.
\end{align*}
We see that this is an ARMA(2,1) process.
\end{enumerate}
Part (b)
\fbox{Determine whether each model is stationary and/or invertible.} Time series (i) reduces to an AR(1) process and is thus invertible. We also know that it is stationary since the AR coefficient satisfies $|0.5| < 1$ (equivalently, the root of $1 - 0.5\backshift$ is $2$, which lies outside the unit circle).
Time series (ii) is ARMA(2,1). Let $\varphi(\backshift) = 1 - \backshift + 0.5\backshift^2$, which has roots $1 + i$ and $1 - i$, both with modulus $\sqrt{2}$. This is larger than $1$, so the process is stationary. Let $\theta(\backshift) = 1 - 1.2\backshift$, which has root $0.8\overline{3}$. Since $\lvert 0.8\overline{3} \rvert < 1$, the process is not invertible.
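These roots can be double-checked in R with polyroot; coefficients are listed in increasing powers of $\backshift$.

# model (i): AR polynomial 1 - 0.8B + 0.15B^2 and MA polynomial 1 - 0.3B
polyroot(c(1, -0.8, 0.15))    # roots 2 and 10/3, i.e. (1 - 0.5B)(1 - 0.3B)
polyroot(c(1, -0.3))          # root 10/3: the common factor that cancels
# model (ii): AR polynomial 1 - B + 0.5B^2 and MA polynomial 1 - 1.2B
Mod(polyroot(c(1, -1, 0.5)))  # both moduli are sqrt(2) > 1, so stationary
polyroot(c(1, -1.2))          # root 0.833... has modulus < 1, so not invertible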
Problem 2.1
The Johnson and Johnson dataset contains quarterly earnings per share for the U.S. company Johnson & Johnson. There are 84 quarters (21 years) measured from the first quarter of 1960 to the last quarter of 1980. To load the dataset, run the following: install.packages("astsa"); library(astsa). The dataset is under the name jj. Do a log transformation of the original time series before answering the following.
Preliminary analysis
We would like to take a look at a simple plot of the data, prior to any transformations.
library(astsa)
# note: ts(data = jj) re-indexes the series at times 1, ..., 84 (frequency 1)
tsdata <- ts(data=jj)
plot(tsdata)
We see that the variance increases over time. A log transformation will stabilize it; the following code computes both the plain log and a version rescaled by the geometric mean of the series:
n <- length(tsdata)
# geometric mean of the series, used to rescale the logged data
A <- exp((1/n)*sum(log(tsdata)))
# rescaled log transformation
ys <- A*log(tsdata)
# plain log transformation
log_j <- log(tsdata)
Part (a)
\fbox{\begin{minipage}{.8\textwidth} Construct a time series plot for the logged data. Comment on overall trend and seasonality variation. \end{minipage}}
We generate the plot with the following R code:
plot(ys)
The data has both seasonality and a (positive) trend.
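As a quick check of both features, one can decompose the logged series; the sketch below uses log(jj) directly, since the jj object retains its quarterly frequency, unlike the ts(data=jj) copy above.

# split the logged quarterly series into trend, seasonal, and remainder components
plot(decompose(log(jj)))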
Part (b)
\fbox{\begin{minipage}{.8\textwidth} Fit a regression model on the logged data with a linear time trend and quarterly indicator variables $Q_i(t)$,
where $Q_i(t) = 1$ if time $t$ corresponds to quarter $i$ and zero otherwise. Assume the error term is a normal white noise sequence. Report the model coefficient estimates. Superimpose the fitted values on the time plot in part (a). Note: you will need to first create a variable for time and quarter. To do that, you may use: t=1:84; qt=as.factor(rep(1:4,21)). \end{minipage}}
We perform the model fitting using the following R code:
t <- 1:n
# quarter label (1-4) for each observation
qt <- as.factor(rep(1:4,(n/4)))
# indicator variables for quarters 1-3 (quarter 4 is the baseline)
q1 <- qt==1
q2 <- qt==2
q3 <- qt==3
m <- cbind(t,q1,q2,q3,ys)
# fit regression model to the rescaled-log data
fit <- lm(ys~t+q1+q2+q3, data=m)
# same model fit to the plain logged data
fit2 <- lm(log_j~t+q1+q2+q3, data=m)
qt2 <- as.factor(rep(1:4,(n/4)))
# equivalent fit using the quarter factor directly (quarter 1 as baseline)
fit3 <- lm(log_j~t+qt)
# better approach:
# fit <- lm(ys~t+qt)
# where qt are the factors (1,2,3,4)
The model coefficients are given by:
summary(fit)
##
## Call:
## lm(formula = ys ~ t + q1 + q2 + q3, data = m)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.8847 -0.2735 -0.0356 0.2553 0.8342
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.508319 0.111529 -22.490 < 2e-16 ***
## t 0.126112 0.001704 73.999 < 2e-16 ***
## q1 0.514570 0.116866 4.403 3.31e-05 ***
## q2 0.599431 0.116803 5.132 2.01e-06 ***
## q3 0.810985 0.116766 6.945 9.50e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3783 on 79 degrees of freedom
## Multiple R-squared: 0.9859, Adjusted R-squared: 0.9852
## F-statistic: 1379 on 4 and 79 DF, p-value: < 2.2e-16
summary(fit2)
##
## Call:
## lm(formula = log_j ~ t + q1 + q2 + q3, data = m)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.29318 -0.09062 -0.01180 0.08460 0.27644
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.8312482 0.0369603 -22.490 < 2e-16 ***
## t 0.0417930 0.0005648 73.999 < 2e-16 ***
## q1 0.1705267 0.0387289 4.403 3.31e-05 ***
## q2 0.1986494 0.0387083 5.132 2.01e-06 ***
## q3 0.2687577 0.0386959 6.945 9.50e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1254 on 79 degrees of freedom
## Multiple R-squared: 0.9859, Adjusted R-squared: 0.9852
## F-statistic: 1379 on 4 and 79 DF, p-value: < 2.2e-16
summary(fit3)
##
## Call:
## lm(formula = log_j ~ t + qt)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.29318 -0.09062 -0.01180 0.08460 0.27644
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.6607215 0.0358430 -18.434 < 2e-16 ***
## t 0.0417930 0.0005648 73.999 < 2e-16 ***
## qt2 0.0281227 0.0386959 0.727 0.4695
## qt3 0.0982310 0.0387083 2.538 0.0131 *
## qt4 -0.1705267 0.0387289 -4.403 3.31e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1254 on 79 degrees of freedom
## Multiple R-squared: 0.9859, Adjusted R-squared: 0.9852
## F-statistic: 1379 on 4 and 79 DF, p-value: < 2.2e-16
In other words, using the plain logged series (fit2), the estimated model is
$$ \hat{x}_t = -0.8312 + 0.0418\, t + 0.1705\, Q_1(t) + 0.1986\, Q_2(t) + 0.2688\, Q_3(t), $$
where $x_t$ denotes the logged earnings and $Q_i(t)$ the quarter indicators.
The plot of the logged data with the fitted values superimposed onto it is given by:
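A minimal sketch of that plot, assuming the fit2 model on the plain logged series from above; the color and legend choices here are illustrative.

# plot the logged series and overlay the fitted regression values
plot(log_j, ylab = "log(earnings)")
lines(t, fitted(fit2), col = "red", lty = 2)
legend("topleft", legend = c("logged data", "fitted values"),
       col = c("black", "red"), lty = c(1, 2))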