## Guided tour on VAR innovation response analysis


### Introduction

In this guided tour I will explain how to conduct vector autoregression (VAR) innovation response analysis, including structural vector autoregression innovation response analysis.

The theory involved is explained in my lecture notes on vector time series and innovation response analysis, but here I will review the main ideas, based on the seminal papers:

• Bernanke, B.S. (1986): "Alternative Explanations of the Money-Income Correlation", Carnegie-Rochester Conference Series on Public Policy 25, 49-100
• Sims, C.A. (1980): "Macroeconomics and Reality", Econometrica 48, 1-48
• Sims, C.A. (1986): "Are Forecasting Models Usable for Policy Analysis?", Federal Reserve Bank of Minneapolis Quarterly Review, 1-16

The starting point is a k-variate Gaussian VAR(p) model:

Xt = c0 + C1Xt-1 + ..... + CpXt-p + Ut ,

where

• Xt = (X1,t, ..... ,Xk,t)' is a vector time series of macroeconomic variables,
• c0 is a k-vector of intercept parameters,
• the Cj are k×k parameter matrices, and
• Ut is the error vector, which is assumed to be i.i.d. k-variate normally distributed with expectation the zero vector, and variance matrix S.

The VAR(p) model involved can be written as

C(L)Xt = c0 + Ut,

where

C(L) = Ik - C1L - ..... - CpLp

is a matrix-valued lag polynomial, with L the lag operator: LXt = Xt-1.

The process Xt is strictly stationary if det[C(z)] has all its roots outside the complex unit circle. Then C(L) is invertible, i.e., there exist k×k parameter matrices Dj, with D0 = Ik and ∑j≥0DjDj' a finite matrix, such that

C(L)-1 = ∑j≥0DjLj.

Hence, the process Xt has a stationary MA(∞) representation:

Xt = μ + ∑j≥0DjUt-j,

where μ = (∑j≥0Dj)c0. Note that E[Xt] = μ and Var[Xt] = ∑j≥0DjSDj'.

Since the Ut's are i.i.d. with E[Ut] = 0, it follows now that for m ≥ 0:

E[Xt+m|Ut] - E[Xt+m] = DmUt .

The latter is the basis for innovation response analysis, i.e., E[Xt+m|Ut] - E[Xt+m] is the net effect of the innovation Ut on the future values Xt+m of Xt.
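The MA coefficients Dj can be computed recursively from the VAR coefficient matrices by matching powers of L in C(L)·∑j Dj Lj = Ik. A minimal sketch in Python, using hypothetical bivariate VAR(2) coefficient values (not taken from any estimated model):

```python
import numpy as np

def ma_coefficients(C, horizon):
    """D_0 = I_k, D_m = sum_{j=1}^{min(m,p)} C_j D_{m-j}: the MA coefficients
    obtained by matching powers of the lag operator L in C(L).sum_j D_j L^j = I_k."""
    k = C[0].shape[0]
    D = [np.eye(k)]
    for m in range(1, horizon + 1):
        D.append(sum(C[j - 1] @ D[m - j] for j in range(1, min(m, len(C)) + 1)))
    return D

# Hypothetical bivariate VAR(2) coefficient matrices (illustrative values only).
C1 = np.array([[0.5, 0.1],
               [0.2, 0.3]])
C2 = np.array([[0.1, 0.0],
               [0.0, 0.1]])

D = ma_coefficients([C1, C2], horizon=10)
# The innovation response at horizon m is D_m U_t; e.g. D_1 = C_1 and
# D_2 = C_1 D_1 + C_2 D_0 = C_1 C_1 + C_2.
```

The recursion follows directly from D0 = Ik and Dm - ∑j Cj Dm-j = 0 for m ≥ 1.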

### Non-structural VAR innovation response analysis

Sims (1980) proposes to interpret the components of the innovation vector Ut = (U1,t, ..... ,Uk,t)' as policy shocks. The problem however is that the components U1,t, ..... ,Uk,t of Ut are not independent, so that it is unrealistic to assume that a shock in one of these components does not affect the other components. In order to solve this problem, Sims (1980) proposes to rewrite Ut as

Ut = Det,

where D is a lower triangular matrix such that

S = DD'.

Then εt is i.i.d. Nk[0,Ik]. The components ε1,t, ..... ,εk,t of εt are uniquely associated with the corresponding components of Ut. Consequently, we can now interpret ε1,t, ..... ,εk,t as the actual innovations, and moreover we may consider them as sequential policy shocks: at time t a shock ε1,t is imposed, then the next shock ε2,t is imposed, etc., up to the last shock εk,t. Then the response of Xt to a unit shock in εj,t is:

E[Xt+m|εj,t = 1] - E[Xt+m] = Dmdj for m = 0,1,2,3,.....,

where dj is column j of D, and thus the response of Xi,t to a unit shock in εj,t is given by

ri,j(m) = E[Xi,t+m|εj,t = 1] - E[Xi,t+m] = di,m'dj for m = 0,1,2,3,.....,

where di,m' is row i of Dm.
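The decomposition S = DD' with D lower triangular is the Cholesky factorization of S, so the orthogonalized responses are easily computed numerically. A sketch with illustrative values for S and for one MA coefficient matrix Dm (neither comes from an actual estimation):

```python
import numpy as np

# Hypothetical 2x2 innovation variance matrix S (illustrative values only).
S = np.array([[1.0, 0.4],
              [0.4, 0.5]])

# Lower triangular D with S = DD' (the Cholesky factorization of S).
D = np.linalg.cholesky(S)

# Orthogonalized responses at horizon m: column j of Dm @ D is the
# response vector Dm dj of Xt+m to a unit shock in the innovation e_{j,t}.
Dm = np.array([[0.5, 0.1],   # illustrative MA coefficient matrix at horizon m
               [0.2, 0.3]])
responses = Dm @ D
r_1_2 = responses[0, 1]      # r_{1,2}(m): response of X_1 to a unit shock in e_2
```

Note that D, and hence the responses, depend on the ordering of the variables in Xt, which is exactly the point Sims' recursive scheme exploits.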

### Structural VAR innovation response analysis

A disadvantage of this approach is that economic theory plays a limited role. The only role of economic theory is to determine the order in which the innovation shocks are imposed. This order corresponds to the order in which the macroeconomic variables in Xt are arranged. Therefore, Bernanke (1986) and Sims (1986) propose to set up the VAR(p) model as a system of simultaneous equations:

B.Xt = a0 + A1Xt-1 + ..... + ApXt-p + εt,

where εt is i.i.d. Nk[0,Ik]. The matrix B represents the contemporaneous relations between the components of Xt. This structural VAR(p) model is related to the non-structural VAR(p) model

Xt = c0 + C1Xt-1 + ..... + CpXt-p + Ut ,

by

Xt = B-1a0 + B-1A1Xt-1 + ..... + B-1ApXt-p + B-1εt.

Hence, c0 = B-1a0, Cj = B-1Aj for j = 1,..,p, and Ut = B-1εt. The latter reads as

B.Ut = εt.

Therefore, effectively the matrix B of structural parameters links the non-structural innovations Ut to the structural innovations εt.

The main difference with the non-structural approach is the way the variance matrix S of Ut is decomposed, i.e., instead of writing Ut = Dεt with D a lower triangular matrix we now have Ut = B-1εt, hence S = (B-1)(B-1)' = (B'B)-1, and thus

B'B = S-1.

Given S, and taking into account the symmetry of S, the equality B'B = S-1 is a system of (k + k²)/2 nonlinear equations in the k² elements of B. Therefore, in order to solve this system, one has to set at least (k² - k)/2 off-diagonal elements of B to zero, similarly to classical simultaneous equation systems. This is where economic theory comes into the picture: the zeros in B are exclusion restrictions prescribed by economic theory.

Note that even if we reduce the system B'B = S-1 to (k + k²)/2 equations in (k + k²)/2 unknowns, there is no guarantee that a solution exists, because the equations involved are quadratic. But assuming that we have spread the zeros in B such that a solution exists, the structural innovation response of Xi,t to a unit shock in εj,t is given by

ri,j(m) = E[Xi,t+m|εj,t = 1] - E[Xi,t+m] = di,m'bj for m = 0,1,2,3,.....,

where again di,m' is row i of Dm, and bj is now column j of B-1.
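For the special case where B is restricted to be lower triangular, the system B'B = S-1 has an explicit solution, B = L-1 with S = LL' the Cholesky factorization, and the structural responses then coincide with the non-structural Cholesky-ordered ones. A small numerical check of this claim, with an illustrative positive definite S:

```python
import numpy as np

# Illustrative variance matrix S of the non-structural innovations U_t.
S = np.array([[1.0, 0.4, 0.2],
              [0.4, 0.5, 0.1],
              [0.2, 0.1, 0.3]])

L = np.linalg.cholesky(S)   # S = LL', with L lower triangular
B = np.linalg.inv(L)        # lower triangular, and satisfies B'B = S^{-1}

assert np.allclose(B.T @ B, np.linalg.inv(S))
# B^{-1} = L, so the structural response D_m b_j equals the
# non-structural (Cholesky-ordered) response D_m d_j.
assert np.allclose(np.linalg.inv(B), L)
```

With other exclusion patterns no closed form is available in general, which is why a numerical solution of B'B = S-1 is needed.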

### Estimation and inference

The non-structural VAR(p) model can be estimated by maximum likelihood. Given the maximum likelihood estimators of the coefficient matrices Cj for j = 1,..,p, and the variance matrix S, together with their joint normal asymptotic distribution, it is possible to derive the asymptotic standard error of each innovation response ri,j(m). EasyReg International endows the estimated innovation responses with one- and two-times standard error bands based on the asymptotic normal distribution of each estimated innovation response around the true innovation response, in order to determine whether the latter is significantly different from zero. The two-times standard error band corresponds approximately to the pointwise 95% confidence interval of each innovation response; the one-time standard error band corresponds approximately to the pointwise 70% confidence interval.

EasyReg International estimates a structural VAR model in three steps. First, the non-structural VAR is estimated by maximum likelihood. Next, given the estimated variance matrix S and the specification of the matrix B, EasyReg will try to solve the nonlinear equations system B'B = S-1 analytically, and if this is not possible, it will minimize the maximum absolute value of the elements of the matrix B'B - S-1. The latter is a form of method of moments estimation. Finally, using the non-structural parameter estimates and the solution of B as starting values, EasyReg re-estimates the parameters by maximum likelihood. Given these maximum likelihood estimates, the innovation responses and standard error bands are computed in the same way as in the non-structural case.

Note that direct maximum likelihood estimation of the structural VAR model (as some other econometric software packages do) is not advisable because the likelihood function is highly nonlinear in the non-zero elements of B, and therefore you may get stuck in a local maximum.

## VAR innovation response analysis with EasyReg

### The data

The data are taken from the EasyReg database, namely the following quarterly data for the US:

• federal funds rate
• M2 (= money)
• cons.price index (= consumer price index)
• nominal GDP

Since VAR innovation response analysis assumes normal errors of the VAR, and the variables involved are all positive valued, transform them by taking logs, using the 'Transform variables' option via Menu > Input:

• LN[federal funds rate]
• LN[M2]
• LN[cons.price index]
• LN[nominal GDP]

The last three variables are likely nonstationary. Therefore, take them in first differences, using the option Menu > Input > Transform variables > Time series transformations. Then select the variables in Xt = (X1,t,X2,t,X3,t,X4,t)' in the following order:

• X1,t = LN[federal funds rate]
• X2,t = DIF1[LN[M2]]
• X3,t = DIF1[LN[cons.price index]]
• X4,t = DIF1[LN[nominal GDP]]

for t = 1,2,...,142, from quarter 1959.2 (t = 1) to quarter 1994.3 (t = 142).

### VAR model specification

Open Menu > Multiple equations models > VAR innovation response analysis, select the variables in the VAR in the above order, and click "Selection OK". Then the following window appears.

I will not select a subset of observations. Thus click "No" and then "Continue":

This window is only for your information. Click "Continue":

In the introduction above I have discussed only a VAR model with intercept parameter vector c0. However, if Xt (= z(t) in EasyReg) is stationary around a deterministic function of time, i.e., Xt - E[Xt] is stationary, we can still conduct VAR innovation response analysis. The VAR(p) model then takes the form:

Xt = C0d(t) + C1Xt-1 + ..... + CpXt-p + Ut ,

where d(t) is a vector of deterministic functions of time t, and C0 (= B in EasyReg) is the corresponding matrix of coefficients. Note that

E[Xt] = C(L)-1C0d(t).

The default specification of d(t) is d(t) = 1. Other options are d(t) = (1,t)', seasonal dummy variables (only in the case of seasonal data, of course), and Chebyshev time polynomials. The latter can be used to capture nonlinear time trends. The Chebyshev time polynomials have been used in my paper

• Bierens, H.J. (2000), "Nonparametric Nonlinear Co-Trending Analysis, with an Application to Inflation and Interest in the U.S.", Journal of Business & Economic Statistics 18, 323-337,

which can be downloaded from my web site. You have to read this paper in order to learn how and whether to use this option. I will not discuss it here.

It is logically impossible that the (transformed) data contain a linear time trend, because that would imply that the expectations of some of the variables involved converge to plus or minus infinity.

Since the data are quarterly, and because it is not clear whether they are seasonally adjusted, I recommend including seasonal dummy variables in the first instance, in addition to the constant 1. Whether seasonal dummy variables are needed can be tested. Thus click "Seasonal dummies":

Note that only three quarterly dummies are included next to the constant 1, because the four seasonal dummies add up to 1 and would therefore be perfectly multicollinear with 1.

Now click "d(t) is OK":

There are various ways to determine the order p of the VAR(p) model

Xt = C0d(t) + C1Xt-1 + ..... + CpXt-p + Ut.

Via this window you can determine p automatically by one of three information criteria:

• Akaike = ln[det(S)] + 2[1/(n-p)].(m+p.k²)
• Hannan-Quinn = ln[det(S)] + 2.[ln(ln(n-p))/(n-p)].(m+p.k²)
• Schwarz = ln[det(S)] + 2.[ln(n-p)/(n-p)].(m+p.k²)

where m is the number of parameters in the matrix C0, k is the dimension of Xt, n is the length of the vector time series involved, and S is the estimated variance matrix of Ut. The quantity ln[det(S)] is a measure of the fit of the model, which is penalized by a function of the sample size n and the number of parameters, similarly to the adjusted R² in OLS. Starting from an upper bound of p (8 in this case), the estimated p corresponds to the minimum value of these criteria. The Akaike criterion is the most conservative of the three, and may give too large a p. The other two criteria are consistent, i.e., the estimated p is equal to the true p with probability converging to 1 as n converges to infinity.
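The three criteria are straightforward to compute once ln[det(S)] is available for each candidate order. A sketch, where the ln[det(S)] values per order p are made-up illustrative numbers (not EasyReg output), with n = 142, k = 4, and m = 16 (an intercept plus three seasonal dummies in each of the four equations):

```python
from math import log

def var_criteria(logdet_S, n, p, k, m):
    """Akaike, Hannan-Quinn and Schwarz criteria as defined above."""
    npar = m + p * k**2
    akaike = logdet_S + 2.0 * npar / (n - p)
    hannan_quinn = logdet_S + 2.0 * log(log(n - p)) * npar / (n - p)
    schwarz = logdet_S + 2.0 * log(n - p) * npar / (n - p)
    return akaike, hannan_quinn, schwarz

# Hypothetical ln[det(S)] values for p = 1, 2, 3; pick the p that
# minimizes each criterion.
for p, logdet in [(1, -35.2), (2, -35.6), (3, -35.7)]:
    print(p, var_criteria(logdet, n=142, p=p, k=4, m=16))
```

Since ln(n-p) > ln(ln(n-p)) > 1 for the sample sizes involved, the Schwarz criterion penalizes extra lags hardest and the Akaike criterion least, which is why Akaike tends to select the largest p.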

Another way to determine p is through testing the joint significance of the parameters in the matrices Cj. I will consider this later.

I have chosen 8 as the upper bound of p. Now click "p OK":

In view of these results, I have chosen p = 2.

This window is only for your information. Click "Continue".

This window enables you to impose Granger-causality restrictions on the VAR. See my lecture notes on vector time series and innovation response analysis. Granger-causality will be discussed below by a separate example. Thus, click "Continue".

Since each equation in the VAR model

Xt = C0d(t) + C1Xt-1 + ..... + CpXt-p + Ut, t = p+1,...,n,

has the same right-hand side variables, and there are no parameter restrictions imposed, the maximum likelihood estimators of the parameters in the matrices Cj for j = 0,1,...,p are the same as the OLS estimators. Thus, click "OLS estimation" first. After EasyReg is done with OLS estimation, the button "FIML estimation" will be enabled. FIML stands for Full Information Maximum Likelihood. Given the vectors Rt of OLS residuals, the maximum likelihood estimator of the variance matrix S of the VAR error vector Ut is

Sn = (n-p)-1∑p+1≤t≤nRtRt',

which is decomposed as

Sn = DnDn',

where Dn (= L in EasyReg) is a lower triangular matrix. We need FIML in order to compute the variance matrix of the non-zero elements of Dn, which in its turn is needed to compute the standard error bands of the innovation responses. Thus, click "FIML estimation" when it becomes enabled. Then the following window appears.

The variables L(.,.) are the non-zero elements of the lower triangular matrix Dn (= L).

You can now test the joint significance of any subset of parameters of the VAR. First, I have tested the joint significance of the parameters of the seasonal dummy variables: Double-click the seasonal dummies (note that each of the four equations contains three seasonal dummy variables, so that you have to double-click all 12 seasonal dummy variables), and then click "Test joint significance":

The test involved is the Wald test of the null hypothesis that all the coefficients of the seasonal dummy variables are zero. The asymptotic null distribution is χ² with 12 degrees of freedom. Clearly, the null hypothesis involved is not rejected at any conventional significance level.
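The Wald statistic itself has the standard quadratic form. A sketch with entirely hypothetical estimates (a vector of 12 seasonal-dummy coefficients and an illustrative diagonal covariance matrix, not the actual EasyReg output):

```python
import numpy as np

# Wald test of H0: R.theta = 0, with W = (R theta)'[R V R']^{-1}(R theta),
# asymptotically chi-square with rank(R) degrees of freedom under H0.
theta = np.full(12, 0.01)             # hypothetical estimated coefficients
V = np.diag(np.full(12, 0.02 ** 2))   # hypothetical covariance matrix
R = np.eye(12)                        # restriction: all 12 coefficients zero

r = R @ theta
W = float(r @ np.linalg.solve(R @ V @ R.T, r))
print(round(W, 2))   # 3.0, far below the 5% chi-square(12) critical value 21.03
```

With these illustrative numbers the statistic is well below the critical value, mirroring the non-rejection reported above.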

In view of this result, we may now respecify and re-estimate the VAR without seasonal dummy variables. However, I have not done that.

If you click "Again", you can conduct more tests.

Next, I have tested whether the VAR order p can be reduced from p = 2 to p = 1, by testing whether the 16 coefficients corresponding to the variables with lag 2 (i.e., the elements of the matrix C2) are jointly zero.

Clearly, the null hypothesis involved is rejected. Therefore I will adopt the initial choice p = 2.

Note that this procedure is an alternative way to determine the VAR order p. Given an initial value of p for which you are convinced that the actual VAR order does not exceed this initial value, test whether the elements of the matrices Cj for j = q,...,p (with q ≥ 1) in the VAR model

Xt = C0d(t) + C1Xt-1 + ..... + CpXt-p + Ut

are jointly zero, and take as the new p the largest value of q for which this hypothesis is rejected.

Now click "Continue". Then the following windows appears.

Let us conduct non-structural VAR analysis first. After you are done with that, you will return to this window so that you can conduct structural VAR analysis. The same applies the other way around.

### Non-structural VAR innovation response analysis

You have to choose the number of periods ahead (the innovation response horizon) for which you want to display the innovation responses. The minimum value is 10. Here I have chosen 40, so that the innovation responses are displayed over a period of 10 years.

Click "Start" to compute the innovation responses together with their standard errors.

You will have the option to write the numerical values of the innovation responses with their standard errors to the output file OUTPUT.TXT, but in general there is no purpose in doing this. Thus, click "Continue".

The contribution of the innovation in variable i to the h-step-ahead forecast error variance of variable j is the sum over the first h horizons of the squared responses of variable j to a unit shock in the innovation of variable i. In this window the relative contributions of each variable i to the forecast error variance of variable j are presented. This procedure is known as "variance decomposition".
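The variance decomposition can be sketched directly from the MA coefficients Dm and the Cholesky factor of S; all numerical values below are illustrative, not EasyReg output:

```python
import numpy as np

def variance_decomposition(D_list, D_chol, h):
    """Relative contribution of each orthogonalized innovation j to the
    h-step-ahead forecast error variance of each variable i:
    sum_{m=0}^{h-1} (element (i,j) of D_m @ D_chol)^2, normalized so that
    each row sums to 1."""
    k = D_chol.shape[0]
    contrib = np.zeros((k, k))
    for m in range(h):
        theta = D_list[m] @ D_chol
        contrib += theta ** 2
    return contrib / contrib.sum(axis=1, keepdims=True)

# Illustrative bivariate example: D_0 = I, D_1 = C_1 for a VAR(1),
# and the Cholesky factor of a hypothetical S.
C1 = np.array([[0.5, 0.1],
               [0.2, 0.3]])
D_list = [np.eye(2), C1]
D_chol = np.linalg.cholesky(np.array([[1.0, 0.4],
                                      [0.4, 0.5]]))
fevd = variance_decomposition(D_list, D_chol, h=2)
# Row i of fevd gives the shares of the two innovations in the
# 2-step-ahead forecast error variance of variable i.
```

The normalization works because the orthogonalized innovations have identity variance matrix, so the squared coefficients add up exactly to the forecast error variance.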

Click "Continue".

The solid curve is the response of the inflation rate to a unit shock in the innovation of the log of the federal funds rate. You see that in the first three quarters the response is significantly positive, as the two-times standard error band is above the horizontal axis, and then dies out quickly to zero. This phenomenon is known as the price puzzle. Since the FED raises the federal funds rate in order to curtail inflation, one would expect that the response of inflation to a unit shock in the innovation of the federal funds rate is negative rather than positive.

This picture is the response of the money growth rate to a unit shock in the innovation of the log of the federal funds rate. This pattern is what you would expect: If borrowing money is made more expensive, the demand for money will decrease.

When you click "Done", EasyReg will jump back to the first window of module VAR (where you select the variables).

### Structural VAR innovation response analysis

In this demonstration of structural VAR innovation response analysis I will use the same four variables, in a VAR(2) model. However, I have now excluded the seasonal dummy variables, because their coefficients were not jointly significant. The specification and estimation procedure is similar to the previous case, until you reach the following window:

The matrix A in the description of the structural VAR is a matrix of structural parameters with 1 as diagonal elements, and the matrix C is a diagonal matrix. These matrices are related to the matrix B in the structural model

B.Xt = a0 + A1Xt-1 + ..... + ApXt-p + εt,

used in the previous discussion of structural VAR analysis by the equality

B = C-1A.

Recall that the structural model relates the non-structural VAR errors Ut to the structural VAR errors εt by the relationship

B.Ut = εt.

In the following four windows the non-zero elements of each row of B are specified.

The first element on row 1 of B is always non-zero. The remaining three non-zero elements are determined by double-clicking the corresponding components of Ut. In this example I will choose the first row of B to be (b(1),0,0,0), hence I will not double-click anything, but just click "Equation OK".

When you click "Equation OK", (0,b(2),b(3),b(4)) will be chosen as the second row of B.

When you click "Equation OK", (b(5),0,b(6),0) will be chosen as the third row of B.

Finally, when you click "Equation OK", (0,0,0,b(7)) will be chosen as the fourth row of B. Then the matrix B is:

```
[b(1) 0    0    0   ]
[0    b(2) b(3) b(4)]
[b(5) 0    b(6) 0   ]
[0    0    0    b(7)]
```

Note that this specification is not intended to be a serious economic specification, but is chosen merely as an example.

EasyReg will now try to solve the equation system B'B = Sn-1 analytically:

What you see here are the equations of the system B'B = Sn-1. The equations indicated by (*) do not involve parameters, because the system is over-identified: there are more equations than unknown b(.)'s.

The non-zero parameters in B are solved analytically. If EasyReg cannot solve the system analytically, then you will likely have an identification problem.

The equations in the previous window indicated by (*) are testable hypotheses. In this example the null hypothesis that these equations hold is rejected, hence we should respecify the matrix B. However, since this is only a demonstration of structural VAR analysis, I will continue.

Click "Method 2". Then the log-likelihood will be maximized using the simplex method of Nelder and Mead, starting from the non-structural parameter estimates and the solutions of the b(.)'s.

Restart the simplex iteration until the log-likelihood does not change anymore:

When you click "Done with Simplex iteration" the following window appears.

The rest of the structural VAR innovation response analysis is now similar to the non-structural case.

## Granger-causality

### Introduction

Consider a bivariate time series process Xt =(X1,t,X2,t)'. As is well-known (or should be well-known), the best one-step ahead forecast of each component Xi,t of Xt is the conditional expectation

E[Xi,t | Xt-1,Xt-2,Xt-3,.....],

i.e., of all functions of the past of Xt, say gi(Xt-1,Xt-2,Xt-3,.....), this conditional expectation yields the smallest mean-square forecast error:

E{Xi,t - E[Xi,t | Xt-1,Xt-2,Xt-3,.....]}² ≤ E{Xi,t - gi(Xt-1,Xt-2,Xt-3,.....)}².

Now suppose that

E[X1,t | Xt-1,Xt-2,Xt-3,.....] = E[X1,t | X1,t-1,X1,t-2,X1,t-3,.....].

Then the past of the process X2,t does not contain information that can be used to improve the forecast of X1,t. If so, it is said that X2,t does not Granger-cause X1,t (named after Clive Granger at UCSD, who introduced this causality concept).

If Xt is a VAR(p) process:

Xt = c0 + C1Xt-1 + ..... + CpXt-p + Ut ,

and X2,t does not Granger-cause X1,t, then the matrices Cj for j = 1,...,p are lower-triangular, because the coefficients of the lagged X2,t in the VAR are zero.
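This restriction is easy to see in a simulation: generate a bivariate VAR(1) with a lower-triangular coefficient matrix and check that OLS recovers a (1,2) coefficient near zero. All coefficient values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a bivariate VAR(1) in which X2 does not Granger-cause X1:
# C1 is lower triangular, so lagged X2 does not enter the X1 equation.
C1 = np.array([[0.5, 0.0],
               [0.3, 0.4]])
n = 5000
X = np.zeros((n, 2))
for t in range(1, n):
    X[t] = C1 @ X[t - 1] + rng.standard_normal(2)

# OLS of X_t on X_{t-1} (the intercepts are zero here and omitted):
# the estimated (1,2) coefficient should be close to zero.
C_hat = np.linalg.lstsq(X[:-1], X[1:], rcond=None)[0].T
print(np.round(C_hat, 2))
```

In practice the hypothesis that the (1,2) elements of the Cj are zero is of course tested with a Wald test rather than eyeballed.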

### Granger-causality testing in practice

Retrieve two annual time series from the EasyReg database, namely LN[nominal GDP], which is the log of nominal GDP of the US, and LN[Income Sweden], which is the log of national income of Sweden. Then use the transformation option (via Menu > Input > Transform variables > Time series transformations) to transform these time series into first differences, in order to make them stationary:

1. DIF1[LN[nominal GDP]]
2. DIF1[LN[Income Sweden]]

These transformed time series are now growth rates. Plot them (via Menu > Data analysis > Plot time series):

You see that these time series have quite a few similar patterns. The reason is that, due to the size of the US economy, the US GDP growth rate may be considered a proxy for the world economic growth rate. Sweden is a small country and its economic performance heavily depends on exports. Therefore, the US GDP growth rate will affect the Swedish national income growth rate, but not the other way around. In other words, one may expect that DIF1[LN[Income Sweden]] does not Granger-cause DIF1[LN[nominal GDP]]. To test this, select these variables in the above order in a VAR:

It is clear from the plot that there is no time trend in these series, and you see in this window that the average growth rates (= sample means) are non-zero. Therefore, include only intercepts in the VAR. Thus, click "d(t) is OK":

I have chosen p = 6 as the initial value of p. The three information criteria all indicate that the actual value is p = 1. Therefore, I have chosen p = 1.

The coefficient of one-year-lagged DIF1[LN[Income Sweden]] in the equation for DIF1[LN[nominal GDP]] has t-value -0.17, and is therefore not significant. Consequently, the null hypothesis that DIF1[LN[Income Sweden]] does not Granger-cause DIF1[LN[nominal GDP]] is not rejected at any conventional significance level.

### VAR innovation response analysis under Granger-causality restrictions

In order to impose this restriction on the VAR(1), click "Respecify VAR", select the two variables again, and choose p = 1:

The VAR(1) involved is now of the form Xt = a0 + A1Xt-1 + Ut, where Xt = (X1,t,X2,t)' with

• X1,t = DIF1[LN[nominal GDP]]
• X2,t = DIF1[LN[Income Sweden]]

The Granger-causality restriction now amounts to specifying the matrix A1 as

```
[a  0]
[b  c]
```

say, which corresponds to the pattern

```1  0
1  1
```

Therefore, click "Column Up", and then "Change pattern":

Click "Continue":

Click "OLS estimation" in order to get initial estimates. Since there are parameter restrictions imposed on the VAR, OLS is no longer efficient. Therefore, after OLS is done, the button "SUR estimation" becomes enabled. SUR stands for Seemly Unrelated Regression. SUR estimation of the restricted VAR(1) model Xt = a0 + A1Xt-1 + Ut involves the following steps:

1. Estimate the variance matrix S of Ut on the basis of the vector of OLS residuals. Denote this variance matrix estimate by Sn,0.
2. Maximize the likelihood function L(a0,A1,S) to a0 and A1, with S replaced by Sn,0.
3. Re-estimate S on the basis of the new estimates of a0 and A1. Denote this estimate by Sn,1.

The new estimates of a0 and A1 are efficient, in the sense that the limiting normal distribution of the parameters therein is the same as for the maximum likelihood estimators, but the estimator Sn,1 may not yet be efficient. Therefore, I recommend that you repeat steps 2 and 3, at each iteration j replacing Sn,j-1 with Sn,j, until these matrices converge:

In each SUR estimation step j the matrix Sn,j is decomposed as Sn,j = Ln,jLn,j', where Ln,j is a lower triangular matrix, and the maximum absolute deviation of the non-zero elements of Ln,j from the corresponding elements of Ln,j-1 is computed. If the latter gets small enough, continue with full information maximum likelihood estimation. Thus, click "FIML estimation", and then "Continue":
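The iterated SUR steps can be sketched for the restricted bivariate VAR(1) above. This is a schematic feasible-GLS implementation under hypothetical parameter values and a simulated sample, not a reproduction of EasyReg's internal routine:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate the restricted VAR(1): lagged X2 is excluded from the X1
# equation (the Granger-causality restriction); illustrative values.
A1 = np.array([[0.5, 0.0],
               [0.3, 0.4]])
S_true = np.array([[1.0, 0.5],
                   [0.5, 1.0]])
Lt = np.linalg.cholesky(S_true)
n = 2000
X = np.zeros((n, 2))
for t in range(1, n):
    X[t] = A1 @ X[t - 1] + Lt @ rng.standard_normal(2)

Y1, Y2 = X[1:, 0], X[1:, 1]
ones = np.ones(n - 1)
Z1 = np.column_stack([ones, X[:-1, 0]])              # regressors, equation 1
Z2 = np.column_stack([ones, X[:-1, 0], X[:-1, 1]])   # regressors, equation 2

def sur_step(Sigma):
    """One feasible GLS (SUR) step given a variance matrix estimate Sigma."""
    W = np.linalg.inv(Sigma)
    # Normal equations of the stacked system with block-diagonal regressors.
    A = np.block([[W[0, 0] * Z1.T @ Z1, W[0, 1] * Z1.T @ Z2],
                  [W[1, 0] * Z2.T @ Z1, W[1, 1] * Z2.T @ Z2]])
    b = np.concatenate([W[0, 0] * Z1.T @ Y1 + W[0, 1] * Z1.T @ Y2,
                        W[1, 0] * Z2.T @ Y1 + W[1, 1] * Z2.T @ Y2])
    theta = np.linalg.solve(A, b)
    U = np.column_stack([Y1 - Z1 @ theta[:2], Y2 - Z2 @ theta[2:]])
    return theta, (U.T @ U) / len(U)

# Sigma = I makes the first step equation-by-equation OLS (step 1);
# then iterate steps 2 and 3 until the Sigma estimates converge.
theta, Sigma = sur_step(np.eye(2))
for _ in range(10):
    theta, Sigma = sur_step(Sigma)
```

Here theta stacks (intercept, own lag) for equation 1 and (intercept, lag of X1, lag of X2) for equation 2, so with these simulated data it should settle near (0, 0.5, 0, 0.3, 0.4).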

You can now conduct joint significance tests.

Click "Continue", choose "Non-structural VAR", and innovation response horizon = 10:

Since DIF1[LN[Income Sweden]] does not Granger-cause DIF1[LN[nominal GDP]], the innovation responses involved are all zero.

On the other hand, a unit shock in the innovation of DIF1[LN[nominal GDP]] has a significant positive impact on DIF1[LN[Income Sweden]], at least in the first three years after the shock in the innovation of DIF1[LN[nominal GDP]].

### VAR models with exogenous variables: VARX models

In principle it is possible to include exogenous variables in a VAR model, in addition to deterministic variables such as trends, and conduct innovation response analysis via EasyReg. How to do that is explained in the PDF file VARX.PDF.