## Guided tour on linear General Method of Moments

### Introduction

Module GMM (General Method of Moments) estimates a system of linear (regression) equations with possibly common parameters. GMM includes seemingly unrelated regression (SUR) estimation, and estimation of fixed effect or pooled panel data models. The equations involved may have common or different X variables, and common or different parameters. If the parameters and some of the X variables are different, then the model is a SUR model. GMM is more general than SUR in that it allows for a wider set of instrumental variables: SUR estimation only employs the X variables in each equation as instruments.

EasyReg treats panel data either as cross-section data, where the same variable in different time periods is treated as a different variable, or as time series data. Which of the two applies depends on whether you have large N/small T or small N/large T panel data, where N is the number of cross-sections and T is the length of the time series involved.

For example, if N > T, so that your data file is set up as cross-section data, and you have K variables for each cross-section j (j=1,..,N) and time period t (t=1,..,T), the EasyReg space delimited data file format is:

```
L m (L = K*T, m = missing value code)
Name(1,1)
........
Name(1,T)
........
Name(K,1)
........
Name(K,T)
x(1,1,1) ...... x(1,1,T) ...... x(1,K,1) ...... x(1,K,T)
........................................................
x(N,1,1) ...... x(N,1,T) ...... x(N,K,1) ...... x(N,K,T)
```
and the Excel CSV format is
```
"Name(1,1)",....,"Name(1,T)",....,"Name(K,1)",....,"Name(K,T)"
x(1,1,1),....,x(1,1,T),....,x(1,K,1),....,x(1,K,T)
..................................................
x(N,1,1),....,x(N,1,T),....,x(N,K,1),....,x(N,K,T)
```
if Windows uses a dot as decimal delimiter, or
```
"Name(1,1)";....;"Name(1,T)";....;"Name(K,1)";....;"Name(K,T)"
x(1,1,1);....;x(1,1,T);....;x(1,K,1);....;x(1,K,T)
..................................................
x(N,1,1);....;x(N,1,T);....;x(N,K,1);....;x(N,K,T)
```
if Windows uses a comma as decimal delimiter, where x(i,j,t) is the data entry of variable j for cross-section i and time t, and Name(j,t) is the name of variable j for time t.

If T > N, so that your data file is set up as time series data, the EasyReg space delimited data file format is:

```
L m (L = K*N, m = missing value code)
Name(1,1)
........
Name(N,1)
........
Name(1,K)
........
Name(N,K)
x(1,1,1) ...... x(N,1,1) ...... x(1,K,1) ...... x(N,K,1)
.......................................................
x(1,1,T) ...... x(N,1,T) ...... x(1,K,T) ...... x(N,K,T)
```
where again x(i,j,t) is the data entry of variable j for cross-section i and time t, and Name(i,j) is now the name of variable j for cross-section i. Data in Excel format has to be arranged similarly.
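The two column orderings described above can be sketched in Python as follows. This is only an illustration of the layouts, not a tool that ships with EasyReg: the function name `easyreg_text_lines`, the naming scheme for the column labels, and the missing value code `-9999` are all assumptions made for the example.

```python
def easyreg_text_lines(x, var_names, layout, missing_code=-9999):
    """Build the lines of a space delimited data file in one of the two layouts.

    x[i][j][t] is the value of variable j for cross-section i at time t
    (0-based indices). var_names holds K hypothetical base names.
    """
    N, K, T = len(x), len(x[0]), len(x[0][0])
    if layout == "cross-section":      # large N / small T: one row per cross-section
        cols = [(j, t) for j in range(K) for t in range(T)]
        names = [f"{var_names[j]}({t + 1})" for j, t in cols]
        rows = [[x[i][j][t] for j, t in cols] for i in range(N)]
    elif layout == "time-series":      # small N / large T: one row per time period
        cols = [(j, i) for j in range(K) for i in range(N)]
        names = [f"{var_names[j]}[{i + 1}]" for j, i in cols]
        rows = [[x[i][j][t] for j, i in cols] for t in range(T)]
    else:
        raise ValueError("layout must be 'cross-section' or 'time-series'")
    header = f"{len(cols)} {missing_code}"   # first line: L and m
    data = [" ".join(str(value) for value in row) for row in rows]
    return [header] + names + data


# Tiny example: N = 2 cross-sections, K = 2 variables, T = 3 periods.
# The value 100*i + 10*j + t (1-based) encodes its own indices.
demo = [[[100 * (i + 1) + 10 * (j + 1) + (t + 1) for t in range(3)]
         for j in range(2)] for i in range(2)]
cross_lines = easyreg_text_lines(demo, ["y", "x"], "cross-section")
time_lines = easyreg_text_lines(demo, ["y", "x"], "time-series")
print("\n".join(cross_lines))
```

In the cross-section layout each variable contributes T adjacent columns, whereas in the time series layout each variable contributes N adjacent columns, so the first header number is K*T in the first case and K*N in the second.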

Note that missing values in Excel files take the form of a blank space. See the guided tour on Excel files in CSV format.

In the case of fixed effect panel data models you have to eliminate the fixed effects by taking first differences of the Y and X variables in the model. For large N/small T panel data, use the 'linear combination' option in the transformation menu; for small N/large T panel data, use the difference transformation in the time series transformation menu. Then specify a GMM model of the form:

(1)     y(j,t)-y(j,t-1) = b(1)(x(j,1,t)-x(j,1,t-1)) + ... + b(k)(x(j,k,t)-x(j,k,t-1)) + v(j,t)-v(j,t-1),

where the y(j,t)'s are the dependent variables, the x(j,i,t)'s (i = 1,..,k) are the predetermined variables (possibly including lagged dependent variables), and the v(j,t)'s are the model errors. The equations involved are stacked for t=2,..,T in the case of large N/small T panel data, yielding a system of T-1 equations, and stacked for j=1,..,N in the case of small N/large T panel data, yielding a system of N equations. If the original fixed effect model has a time trend, you should include a common intercept b(0) in these equations. The errors v(j,t) are assumed to be i.i.d. for j = 1,..,N and t = 1,..,T. In the case of pooled panel data models there is no fixed effect, so that there is no need to difference the variables. Apart from this, the GMM model involved is similar to the fixed effect panel data model.
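The effect of first differencing in equation (1) can be checked numerically. The following minimal sketch uses a single regressor and one cross-section unit; the helper `first_differences`, the seed, and the scale of the fixed effect are assumptions chosen for illustration. It verifies that the differenced dependent variable equals b times the differenced regressor plus the differenced error, with the fixed effect u gone.

```python
import random


def first_differences(series):
    """Return [s(2)-s(1), ..., s(T)-s(T-1)] for a list s(1), ..., s(T)."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


rng = random.Random(0)              # arbitrary seed, for reproducibility only
T, b = 4, 1.0
u = 10.0 * rng.gauss(0, 1)          # a deliberately large fixed effect
x = [rng.gauss(0, 1) for _ in range(T)]
v = [rng.gauss(0, 1) for _ in range(T)]
y = [b * x[t] + u + v[t] for t in range(T)]   # model in levels

dy = first_differences(y)
dx = first_differences(x)
dv = first_differences(v)

# If u has been eliminated, dy - (b*dx + dv) is zero up to rounding error.
gap = [dy[t] - (b * dx[t] + dv[t]) for t in range(T - 1)]
print(max(abs(g) for g in gap))
```

Note that differencing a series of length T leaves T-1 observations, which is why the large N/small T system consists of T-1 equations.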

If all the parameters are common in each equation, EasyReg will ask you whether the model is a fixed effect panel data model, a pooled panel data model, or neither. In the case of panel data models, EasyReg will use the typical structure of the variance matrix of the errors in conducting GMM.

I strongly recommend that you read my lecture note on the Method of Moments first before you use this GMM module!

### A fixed effect static panel data example

I will now demonstrate how to estimate a panel data model, on the basis of artificial data generated by the model

y(i,t) = b.x(i,t) + u(i) + v(i,t), t = 1, ...,T, i = 1,...,N,

where:

• x(i,t) is an exogenous variable
• u(i) is the fixed or random effect
• v(i,t) is the error term
• T = 3
• N = 500
• b = 1

The x(i,t)'s, v(i,t)'s and u(i)'s have been drawn from the standard normal distribution. The data file involved, GMMDATA1.TXT, in EasyReg space delimited text format, is included in this guided tour, so that you can replicate this example. The same data set is also included in Excel (US style) CSV format, as GMMDATA1.CSV.

Once you have imported this data file into EasyReg, click Menu -> Multiple equation models -> Linear general method of moments:

In order to get rid of the fixed effect, take all the variables in first differences. This has already been done here; otherwise, use the transformation module via the EasyReg main window: click Menu -> Input -> Transform variables. Then the model becomes

y(i,t)-y(i,t-1) = b(x(i,t)-x(i,t-1)) + v(i,t)-v(i,t-1), t = 2,3, i = 1,...,500.

We need to choose all the variables involved in the GMM model. These variables are the dependent variables, y(i,2)-y(i,1) and y(i,3)-y(i,2), the independent variables, x(i,2)-x(i,1) and x(i,3)-x(i,2), and the instrumental variables. In principle we could use the independent variables as instruments, but then the system is just-identified. If we choose x(i,1) and x(i,2) separately as instruments for the equation y(i,2)-y(i,1), and x(i,2) and x(i,3) separately as instruments for the equation y(i,3)-y(i,2), the system becomes over-identified. The latter is preferred, because (loosely speaking) the more moment restrictions, the better the efficiency of the GMM estimators. Moreover, in the over-identified case we can test the validity of the over-identifying restrictions, which is also a test of the validity of the model. Thus, click

• y(i,2)-y(i,1)
• y(i,3)-y(i,2)
• x(i,2)-x(i,1)
• x(i,3)-x(i,2)
• x(i,1)
• x(i,2)
• x(i,3)

and click "Selection OK":

We are not going to choose a subset of observations. Thus, click "No", and then "Continue":

Double-click the dependent variables y(i,2)-y(i,1) and y(i,3)-y(i,2), and click "Continue":

The purpose of this window is to check whether the dependent variables are continuously distributed. If not, you will get a warning message (which you can ignore if you wish). Click "Continue":

EasyReg automatically selects the remaining variables as the independent variables and instruments. Thus click "Selection OK":

This window allows you to include an intercept. Since the data has been declared as cross-section data, the options "Time trend" and "Seasonal dummies" have been disabled. The model variables are in first differences, and therefore there is no intercept. Thus, click "Continue":

The purpose of this window is to check your choices of the model variables. Click "Continue":

The independent variables x(i,2) - x(i,1) and x(i,3) - x(i,2) are selected by setting the coefficients of the other variables to zero. When done, click "Zeros OK". Then the window changes to:

The independent variables x(i,2) - x(i,1) and x(i,3) - x(i,2) have a common coefficient. Therefore, double-click them, and click "Done". If you click the "Oops!" button, this window will be reloaded. The "Done" button will change to a "Continue" button, which you have to click again:

The default selection of the instruments is the same as the selection of the independent variables. In this case I have removed the asterisk * in front of Z(1,1) = x(i,1), Z(1,2) = x(i,2), Z(2,1) = x(i,2), Z(2,2) = x(i,3), and added an asterisk to the differenced variables, by double-clicking.

Next, click "Instruments OK":

Click "Check identification". Then the window changes to:

In order to conduct optimal GMM estimation and compute the variance matrix of the parameters, EasyReg needs to know whether the errors are in first differences or not, and whether the model is a panel data model. Thus, click the second option, and click "Option OK":

Click "Continue" to conduct GMM estimation, and then scroll down to the estimation and test results:

As you see, the estimate of b (the b(1)) is very close to the true value 1. The Wald test of the over-identifying restrictions does not reject the null hypothesis that the instruments are valid.
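For readers who want to see the mechanics outside EasyReg, here is a minimal sketch of a textbook two-step GMM estimator for this static example. It is not EasyReg's algorithm: EasyReg exploits the panel structure of the error variance matrix, whereas this sketch estimates the moment variance matrix directly. It also draws its own artificial data rather than reading GMMDATA1.TXT, and it reports an over-identification statistic of the Hansen J type, which need not coincide with the Wald form that EasyReg prints. All function names and the seed are assumptions.

```python
import random


def solve(A, rhs):
    """Solve A z = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def simulate(N=500, T=3, b=1.0, seed=12345):
    """Draw y(i,t) = b*x(i,t) + u(i) + v(i,t) with standard normal x, u, v."""
    rng = random.Random(seed)
    panel = []
    for _ in range(N):
        u = rng.gauss(0, 1)
        x = [rng.gauss(0, 1) for _ in range(T)]
        y = [b * xt + u + rng.gauss(0, 1) for xt in x]
        panel.append((y, x))
    return panel


def moment_parts(y, x):
    """Per-unit products z*dy and z*dx for the four moment conditions."""
    dy2, dy3 = y[1] - y[0], y[2] - y[1]
    dx2, dx3 = x[1] - x[0], x[2] - x[1]
    z = [x[0], x[1], x[1], x[2]]   # eq. t=2 uses x(i,1), x(i,2); eq. t=3 uses x(i,2), x(i,3)
    dy = [dy2, dy2, dy3, dy3]
    dx = [dx2, dx2, dx3, dx3]
    return [zi * d for zi, d in zip(z, dy)], [zi * d for zi, d in zip(z, dx)]


def two_step_gmm(panel):
    N = len(panel)
    parts = [moment_parts(y, x) for y, x in panel]
    c = [sum(p[0][m] for p in parts) / N for m in range(4)]   # mean of z*dy
    a = [sum(p[1][m] for p in parts) / N for m in range(4)]   # mean of z*dx
    b1 = dot(a, c) / dot(a, a)                 # step 1: identity weight matrix
    # Step 2: weight by the inverse of the moment variance estimated at b1.
    g = [[zy - b1 * zx for zy, zx in zip(p[0], p[1])] for p in parts]
    S = [[sum(gi[r] * gi[s] for gi in g) / N for s in range(4)] for r in range(4)]
    Sa = solve(S, a)
    b2 = dot(Sa, c) / dot(Sa, a)
    gbar = [cm - b2 * am for cm, am in zip(c, a)]
    j_stat = N * dot(gbar, solve(S, gbar))     # 4 moments - 1 parameter = 3 d.f.
    return b2, j_stat


b_hat, j_stat = two_step_gmm(simulate())
print("b estimate:", b_hat, " J statistic:", j_stat)
```

With four moment conditions and one parameter there are three over-identifying restrictions, so under the null hypothesis the J statistic is approximately chi-square distributed with 3 degrees of freedom.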

At this point the results are not yet written to file OUTPUT.TXT. If you click "Cancel" you will jump back to the EasyReg main window, and if you click "Oops!" you will jump back to the first GMM window.

Click "Continue" again. Then module NEXTMENU will be activated, which provides the option to test linear restrictions on the parameters:

The option "Wald test of linear parameter restrictions" is the same as for OLS, and will therefore not be discussed here. See the guided tour on OLS estimation.

### A fixed effect dynamic panel data example

Next, I will demonstrate how to estimate a dynamic panel data model, again on the basis of artificial data generated by the model

y(i,t) = b(1).y(i,t-1) + b(2).x(i,1,t) + b(3).x(i,2,t) + u(i) + v(i,t),
t = 2, ...,T, i = 1,...,N,

where:

• x(i,1,t) and x(i,2,t) are exogenous variables
• u(i) is the fixed or random effect
• v(i,t) is the error term
• T = 4
• N = 500
• b(1) = .5, b(2) = b(3) = 1

The x(i,1,t)'s, x(i,2,t)'s, v(i,t)'s and u(i)'s have again been drawn from the standard normal distribution. The data file involved, GMMDATA2.TXT, in EasyReg space delimited text format, is included in this guided tour, so that you can replicate this example. The same data set is also included in Excel (US style) CSV format, as GMMDATA2.CSV.

In order to get rid of the fixed effect u(i), again write the model in first differences (with the observation index i suppressed):

y(t)-y(t-1) = b(1)(y(t-1)-y(t-2)) + b(2)(x(1,t)-x(1,t-1)) + b(3)(x(2,t)-x(2,t-1)) + v(t)-v(t-1),
t = 3,4.

Thus, we have a system of two equations with common parameters:

1. y(3)-y(2) = b(1)(y(2)-y(1)) + b(2)(x(1,3)-x(1,2)) + b(3)(x(2,3)-x(2,2)) + v(3)-v(2),
2. y(4)-y(3) = b(1)(y(3)-y(2)) + b(2)(x(1,4)-x(1,3)) + b(3)(x(2,4)-x(2,3)) + v(4)-v(3).

As explained in my lecture note on the Method of Moments, the instrumental variables for equations 1 and 2 are, respectively,

1. y(1), x(1,1), x(1,2), x(1,3), x(2,1), x(2,2), x(2,3)
2. y(1), y(2), x(1,1), x(1,2), x(1,3), x(1,4), x(2,1), x(2,2), x(2,3), x(2,4)

because the instrumental variables for equation 1 are uncorrelated with v(3)-v(2) and (due to the feedback) correlated with the other variables on the right-hand side of equation 1, and similarly the instrumental variables for equation 2 are uncorrelated with v(4)-v(3) and correlated with the other right-hand side variables.
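The pattern in the two instrument lists can be written as a small rule: for the differenced equation with left-hand side y(t)-y(t-1), use y(1),..,y(t-2) together with x(k,1),..,x(k,t) for each exogenous variable k. The sketch below encodes that rule and counts the moment conditions. The rule is inferred from the two lists above and the function name `instruments_for_period` is hypothetical; consult the lecture note before applying it to other values of T.

```python
def instruments_for_period(t, n_exog=2):
    """Instrument names for the equation with left-hand side y(t)-y(t-1)."""
    lagged_y = [f"y({s})" for s in range(1, t - 1)]           # y(1), ..., y(t-2)
    exog = [f"x({k},{s})"
            for k in range(1, n_exog + 1)
            for s in range(1, t + 1)]                         # x(k,1), ..., x(k,t)
    return lagged_y + exog


T = 4
by_equation = {t: instruments_for_period(t) for t in range(3, T + 1)}
n_moments = sum(len(z) for z in by_equation.values())
n_params = 3                                                  # b(1), b(2), b(3)
n_overid = n_moments - n_params

for t, names in by_equation.items():
    print(f"t = {t}:", ", ".join(names))
print("over-identifying restrictions:", n_overid)
```

For T = 4 this gives 7 instruments for equation 1 and 10 for equation 2, so the system has 17 moment conditions for 3 parameters.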

Except for the different choice of instruments (compared with the static case), the modelling steps in EasyReg are now the same as before. Therefore, I will only show the model specification and estimation results windows:

If you click "Cancel" you will jump back to the EasyReg main window, and if you click "Oops!" you will jump back to the first GMM window. In both cases the estimation results will not be written to file OUTPUT.TXT.

Click "Continue" again. Then module NEXTMENU will be activated, which provides the option to test linear restrictions on the parameters, and to write the estimation and test results to OUTPUT.TXT.