
A typical example of a model for which Two-Stage Least Squares (TSLS) is applicable is the first equation of the following system of equations:

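$$
\begin{aligned}
y_{1,t} &= \alpha' X_{1,t} + \beta' Y_{2,t} + u_{1,t}, \\
Y_{2,t} &= \Gamma_1 X_{1,t} + \Gamma_2 X_{2,t} + U_{2,t},
\end{aligned}
\qquad t = 1, 2, 3, \ldots, n,
$$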

where
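$y_{1,t}$ is the dependent variable, $Y_{2,t}$ is a vector of endogenous explanatory variables, and $X_{1,t}$ and $X_{2,t}$ are vectors of exogenous variables; $X_{2,t}$ is excluded from the equation for $y_{1,t}$ and supplies the instruments for $Y_{2,t}$.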

The equation for *y*_{1,t}

where
*Y _{t}* = (

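If the error term $u_{1,t}$ is correlated with $U_{2,t}$, then $E[u_{1,t} Y_{2,t}'] \neq 0'$, so that $Y_{2,t}$ is endogenous in the equation for $y_{1,t}$ and OLS estimation of that equation is inconsistent.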
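The step from correlated errors to endogeneity can be spelled out by substituting the equation for $Y_{2,t}$ into the moment. This is a short sketch that assumes, as is standard for exogenous variables although not stated explicitly above, that $E[u_{1,t} X_{1,t}'] = 0'$ and $E[u_{1,t} X_{2,t}'] = 0'$, and that $\Gamma_1$ and $\Gamma_2$ are nonrandom:

$$
\begin{aligned}
E[u_{1,t} Y_{2,t}'] &= E[u_{1,t} X_{1,t}']\,\Gamma_1' + E[u_{1,t} X_{2,t}']\,\Gamma_2' + E[u_{1,t} U_{2,t}'] \\
&= E[u_{1,t} U_{2,t}'].
\end{aligned}
$$

Hence, under these assumptions, the regressor vector $Y_{2,t}$ is correlated with the error $u_{1,t}$ exactly when $u_{1,t}$ and $U_{2,t}$ are correlated.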

The model

$$
y_{1,t} = \alpha' X_{1,t} + \beta' Y_{2,t} + u_{1,t}
$$

can be written compactly as

$$
y = X\theta + u,
$$

where $y$ is the vector of stacked variables $y_{1,t}$ for $t = 1, 2, \ldots, n$, $X$ is the matrix with rows $(X_{1,t}', Y_{2,t}')$ for $t = 1, 2, \ldots, n$, $u$ is the vector of stacked errors $u_{1,t}$ for $t = 1, 2, \ldots, n$, and $\theta = (\alpha', \beta')'$. Moreover, let $Z$ be the matrix with rows $(X_{1,t}', X_{2,t}')$ for $t = 1, 2, \ldots, n$.

As motivated in the previous section, the error vector $u$ satisfies

$$
E[X'u] \neq 0, \qquad E[Z'u] = 0.
$$

Due to the latter, and some further regularity conditions, the parameter vector $\theta$ can be estimated consistently, with an asymptotically normal limiting distribution, by the Instrumental Variables (IV) approach, using $Z$ as the matrix of instrumental variables. The IV approach is a special case of the Method of Moments (MM) approach. As explained in my lecture notes on the Method of Moments (and in most intermediate econometrics textbooks as well), the IV estimator $\theta_n$ of $\theta$ takes the form

$$
\theta_n = (X' P_Z X)^{-1} X' P_Z y,
$$

where

$$
P_Z = Z (Z'Z)^{-1} Z'.
$$

Of course, we have to require that the matrix $X' P_Z X$ is nonsingular. A necessary condition for this is that the number of variables in $X_{2,t}$ is at least the number of variables in $Y_{2,t}$.
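The closed-form IV estimator $(X' P_Z X)^{-1} X' P_Z y$, with $P_Z = Z(Z'Z)^{-1}Z'$, can be sketched in a few lines of NumPy. The function name `iv_estimator`, the seed, and the small simulated data set are hypothetical choices made for illustration only; they are not part of the guided tour or of EasyReg. The sketch also checks the nonsingularity requirement by testing whether the projected regressors have full column rank.

```python
import numpy as np

def iv_estimator(y, X, Z):
    """Return (X' P_Z X)^{-1} X' P_Z y, with P_Z = Z (Z'Z)^{-1} Z'."""
    # P_Z X is obtained by least squares, which avoids forming the n x n matrix P_Z.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # X' P_Z X is nonsingular only if P_Z X has full column rank.
    if np.linalg.matrix_rank(X_hat) < X.shape[1]:
        raise ValueError("X' P_Z X is singular: too few usable instruments")
    # P_Z is symmetric, so X_hat' X = X' P_Z X and X_hat' y = X' P_Z y.
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ y)

# Hypothetical illustration data (not the tour's TSLSDATA.CSV).
rng = np.random.default_rng(0)
n = 2000
x1, x2, u1, u2 = rng.normal(size=(4, n))
y2 = x1 + x2 + u1 + u2          # endogenous: shares u1 with the first equation
y1 = x1 + y2 + u1               # first equation with alpha = 1 and beta = 1
X = np.column_stack([x1, y2])   # regressors of the first equation
Z = np.column_stack([x1, x2])   # instruments: all exogenous variables

theta_n = iv_estimator(y1, X, Z)
print(theta_n)                  # both entries should be close to 1
```

Dropping `x2` from `Z` leaves one instrument for two regressors, so the function raises a `ValueError`. That is the order condition on the dimensions of $X_{2,t}$ and $Y_{2,t}$ failing.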

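Under regularity conditions, the IV estimator $\theta_n$ is asymptotically normally distributed:

$$
\sqrt{n}\,(\theta_n - \theta) \to N\!\left(0, \sigma^2 Q^{-1}\right)
$$

in distribution, where $\sigma^2$ is the variance of $u_{1,t}$ and

$$
Q = \operatorname*{plim}_{n \to \infty} \frac{1}{n} X' P_Z X.
$$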
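In practice the asymptotic normality result is used to attach standard errors to the estimate. The following is a minimal sketch, assuming homoskedastic errors, in which the variance of the estimator is approximated by $\hat\sigma^2 (X' P_Z X)^{-1}$, with $\hat\sigma^2$ the average squared residual. The function name `tsls_with_se`, the divisor $n$ (rather than a degrees-of-freedom correction), and the simulated data are assumptions made for illustration; EasyReg's own output may use different conventions.

```python
import numpy as np

def tsls_with_se(y, X, Z):
    """IV/TSLS estimate and standard errors, assuming homoskedastic errors."""
    n = X.shape[0]
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]   # P_Z X
    A = X_hat.T @ X                                     # X' P_Z X
    theta = np.linalg.solve(A, X_hat.T @ y)
    # Residuals use the original X, not the projected P_Z X.
    resid = y - X @ theta
    sigma2 = resid @ resid / n                          # estimate of the error variance
    cov = sigma2 * np.linalg.inv(A)                     # approximate covariance of theta
    return theta, np.sqrt(np.diag(cov))

# Hypothetical illustration data (not the tour's TSLSDATA.CSV).
rng = np.random.default_rng(7)
n = 1000
x1, x2, u1, u2 = rng.normal(size=(4, n))
y2 = x1 + x2 + u1 + u2
y1 = x1 + y2 + u1
X = np.column_stack([x1, y2])
Z = np.column_stack([x1, x2])

theta_n, se = tsls_with_se(y1, X, Z)
for name, est, s in zip(["x1", "y2"], theta_n, se):
    print(f"{name}: estimate {est:.3f}, standard error {s:.3f}")
```

A common mistake in hand-rolled two-step code is to compute the residuals from the second-stage regression on $P_Z X$; that gives an incorrect variance estimate, which is why the sketch recomputes them with the original $X$.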

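The IV estimator $\theta_n$ is also called the TSLS estimator because it can be derived alternatively in the following two steps.

(1) Linearly project the columns of the matrix $X$ on the space spanned by the columns of the matrix $Z$. The linear projection involved is the matrix $P_Z X$, where $P_Z = Z (Z'Z)^{-1} Z'$.

(2) Regress $y$ on $P_Z X$ by OLS. Because $P_Z$ is symmetric and idempotent, the resulting estimator is $(X' P_Z' P_Z X)^{-1} X' P_Z' y = (X' P_Z X)^{-1} X' P_Z y$, which is the IV estimator $\theta_n$.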
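The equivalence of the two-step procedure and the closed-form IV estimator can be checked numerically. The sketch below uses a small hypothetical data set (the seed and variable names are illustrative, not from the tour) and forms $P_Z$ explicitly, which is only sensible for small $n$ because $P_Z$ is an $n \times n$ matrix.

```python
import numpy as np

# Hypothetical illustration data (not the tour's TSLSDATA.CSV).
rng = np.random.default_rng(1)
n = 400
x1, x2, u1, u2 = rng.normal(size=(4, n))
y2 = x1 + x2 + u1 + u2
y1 = x1 + y2 + u1
X = np.column_stack([x1, y2])
Z = np.column_stack([x1, x2])

# Step (1): project the columns of X on the column space of Z.
P_Z = Z @ np.linalg.inv(Z.T @ Z) @ Z.T
X_hat = P_Z @ X

# Step (2): regress y on the projected regressors by OLS.
theta_two_step = np.linalg.lstsq(X_hat, y1, rcond=None)[0]

# Closed-form IV estimator for comparison.
theta_closed = np.linalg.solve(X.T @ P_Z @ X, X.T @ P_Z @ y1)

print(theta_two_step)
print(theta_closed)
print(np.allclose(theta_two_step, theta_closed))
```

Note that the exogenous column `x1` lies in the column space of `Z`, so the projection leaves it unchanged; only the endogenous column `y2` is replaced by its fitted values.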

The data have been generated artificially, as

Y1 = Y2 + X1 + X2 + U1

Y2 = X1 + X2 + X3 + X4 + U1 + U2

where X1, X2, X3, and X4 have been drawn independently from the N(0,2) distribution, and U1 and U2 have been drawn independently from the N(0,1) distribution. In total, 500 observations on Y1, Y2, X1, X2, X3, and X4 have been generated this way. The data are available as the file TSLSDATA.CSV in Excel CSV format (US number setting).
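A data set of the same form can be simulated as follows. This is a sketch under two assumptions: N(0,2) is read as mean 0 and variance 2 (the text does not say whether 2 is the variance or the standard deviation), and the seed is arbitrary, so the numbers will not reproduce TSLSDATA.CSV. The two sample correlations printed at the end illustrate why Y2 is endogenous while X3 is a valid instrument.

```python
import numpy as np

rng = np.random.default_rng(12345)   # arbitrary seed, not the one behind TSLSDATA.CSV
n = 500

# Assumption: N(0,2) means variance 2, so the standard deviation is sqrt(2).
X1, X2, X3, X4 = rng.normal(0.0, np.sqrt(2.0), size=(4, n))
U1, U2 = rng.normal(0.0, 1.0, size=(2, n))

Y2 = X1 + X2 + X3 + X4 + U1 + U2
Y1 = Y2 + X1 + X2 + U1

# Y2 contains U1, so it is correlated with the error of the Y1 equation;
# X3 is drawn independently of U1, so its correlation with U1 is near zero.
corr_Y2_U1 = np.corrcoef(Y2, U1)[0, 1]
corr_X3_U1 = np.corrcoef(X3, U1)[0, 1]
print(f"corr(Y2, U1) = {corr_Y2_U1:.3f}")
print(f"corr(X3, U1) = {corr_X3_U1:.3f}")
```

Under the variance-2 reading, the population correlation between Y2 and U1 is $1/\sqrt{10} \approx 0.32$, since Y2 has variance $4 \cdot 2 + 2 = 10$ and covariance 1 with U1.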

The procedure for selecting the variables in the TSLS model is similar to that for OLS, except that the instrumental variables now also have to be selected as X variables:

Next, you have to indicate which explanatory variables are endogenous variables. In this case the only endogenous X variable is Y2:

Now you have to remove at least as many exogenous variables from the list as there are endogenous X variables. The variables to be removed are X3 and X4:

Once you click "Exogenous variables OK" the window changes to:

Click "Continue". Then the output appears:

Recall that the actual data generating process is

Y1 = Y2 + X1 + X2 + U1

so the true coefficients of Y2, X1, and X2 are all equal to 1.
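To see what TSLS buys relative to OLS for this data generating process, both estimators can be compared on simulated data. The sketch below is not EasyReg output: it uses freshly simulated data with an arbitrary seed, reads N(0,2) as variance 2 (an assumption), and omits an intercept because the data generating process has none.

```python
import numpy as np

rng = np.random.default_rng(2024)    # arbitrary seed, not the one behind TSLSDATA.CSV
n = 500
# Assumption: N(0,2) means variance 2.
X1, X2, X3, X4 = rng.normal(0.0, np.sqrt(2.0), size=(4, n))
U1, U2 = rng.normal(size=(2, n))
Y2 = X1 + X2 + X3 + X4 + U1 + U2
Y1 = Y2 + X1 + X2 + U1

X = np.column_stack([Y2, X1, X2])        # regressors of the Y1 equation
Z = np.column_stack([X1, X2, X3, X4])    # instruments: all exogenous variables

# OLS ignores the correlation between Y2 and U1.
ols = np.linalg.lstsq(X, Y1, rcond=None)[0]

# TSLS: project X on Z, then regress Y1 on the projection.
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
tsls = np.linalg.lstsq(X_hat, Y1, rcond=None)[0]

print("true:", [1.0, 1.0, 1.0])
print("OLS :", ols)
print("TSLS:", tsls)
```

Under the variance-2 reading, the OLS coefficient of Y2 converges to $7/6$ rather than 1, because after partialling out X1 and X2 the remaining part of Y2 has variance 6 and covariance 1 with U1. The TSLS estimates should instead be close to 1 for all three coefficients.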

Click "Continue". Then the NEXTMENU window appears, which provides further options. These options have already been discussed in the guided tour on OLS estimation and will therefore not be discussed again.