## Guided tour on Ordinary Least Squares (OLS) estimation and extensions


In this guided tour I will explain in detail how to conduct OLS estimation, and in particular how to interpret the estimation and test results.

I have gotten many emails from EasyReg users with limited or even no econometric knowledge, asking me elementary econometric questions for which the answers can be found in any decent undergraduate econometrics textbook. This guided tour is intended to answer all the elementary econometric questions you may have. If you still don't understand linear regression, please read a good undergraduate econometrics textbook first rather than asking me! I recommend the following reading list.

Introductory econometrics:

• Ashenfelter, O., P.B. Levine, and D.J. Zimmerman, Statistics and Econometrics: Methods and Applications (John Wiley)
• Baltagi, B.H., Econometrics (Springer-Verlag)
• Bierens, H.J., Lecture notes
• Wooldridge, J.M., Introductory Econometrics: A Modern Approach (South-Western College Publishing)

Intermediate econometrics (requires knowledge of matrix algebra):

• Enders, W., Applied Econometric Time Series (John Wiley)
• Greene, W.H., Econometric Analysis (Prentice-Hall)
• Johnston, J. and J. DiNardo, Econometric Methods (McGraw-Hill)

### The classical linear regression model

The classical linear regression model takes the form

yj = β1x1,j + ... + βkxk,j + uj, j = 1,...,n,

where:

• The yj's are the dependent variables.
• The xi,j's are the independent (or explanatory) variables.
• The dependent and independent variables are independent across observations, i.e., they are assumed to come from a random sample.
• The uj's are the error terms, which are assumed to be N(0, σ²) distributed conditional on the xi,j's, with σ² constant (and finite).
• The βi's are the model parameters to be estimated, and
• n is the sample size.
If the model contains an intercept, then one of the xi,j 's is equal to 1, say the last one, xk,j:

yj = β1x1,j + ... + βk-1xk-1,j + βk + uj.

The parameter βk is called the intercept, and the parameters β1,...,βk-1 are called the slope parameters.

In general you should always include an intercept in your model, unless economic theory prescribes otherwise (which is rare). The reason is that

βk = E(yj) - β1E(x1,j) - ... - βk-1E(xk-1,j),

so that βk picks up the effect of the unconditional expectations of the model variables. There is usually no economic reason to believe that βk = 0. Therefore, in the discussion below I will focus on the model with an intercept, i.e., I will assume that xk,j = 1.

The model parameters βi are estimated by the values b1,...,bk that minimize the sum of squared residuals:

RSS = Σj (yj - b1x1,j - ... - bkxk,j)² = min over β1,...,βk of Σj (yj - β1x1,j - ... - βkxk,j)²,

where RSS stands for Residual Sum of Squares (also known as SSR = Sum of Squared Residuals). Moreover, the error variance σ² is estimated by

s² = RSS / (n - k).

The square root s of the estimated error variance is called the standard error of the residuals.
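These computations are easy to reproduce outside EasyReg. A minimal sketch in Python with numpy (simulated data; variable names are illustrative, not EasyReg's):

```python
import numpy as np

# Simulated data for the model y = beta1*x1 + beta2 (intercept) + u
rng = np.random.default_rng(0)
n, k = 100, 2
x1 = rng.normal(size=n)
X = np.column_stack([x1, np.ones(n)])      # last regressor = 1 (intercept)
y = 2.0 * x1 + 0.5 + rng.normal(size=n)

# OLS: the coefficients b minimize the residual sum of squares
b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b
RSS = resid @ resid                        # residual sum of squares
s2 = RSS / (n - k)                         # estimated error variance s^2
s = np.sqrt(s2)                            # standard error of the residuals
se = s * np.sqrt(np.diag(np.linalg.inv(X.T @ X)))   # standard errors of b
t_values = b / se                          # t-values for H0: beta_i = 0
```

Here `b` plays the role of the OLS estimates bi, `s` of the standard error of the residuals, and `t_values` of the t-values discussed below.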

Under some regularity conditions we have:

• bi is distributed N(βi, σi²).
• The variance σi² of bi can be estimated by si², say, where si is called the standard error of bi, such that (bi - βi) / si has a Student t distribution with n - k degrees of freedom.
• If the sample size n is large, then (bi - βi) / si is approximately N(0,1) distributed. However, how large n should be in order for the standard normal approximation to be accurate depends also on k and many other factors. Therefore, there is no exact threshold for n being large, but at least it can be said that n - k < 30 is too small for the normal approximation.
• The statistic ti = bi / si is called the t-value of bi, which has a t distribution with n - k degrees of freedom if βi = 0. If n is large this t distribution can be approximated by the N(0,1) distribution. Moreover, if n converges to infinity then ti converges to plus infinity if βi > 0, and to minus infinity if βi < 0.
• In EasyReg the p-value of bi is the probability P[|U| > |ti|], where U is N(0,1) distributed. Thus, the p-values are based on the standard normal approximation of the t distribution.
• (n - k)s²/σ² has a χ² (Chi-square) distribution with n - k degrees of freedom.
The t-value of bi is used to test the null hypothesis βi = 0 against the alternative hypothesis that either βi is unequal to zero (two-sided test), βi < 0 (left-sided test), or βi > 0 (right-sided test), as follows. First, choose a significance level, for example 5%. Let tn-k be a t-distributed random variable with n - k degrees of freedom.
• Two-sided test:
Look up in the table of the t distribution (which you can find in most statistics and econometrics textbooks) the value t5% for which P[|tn-k| > t5%] = 0.05. This value t5% is called the critical value. Alternatively, you can let EasyReg find t5%, via Tools > Distribution tools > Continuous distributions. If |ti| > t5% then you reject the null hypothesis that βi = 0, and if not you accept this hypothesis. In the latter case you accept that the corresponding regressor xi,j has no effect on yj.
• Left-sided test:
Look up in the table of the t distribution the critical value t5% for which P[tn-k < t5%] = 0.05. If ti < t5% then you reject the null hypothesis that βi = 0 in favor of the alternative hypothesis that βi < 0, and if not then you accept the null hypothesis. In the latter case you accept that the corresponding regressor xi,j has no effect on yj. Note that if you conduct the left-sided test you implicitly assume that βi > 0 is not possible.
• Right-sided test:
Look up in the table of the t distribution the critical value t5% for which P[tn-k > t5%] = 0.05. If ti > t5% then you reject the null hypothesis that βi = 0 in favor of the alternative hypothesis that βi > 0, and if not then you accept the null hypothesis. In the latter case you accept that the corresponding regressor xi,j has no effect on yj. Note that if you conduct the right-sided test you implicitly assume that βi < 0 is not possible.
The t-statistic ti can be rebuilt to test a null hypothesis other than βi = 0, as follows. First, observe that ti = bi / si implies that si = bi / ti. Moreover, we have seen that (bi - βi) / si has a Student t distribution with n - k degrees of freedom, and that (bi - βi) / si = ti(1 - βi / bi). Therefore, the null hypothesis βi = β*, say, with β* a given value, can be tested by the statistic ti* = ti(1 - β* / bi) in the same way as above.
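A quick numerical check of this equivalence (illustrative numbers, plain Python with numpy):

```python
import numpy as np

# Illustrative values: an OLS estimate and its standard error
b_i, s_i = 0.74, 0.068
t_i = b_i / s_i                       # t-value for H0: beta_i = 0

# Rebuilt statistic for H0: beta_i = b_star
b_star = 0.5
t_star = t_i * (1 - b_star / b_i)

# Equivalent to the direct form (b_i - b_star) / s_i
assert np.isclose(t_star, (b_i - b_star) / s_i)
```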

The p-value can also be used to test the null hypothesis βi = 0 against the alternative hypothesis that βi is unequal to zero, as follows. Select a significance level, say 5%. Then you reject the null hypothesis if the p-value involved is less than 0.05, and you accept it if not. Note that this is a two-sided test. However, you should not use this p-value test if n - k < 30, because then the normal approximation of the t distribution is not accurate enough.

The R² compares the RSS of the regression model under review with the RSS of the "model"

yj = α + uj.

The latter RSS is called the Total Sum of Squares (TSS). The OLS estimate of α is

a = (1/n)Σj yj,

which is the sample mean of the yj's. Thus,

TSS = Σj (yj - a)².

The R² is now defined as

R² = 1 - RSS / TSS.

Note that the R² can only be interpreted as a measure of the contribution of the explanatory variables to the explanation of yj if the regression model contains an intercept, as otherwise one would compare apples and oranges. Nevertheless, EasyReg also computes the R² if the model does not contain an intercept, because otherwise I would get too many emails from EasyReg users asking where the R² is.

The larger the R², the better the model fits the data. However, the R² can be inflated toward its maximum value 1 by adding more explanatory variables to the model. The extreme case is where the number of parameters (including the intercept) is equal to n, so that RSS = 0 and thus R² = 1. The

Adjusted R² = 1 - [RSS / (n-k)] / [TSS / (n-1)]

corrects the RSS and the TSS for the degrees of freedom, in order to penalize the inflationary effect of the number of parameters. The corrections are based on the facts that if the model yj = α + uj is correct, then TSS / σ² has a χ² distribution with n-1 degrees of freedom, whereas we have seen before that RSS / σ² = (n - k)s²/σ² has a χ² distribution with n-k degrees of freedom.
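A sketch of both goodness-of-fit measures, assuming a model with an intercept (Python/numpy, simulated data):

```python
import numpy as np

def r_squared(y, resid, k):
    """R-square and adjusted R-square for a model with an intercept."""
    n = len(y)
    RSS = resid @ resid
    TSS = np.sum((y - y.mean())**2)
    r2 = 1 - RSS / TSS
    adj_r2 = 1 - (RSS / (n - k)) / (TSS / (n - 1))
    return r2, adj_r2

# Illustrative fit
rng = np.random.default_rng(1)
x = rng.normal(size=50)
X = np.column_stack([x, np.ones(50)])
y = x + rng.normal(size=50)
b, *_ = np.linalg.lstsq(X, y, rcond=None)
r2, adj_r2 = r_squared(y, y - X @ b, k=2)
```

Since (n-1)/(n-k) > 1 for k > 1, the adjusted R² is always below the ordinary R² whenever the fit is not perfect.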

Under the null hypothesis that none of the explanatory variables have any effect on yj, in the regression model with an intercept, (TSS - RSS) / σ² has a χ² distribution with k-1 degrees of freedom, and (TSS - RSS) is independent of RSS. In that case

F = [(TSS - RSS) / (k-1)] / [RSS / (n-k)]

has an F distribution with k-1 and n-k degrees of freedom. The statistic F is the test statistic of the "overall" F test of the null hypothesis that none of the explanatory variables matter. This test is a right-sided test: the null hypothesis is rejected if the value of the test statistic is larger than the critical value. Rejection of this hypothesis indicates that at least one of the explanatory variables xi,j has a non-zero slope parameter βi. Thus, rejection is good! Note that if the model does not contain an intercept then this F test is not valid, hence EasyReg will not report it.
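As a check on the formula, the overall F statistic can be reproduced from the TSS, RSS, n and k reported in the estimation output later in this tour:

```python
# Values taken from the OLS output below: TSS, RSS, n = 78, k = 4
TSS, RSS = 31.68915115, 10.34231885
n, k = 78, 4

F = ((TSS - RSS) / (k - 1)) / (RSS / (n - k))
print(round(F, 2))  # 50.91, matching the reported F(3,74) = 50.91
```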

EasyReg also reports two other tests:

• The Jarque-Bera/Salmon-Kiefer test of the null hypothesis that the model errors uj are N(0,σ²) distributed. This test actually tests the joint null hypothesis that the skewness E[uj³] is equal to zero and the kurtosis E[uj⁴] is equal to 3σ⁴, which hold if the uj's are N(0,σ²) distributed. Under the null hypothesis the test statistic involved has (for large n) a χ² distribution with 2 degrees of freedom. Of course, this is a right-sided test: the null hypothesis is rejected if the value of the test statistic is larger than the critical value.
• The Breusch-Pagan test of homoskedasticity. A regression model is homoskedastic if the model errors have a constant conditional variance, given the regressors, and is said to be heteroskedastic if not. This test tests the null hypothesis that in the regression model with intercept the conditional variance E[uj² | x1,j,...,xk-1,j] = σ² is constant, against the alternative hypothesis that there exists a non-constant function h such that E[uj² | x1,j,...,xk-1,j] = h(γ1x1,j + ... + γk-1xk-1,j + γk). This test is actually a test of the joint null hypothesis that γ1 = ... = γk-1 = 0, so that under the null hypothesis, h(γk) = σ². Under this null hypothesis the test statistic involved has (for large n) a χ² distribution with k-1 degrees of freedom. Again, this is a right-sided test: the null hypothesis is rejected if the value of the test statistic is larger than the critical value. In EasyReg the Breusch-Pagan test is only reported for models with an intercept.
Often the Jarque-Bera/Salmon-Kiefer test rejects if the Breusch-Pagan test does, because if the conditional distribution of uj is normal with zero expectation but with conditional variance depending on the explanatory variables, the estimated kurtosis will be larger than in the homoskedastic normal case.
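The idea behind the Breusch-Pagan test can be sketched as follows, using the common LM form n·R² from a regression of squared residuals on the regressors (EasyReg's exact statistic may differ in scaling):

```python
import numpy as np

def breusch_pagan_lm(resid, X):
    """LM form of the Breusch-Pagan idea: regress the squared residuals
    on the regressors (X includes the intercept column) and compute
    n * R-square, approximately chi-square(k-1) under homoskedasticity."""
    n = len(resid)
    u2 = resid**2
    g, *_ = np.linalg.lstsq(X, u2, rcond=None)
    e = u2 - X @ g
    TSS = np.sum((u2 - u2.mean())**2)
    return n * (1 - (e @ e) / TSS)

# Homoskedastic illustration: the statistic should be small
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([rng.normal(size=n), np.ones(n)])
u = rng.normal(size=n)
stat = breusch_pagan_lm(u, X)   # compare with a chi-square(1) critical value
```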

The normality assumption is not crucial if the sample size n is large, because due to the law of large numbers and the central limit theorem the OLS estimators will still be approximately normally distributed around the true parameter values. However, heteroskedasticity will render the t-values and p-values invalid, and the OLS estimation method inefficient. The latter means that in the case of heteroskedasticity there exists a better estimation method, in the sense that it is possible to estimate the parameters by an alternative method such that the variances of the alternative estimators will be lower than the variances of the corresponding OLS estimators. Which alternative estimation method is better depends on what is known about the conditional variance E[uj² | x1,j,...,xk-1,j]. If the conditional variance is a parametric model, you can incorporate it in a maximum likelihood model for yj; see the guided tour on user-defined maximum likelihood. If the conditional variance is proportional to a given parameter-free function of the regressors, you can divide all the model variables, including the 1 corresponding to the intercept, by the square root of this function in order to make the model homoskedastic.

If the Breusch-Pagan test rejects the homoskedasticity assumption, it is possible to correct the t-values and p-values for the effect of heteroskedasticity, as shown by:

• White, H. (1980): "A Heteroskedasticity-Consistent Covariance Matrix Estimator, and a Direct Test for Heteroskedasticity", Econometrica 48, 817-838.
This correction gives rise to the Heteroskedasticity-Consistent (HC) t-values and p-values, which are also reported by EasyReg. Thus, if the Breusch-Pagan test rejects, and the sample size n is large, you should base all your tests on the HC t-values and/or p-values rather than on the standard versions of these statistics.
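White's correction can be sketched as the familiar "sandwich" estimator (an illustration with simulated heteroskedastic data, not EasyReg's actual code):

```python
import numpy as np

def ols_hc_se(X, y):
    """OLS with standard and White heteroskedasticity-consistent SEs."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    u = y - X @ b
    # Standard covariance: s^2 (X'X)^{-1}
    s2 = (u @ u) / (n - k)
    cov_std = s2 * XtX_inv
    # White sandwich: (X'X)^{-1} X' diag(u^2) X (X'X)^{-1}
    meat = (X * (u**2)[:, None]).T @ X
    cov_hc = XtX_inv @ meat @ XtX_inv
    return b, np.sqrt(np.diag(cov_std)), np.sqrt(np.diag(cov_hc))

# Heteroskedastic illustration: the error variance grows with |x|
rng = np.random.default_rng(7)
n = 500
x = rng.normal(size=n)
X = np.column_stack([x, np.ones(n)])
y = 1.5 * x + 1.0 + np.abs(x) * rng.normal(size=n)
b, se_std, se_hc = ols_hc_se(X, y)
```

Here the HC standard error of the slope exceeds the conventional one, which is the typical pattern when the error variance increases with the regressor.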

Finally, EasyReg reports the asymptotic standard variance matrix and the asymptotic HC variance matrix of the (bj - βj)'s, times the square root of n. If you do not know what an asymptotic variance matrix is, or how to interpret it, just forget about this.

### Constrained least squares

One of the frequent queries I have gotten is: "How can I estimate a linear regression model under linear parameter restrictions?" I will explain how to do that for the case of a Cobb-Douglas production function:

Q = A·L^α·K^β·e^u,

where

• Q = output
• L = labor
• K = capital
• A is a constant
• u is the error term, with E[u|K,L] = 0 and E[u²|K,L] = σ².
Taking logs, the model becomes a linear regression model:

lnQ = lnA + α·lnL + β·lnK + u,

where lnA is the intercept. Now suppose that you want to estimate this model under the restriction of homogeneity of degree 1, i.e., if both K and L increase by, say, 10%, then so will Q. This condition is equivalent to:

α + β = 1.

Thus, replace α with 1 - β:

lnQ = lnA + (1 - β)lnL + β·lnK + u.

This model can be reformulated and estimated as an unrestricted linear regression model, as follows:

Y = lnQ - lnL = lnA + β(lnK - lnL) + u = β0 + β1X + u,

say, where

• Y = lnQ - lnL
• X = lnK - lnL
• β0 = lnA
• β1 = β.
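The substitution trick can be illustrated with simulated data (Python/numpy; the parameter values α = 0.3, β = 0.7 and the noise level are made up for the illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200
lnL = rng.normal(2.0, 0.5, n)
lnK = rng.normal(3.0, 0.5, n)
alpha, beta_true, lnA = 0.3, 0.7, 1.0     # satisfies alpha + beta = 1
lnQ = lnA + alpha * lnL + beta_true * lnK + 0.1 * rng.normal(size=n)

# Impose alpha + beta = 1 by substitution:
# regress Y = lnQ - lnL on X = lnK - lnL and an intercept
Y = lnQ - lnL
X = np.column_stack([lnK - lnL, np.ones(n)])
(beta_hat, lnA_hat), *_ = np.linalg.lstsq(X, Y, rcond=None)
alpha_hat = 1 - beta_hat                  # recovered from the restriction
```

The restriction holds exactly in the estimates by construction: alpha_hat + beta_hat = 1.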

### Linear regression models based on time series

If the data comes from time series, the observation index j represents time (or a time period), and is therefore now denoted by t. Thus the linear regression model with intercept is now written as:

yt = β1x1,t + ... + βk-1xk-1,t + βk + ut, t = 1,2,...,n.

However, this is not the only difference from the classical linear regression model; there are a few crucial differences. The first is that the model variables are no longer independent across the observations t. However, it is necessary to restrict the dependence:

• The model variables yt, x1,t,...,xk-1,t have to be stationary. Loosely speaking, stationarity means that the time series involved have a vanishing memory: yt, x1,t,...,xk-1,t become independent of yt-j, x1,t-j,...,xk-1,t-j as j converges to infinity.
Moreover, the conditions on the model errors ut now become:
• The conditional expectation of ut given x1,t,...,xk-1,t and yt-j, x1,t-j,...,xk-1,t-j for all j > 0 should be zero. Thus, denoting the information set involved by

ℱt = {x1,t,..., xk-1,t, yt-j, x1,t-j,..., xk-1,t-j for all j > 0},

we must have that E[ut | ℱt] = 0.

• The conditional variance E[ut² | ℱt] = σ² is constant and finite.
Even if the errors ut are N(0,σ²) distributed conditional on the information set ℱt, the t-values are no longer exactly t distributed under the null hypothesis that the corresponding parameters are zero, but for large n the normal approximation still applies. Thus, under the above conditions all the large-sample results for the classical linear regression model carry over, including the overall F test, the Jarque-Bera/Salmon-Kiefer test, and the Breusch-Pagan test.

In addition to the tests just mentioned, in the time series regression case EasyReg also reports the value of the Durbin-Watson (DW) test for first-order autocorrelation of the errors ut. The alternative hypothesis of this test is that

ut = ρut-1 + εt for some ρ satisfying 0 < |ρ| < 1,

where now E[εt | ℱt] = 0 and E[εt² | ℱt] = σ², and the null hypothesis is that ρ = 0. Under the null hypothesis the DW statistic should be close to 2. The DW test is one of the few tests for which EasyReg does not have built-in critical values. Thus, in order to conduct the DW test you have to look up the critical values in an econometrics textbook.
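The DW statistic itself is simple to compute from the OLS residuals (a sketch in Python/numpy):

```python
import numpy as np

def durbin_watson(resid):
    """DW = sum of squared successive differences over the RSS.
    Values close to 2 indicate no first-order autocorrelation."""
    d = np.diff(resid)
    return (d @ d) / (resid @ resid)

# White-noise residuals give DW near 2; AR(1) residuals push it toward 0
rng = np.random.default_rng(0)
e = rng.normal(size=500)
u = np.zeros(500)
for t in range(1, 500):
    u[t] = 0.9 * u[t-1] + e[t]   # strongly positively autocorrelated
```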

Note that the DW test is only valid if the model does not contain lagged dependent variables, i.e., none of the regressors xi,t should be equal to yt-m for some m > 0. Nevertheless, even if the model contains lagged dependent variables the DW statistic is reported by EasyReg, because otherwise I will get emails from EasyReg users inquiring why the Durbin-Watson test is not reported. In either case EasyReg will tell you: "A better way of testing for serial correlation is to specify ARMA errors and then test the null hypothesis that the ARMA parameters are zero". I will explain this below. The main reason for reporting the DW test in EasyReg is that this test is one of the classical tests in econometrics. Therefore, it is included for historical rather than practical reasons.

The stationarity hypothesis has to be tested separately for each of the time series yt, x1,t,..., xk-1,t, using a variety of unit root and stationarity tests, via Menu > Data analysis > Unit root tests (root 1). If you don't know what a unit root is, please read my lecture notes on unit roots. If after reading these lecture notes you still don't understand what a unit root is and how to test for it, just forget about it.

A typical time series regression model with lagged dependent variable x1,t = yt-1 takes the form

yt = β1yt-1 + β2x2,t + ... + βk-1xk-1,t + βk + ut.

If β1 = 1, then yt is a unit root process, even if the xi,t's are stationary. This suggests rebuilding the t-value of b1 to test the null hypothesis β1 = 1. However, since under the null hypothesis the stationarity condition is violated, the rebuilt t-test statistic is under the null hypothesis no longer Student t distributed with n - k degrees of freedom, and even the normal approximation for large n no longer applies. Thus, we cannot test the hypothesis β1 = 1 in the same way as in the classical linear regression model. Again, what you have to do if the OLS estimate of β1 is close to 1 (say larger than 0.9) is test yt for a unit root, via Menu > Data analysis > Unit root tests (root 1).

## Linear regression analysis in practice

### Model and data

The data I shall use for this demonstration of how to estimate and interpret a linear regression model consist of the annual time series LN[real GNP], LN[real wage], and LN[unemployment] for the USA, which you can download from the EasyReg database. See the guided tour on how to retrieve data from the EasyReg database. The model I will estimate is:

LN[unemployment] = β1LN[unemployment]-1 + β2(LN[real GNP]-1 - LN[real GNP]-2)
+ β3(LN[real wage]-1 - LN[real wage]-2) + β4 + u,

where the negative subscripts indicate the lags.

### Data preparation

Before we can specify and estimate this model, we have to create the differenced variables
• DIF1[LN[real GNP]] = LN[real GNP] - LN[real GNP]-1
• DIF1[LN[real wage]] = LN[real wage] - LN[real wage]-1
via Menu > Input > Transform variables:

Click the "Time series transformations" button. Then the window changes to:

Click the "Difference: x(t) - x(t-m)" button twice (the second time you click this button the default lag m = 1 is chosen). The window changes to:

Double click LN[real GNP] and LN[real wage], and click the "Selection OK" button. Then the window changes to:

If you click "O.K." the transformations involved are added to the data file, and you will jump back to the first window. The "Cancel" button in the first transformation window has now become the "Done" button. Click it. Then you will jump back to the EasyReg main window.

### Model specification

In the EasyReg main window, open Menu > Single equation models > Linear regression models:

Double click the variables LN[unemployment], DIF1[LN[real GNP]], and DIF1[LN[real wage]], and click "Selection OK".

We are not going to select a subset of observations. Thus, click "No" and then click "Continue".

Double click the dependent variable, LN[unemployment], and click "Continue".

This window is only for your information. The only action required is to click "Continue".

EasyReg has automatically selected the other variables as the independent variables. Click "Selection OK".

Now we have to select the lagged dependent and (lagged) independent variables, in the next three windows. These windows only appear if you have declared your data as time series.

We are now done with the selection of the lagged dependent and independent variables. Click "Selection OK".

EasyReg automatically adds the constant 1 for the intercept to the model. Click "Continue".

This window only appears if you have declared your data as time series data. I will assume that the text on the button "I have no idea what you are talking about!" applies to you. Thus, click it.

This is the last step of the model specification. Click "Continue". Then the estimation results will be computed:

If you click "Continue" the module NEXTMENU will be activated with options for further analysis, including the default option to store the output in file OUTPUT.TXT in the EASYREG.DAT subfolder.

However, if you click "Done" the output will not be written to file OUTPUT.TXT. Therefore, click "Done" only if you have made a mistake in specifying the model.

Thus, click "Continue":

### Estimation results

```Dependent variable:
Y = LN[unemployment]

Characteristics:
LN[unemployment]
First available observation = 31(=1890)
Last available observation  = 129(=1988)
First chosen observation = 51(=1910)
Last chosen observation  = 129(=1988)
Number of usable chosen observations: 79
Subsample characteristics:
Minimum value: 1.8232000E-001
Maximum value: 3.2148700E+000
Sample mean:   1.7394800E+000

X variables:
X(1) = LAG1[LN[unemployment]]
X(2) = LAG1[DIF1[LN[real GNP]]]
X(3) = LAG1[DIF1[LN[real wage]]]
X(4) = 1

Model:
Y = b(1)X(1) +.....+ b(4)X(4)  + U,
where U is the error term, satisfying
E[U|X(1),...,X(4)] = 0.

OLS estimation results
Parameters    Estimate    t-value    H.C. t-value
(S.E.)     (H.C. S.E.)
[p-value]  [H.C. p-value]
b(1)         0.7423636     10.907          10.147
(0.06806)       (0.07316)
[0.00000]       [0.00000]
b(2)        -3.1431192     -3.293          -2.591
(0.95457)       (1.21303)
[0.00099]       [0.00957]
b(3)         1.3967263      0.892           0.745
(1.56506)       (1.87592)
[0.37216]       [0.45654]
b(4)         0.5178157      3.911           3.368
(0.13239)       (0.15374)
[0.00009]       [0.00076]

Notes:
1: S.E. = Standard error
2: H.C. = Heteroskedasticity Consistent. These t-values and
standard errors are based on White's heteroskedasticity
consistent variance matrix.
3: The two-sided p-values are based on the normal approximation.

Effective sample size (n):                            78
Variance of the residuals:                    0.13976107
Standard error of the residuals (SER):        0.37384631
Residual sum of squares (RSS):               10.34231885
(Also called SSR = Sum of Squared Residuals)
Total sum of squares (TSS):                  31.68915115
R-square:                                         0.6736

Overall F test: F(3,74) = 50.91
p-value = 0.00000
Significance levels:        10%         5%
Critical values:           2.16       2.73
Conclusions:             reject     reject

Test for first-order autocorrelation:
Durbin-Watson test = 1.883484
WARNING: Since the model contains a lagged dependent variable,
the Durbin-Watson test is NOT valid!
REMARK: A better way of testing for autocorrelation
is to specify AR errors and then test the null
hypothesis that the AR parameters are zero.

Jarque-Bera/Salmon-Kiefer test = 8.159389
Null hypothesis:   The errors are normally distributed
Null distribution: Chi-square(2))
p-value = 0.01691
Significance levels:        10%         5%
Critical values:           4.61       5.99
Conclusions:             reject     reject

Breusch-Pagan test = 3.876958
Null hypothesis:   The errors are homoskedastic
Null distribution: Chi-square(3)
p-value = 0.27506
Significance levels:        10%         5%
Critical values:           6.25       7.81
Conclusions:             accept     accept

Information criteria:
Akaike:       -1.917900621
Hannan-Quinn: -1.869519398
Schwarz:      -1.797043758

If the model is correctly specified, in the sense that the conditional
expectation of the model error U relative to the X variables and all
lagged dependent (Y) variables and lagged X variables equals zero, then
the OLS parameter estimators b(1),..,b(4), minus their true values,
times the square root of the sample size n, are (asymptotically)
jointly normally distributed with zero mean vector and variance matrix:

3.61359955E-01  5.23653940E-01  7.22206358E-01 -6.55265904E-01
5.23653940E-01  7.10732618E+01 -7.25338223E+01 -1.86985499E+00
7.22206358E-01 -7.25338223E+01  1.91054973E+02 -2.09074028E+00
-6.55265904E-01 -1.86985499E+00 -2.09074028E+00  1.36703566E+00

provided that the conditional variance of the model error U is constant
(U is homoskedastic), or

4.17455356E-01  1.10368137E+00  9.37306154E-01 -8.37497515E-01
1.10368137E+00  1.14772086E+02 -1.22726118E+02 -3.67345075E+00
9.37306154E-01 -1.22726118E+02  2.74486770E+02 -1.74631925E+00
-8.37497515E-01 -3.67345075E+00 -1.74631925E+00  1.84366741E+00

if the conditional variance of the model error U is not constant
(U is heteroskedastic).
```

### Interpretation of the estimation results

As you see, the Breusch-Pagan test accepts the homoskedasticity condition at the 10% significance level. Therefore, I will ignore the HC t-values and p-values, and judge the significance of the parameters by the values of their standard t-values and p-values.

The parameter estimate b(3) is not significantly different from zero at any conventional significance level. This indicates that the variable X(3) = LAG1[DIF1[LN[real wage]]] has no effect on Y = LN[unemployment], although the positive sign of b(3) is in accordance with what you would expect from economic theory.

The parameter estimate b(2) = -3.1431192 is significantly different from zero when tested two-sided, and the negative sign is what you would expect. It is also easy to test the significance of this parameter by a left-sided test, on the basis of the corresponding (standard) p-value 0.00099. Recall that this p-value is computed as P[|U| > 3.293], where U is standard normally distributed and 3.293 is the absolute t-value of b(2). Therefore, the left-sided p-value is: P[U < -3.293] = 0.00099 / 2 = 0.000495. Thus the left-sided test rejects the null hypothesis β2 = 0 at any conventional significance level.

The parameter b(2) can be interpreted as an elasticity, due to the fact that the derivative of ln(x) is 1/x: d[ln(x)]/dx = 1/x, hence d[ln(x)] = dx / x. For example, if the real GNP growth rate

100·(real GNP - real GNP-1) / real GNP-1

increases by, say, 1 percentage point, then in the next period unemployment will decrease by approximately -b(2)% = 3.1431192%, ceteris paribus (= everything else being equal or constant). Note that the latter decrease is relative rather than absolute in unemployment-rate points. Thus, the estimated effect of a 1 percentage point increase in the real GNP percentage growth rate in period t-1, i.e.,

100·(real GNPt-1 - real GNPt-2) / real GNPt-2 - 100·(real GNPt-2 - real GNPt-3) / real GNPt-3 = 1,

on unemployment in period t is:

100·(unemploymentt - unemploymentt-1) / unemploymentt-1 = b(2) = -3.1431192.

However, due to the presence of the lagged dependent variable there is also an effect on unemployment in period t+j for j = 1,2,3,.... If the above change in the real GNP growth rate is a "once and for all" change, i.e., the growth rate of real GNP remains constant after period t-1, the effect on the growth of unemployment in period t+j is:

100·(unemploymentt+j - unemploymentt+j-1) / unemploymentt+j-1 = b(2)·b(1)^j = -3.1431192·(0.7423636)^j, j = 1,2,3,...,

hence the estimated long-run effect of a once and for all change of the real GDP growth rate with 1 percentage point is:

Σj 100·(unemploymentt+j - unemploymentt+j-1) / unemploymentt+j-1 = b(2)/(1 - b(1)) = -3.1431192/(1 - 0.7423636) = -12.2%,

where the summation is taken over j = 0,1,2,.....
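The geometric sum behind this long-run effect is easy to verify numerically (plain Python, using the estimated coefficients):

```python
b1, b2 = 0.7423636, -3.1431192

# Partial sum of the geometric impact sequence b2 * b1**j, j = 0,1,2,...
partial = sum(b2 * b1**j for j in range(200))

# Closed form of the same sum
long_run = b2 / (1 - b1)
print(round(long_run, 1))   # -12.2, the long-run effect quoted above
```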

### Options

Once you have estimated the model by OLS, you will have a variety of options for further inference. The options menu opens if you click the "Options" button in the previous window:

In this guided tour I will only focus on those options that are covered by intermediate econometrics textbooks. Thus, I will not discuss the KVB test, the ICM test, and the kernel estimate of the error density.

#### The Wald test

One of the options is the Wald test of linear parameter restrictions. This option enables you to test joint hypotheses on the parameters. For example, suppose you want to test whether the true values of b(2) and b(3) are both zero. You cannot use the corresponding t-values and p-values for this, because you have to take the covariance of b(2) and b(3) into account as well. The Wald test does that automatically, as follows.

Double click b(2) and b(3), and click "Test joint significance". Then the test results appear:

These test results are also written to the output file.

Next, let us test the hypothesis that the true values β2 and β3 of b(2) and b(3), respectively, add up to zero: β2 + β3 = 0. Thus, click "More tests", double click b(2) and b(3) again, and click "Test linear restrictions". Then the following window appears.

The linear restriction involved has to be entered in the form s(0) = s(1).b(2) + s(2).b(3). Thus, enter 0 1 1, and click "O.K." or hit the enter key.

Click "No more restrictions". Then the test results appear.

The "Back" button brings you back to the "What to do next?" window.
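Behind this option is the standard Wald statistic for linear restrictions R·β = r. A generic sketch (the covariance matrix V below is made up for the illustration; EasyReg uses the estimated covariance matrix of the parameter estimates):

```python
import numpy as np

def wald_test(b, V, R, r):
    """Wald statistic for H0: R @ b = r, asymptotically chi-square
    with rows(R) degrees of freedom under the null hypothesis."""
    d = R @ b - r
    return d @ np.linalg.solve(R @ V @ R.T, d)

# Illustrative numbers (NOT EasyReg's actual estimates or covariances):
b = np.array([0.74, -3.14, 1.40, 0.52])
V = np.diag([0.005, 0.91, 2.45, 0.018])   # made-up covariance matrix
R = np.array([[0., 1., 0., 0.],
              [0., 0., 1., 0.]])          # selects b(2) and b(3)
r = np.zeros(2)
W = wald_test(b, V, R, r)   # compare with a chi-square(2) critical value
```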

#### ARMA errors

The option "Re-estimate the model with ARMA errors" allows the model errors ut to be an ARMA process. For example, an ARMA(1,1) process has the form

ut = ρut-1 + εt - δεt-1,

and an AR(1) = ARMA(1,0) process has the form

ut = ρut-1 + εt,

where E[εt | ℱt] = 0 and E[εt² | ℱt] = σ².

In this example I will re-estimate the model with AR(1) errors, in order to test for first-order autocorrelation as an alternative to the Durbin-Watson test.

The coefficient a(1,1) corresponds to ρ and the coefficient a(2,1) corresponds to δ. Since we are going to specify an AR(1) process, double click a(1,1), and click "Specification O.K.".

The RSS will now be minimized by the simplex method of Nelder and Mead. Click "Start SIMPLEX iteration".

It is recommended to restart the simplex iteration until the parameters do not change anymore. Thus, leave "Auto restart" checked and click "Restart SIMPLEX iteration".

Then click "Done with SIMPLEX iteration".

Click "Continue". Then the estimation results appear.

Note that the AR(1) parameter a(1,1) is not significant: the null hypothesis ρ = 0 is accepted at any conventional significance level. Thus we may conclude that it was not necessary to re-estimate the model with AR(1) errors, but of course you can only find this out by doing it. However, all further options will now involve the model with AR(1) errors. Therefore, in order to undo the re-estimation, you have to estimate the model again by OLS, which I have done.

#### (G)ARCH errors

(G)ARCH stands for (Generalized) AutoRegressive Conditional Heteroskedasticity. For example, ARCH(1) errors ut have a conditional variance of the type

σt² = E[ut² | ℱt] = α0 + α1ut-1²,

which can be written as an AR(1) model in ut²:

ut² = α0 + α1ut-1² + vt,

where E[vt | ℱt] = 0. Moreover, in the case of GARCH(1,1) errors the conditional variance of ut has the form

σt² = γ1σt-1² + α0 + α1ut-1².

Note that (G)ARCH is quite different from the alternative hypothesis of the Breusch-Pagan test. Therefore, if the Breusch-Pagan test accepts the null hypothesis of homoskedasticity, this does not imply absence of (G)ARCH.

In this example I will re-estimate the OLS model with ARCH(1) errors.

**Test of homoskedasticity against ARCH**

However, before you re-estimate the model, it is better to test for ARCH errors first:

The null hypothesis of no ARCH has been accepted.
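The standard LM test for ARCH(1) can be sketched as a regression of squared residuals on their own lag (a generic textbook form, not necessarily EasyReg's exact implementation):

```python
import numpy as np

def arch_lm_test(resid):
    """LM test for ARCH(1): regress u_t^2 on u_{t-1}^2 and a constant;
    n * R-square is approximately chi-square(1) under no ARCH."""
    u2 = resid**2
    y, x = u2[1:], u2[:-1]
    X = np.column_stack([x, np.ones(len(x))])
    g, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ g
    TSS = np.sum((y - y.mean())**2)
    return len(y) * (1 - (e @ e) / TSS)

# White-noise errors: no ARCH, so the statistic should be small
rng = np.random.default_rng(3)
u = rng.normal(size=300)
stat = arch_lm_test(u)   # compare with the chi-square(1) critical value 3.84
```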

**Re-estimate the model with (G)ARCH errors**

Nevertheless, I will re-estimate the model with ARCH(1) errors, in order to show how to do that.

The parameter r(1,1) corresponds to the GARCH(1,1) parameter γ1, and the parameter r(2,1) corresponds to the ARCH(1) parameter α1. Thus, double click r(2,1), click "Specification O.K.", and follow the same steps as for the case of AR(1) errors:

The parameter d is the estimate of α0 and the parameter r(2,1) is the estimate of α1. Since r(2,1) is not significant at any conventional significance level, we may conclude that α1 = 0, hence

σt² = E[ut² | ℱt] = α0

is constant. Thus, there is no ARCH(1). However, as in the ARMA error case, all further options will now involve the model with ARCH(1) errors. Therefore, in order to undo the re-estimation, you have to estimate the model again by OLS.

Before doing this, let us look at the option "Plot the GARCH variances":

This is the plot of the estimated conditional variances σt² = E[ut² | ℱt]. Since we have accepted the null hypothesis of no ARCH, one would expect to see a horizontal straight line. However, the estimated coefficient r(2,1) is quite large: r(2,1) = 0.344673, so what you see here is the plot of the function

σt² = d + r(2,1)·ut-1² = 0.093678 + 0.344673·ut-1².

#### Plot the fit

The option "Plot the fit" does not need much explanation. The only points worth mentioning are that if you move the mouse pointer over the picture, the corresponding date is displayed in the title bar of the window, and if you double click on the picture the corresponding date is printed.

#### Write the residuals to the input file

This option is handy if you want to analyze the errors further, for example for outlier analysis.

#### Information criteria

The Akaike, Hannan-Quinn and Schwarz information criteria are explained here.