Guided tour on nonlinear least squares estimation

This guided tour contains mathematical formulas and/or Greek symbols and is therefore best viewed with Internet Explorer, Opera or Google Chrome, because other web browsers (in particular Firefox) may not display the "Symbol" fonts involved. For example, "b" should be displayed as the Greek letter "beta" rather than the Roman "b". If not, reload this guided tour in Internet Explorer, Opera or Google Chrome, or make one of them your default web browser.

About the CES production function

In this guided tour I will explain in detail how to conduct nonlinear least squares estimation, using the example of a homogeneous CES production function. However, before you continue, please read NLLS.PDF first to learn about the CES production function and how to estimate its parameters via EasyReg.

The homogeneous CES production function takes the form

CES production function

where Q is output, K is capital, L is labor, and U is an error term satisfying E[U|K,L] = 0. As explained in NLLS.PDF, the CES production function can be rewritten as

CES production function
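The two formula images are not reproduced in this text. A reconstruction that is consistent with the transformation list X(3) through X(13) in the output listing further below, assuming the standard homogeneous CES specification with a multiplicative error (the exact notation in NLLS.PDF may differ), with gamma, rho and alpha as named in the comments of that listing:

```latex
% Level form (assumed: standard CES with multiplicative error exp(U))
Q = \gamma \left[ \alpha K^{-\rho} + (1-\alpha) L^{-\rho} \right]^{-1/\rho} e^{U}

% Dividing by L and taking logs gives the form used for estimation
\ln(Q/L) = \ln(\gamma)
  - \frac{1}{\rho} \ln\!\left( 1 + \alpha \left[ (L/K)^{\rho} - 1 \right] \right) + U
```

As rho tends to 0, the second expression tends to ln(gamma) - alpha ln(L/K), which is the Cobb-Douglas case.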

The data

The data I will use have been artificially generated: 500 replications of ln(K) and ln(L) have been drawn independently from the standard normal distribution, and ln(Q) has been generated by ln(Q) = 0.25ln(K) + 0.75ln(L) + U, where the errors U have been drawn independently from the standard normal distribution. Thus, the data correspond to a Cobb-Douglas production function, which in turn corresponds to a CES production function (in the limit for rho tending to 0) with parameter values alpha = 0.25, rho = 0 and gamma = 1. The data set consists of 500 independent observations on the transformed variables ln(Q/L) and ln(L/K). The data file involved, NLLSDATA.TXT, is in EasyReg space delimited data format, and should be treated as cross-section data. There are no missing values.
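The data generating process described above can be sketched in a few lines of Python. This is only an illustration: the function name and the seed are hypothetical, and the draws will not reproduce the numbers in NLLSDATA.TXT.

```python
import random

def generate_ces_data(n=500, alpha=0.25, seed=12345):
    """Simulate the Cobb-Douglas design described in the text (seed is arbitrary)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        ln_k = rng.gauss(0.0, 1.0)   # ln(K) ~ N(0,1)
        ln_l = rng.gauss(0.0, 1.0)   # ln(L) ~ N(0,1)
        u = rng.gauss(0.0, 1.0)      # error term U ~ N(0,1)
        ln_q = alpha * ln_k + (1.0 - alpha) * ln_l + u
        y = ln_q - ln_l              # dependent variable ln(Q/L)
        x1 = ln_l - ln_k             # regressor ln(L/K)
        rows.append((y, x1, u))
    return rows

data = generate_ces_data()
```

By construction, each row satisfies ln(Q/L) = -0.25 ln(L/K) + U, which is the CES regression function evaluated at rho = 0 and gamma = 1.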

Recursive build up of the CES production function

With Y = ln(Q/L) as the dependent variable and X(1) = ln(L/K) and X(2) = 1 as the independent variables (see NLLS.PDF for the role of the constant X(2) = 1), the CES production function reads as a nonlinear regression model: Y = g(x,b) + U, where x = (X(1),X(2))', b = (b(1),b(2),b(3))' = (ln(gamma),rho,alpha)' and

CES production function
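The formula image is not reproduced here. Reading the recursion X(3) through X(13) in the output listing backwards suggests the following closed form, given as a sketch in LaTeX with subscripts in place of the parenthesized indices:

```latex
g(x,b) = b_1 X_2 - \frac{1}{b_2}
  \ln\!\left( 1 + b_3 \left[ \exp(b_2 X_1) - 1 \right] \right),
\qquad b_1 = \ln(\gamma),\quad b_2 = \rho,\quad b_3 = \alpha .
```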

EasyReg builds up a nonlinear regression model recursively, starting from the X variables X(1) and X(2), by creating new X variables using linear and/or multiplicative combinations and linear and nonlinear transformations of previously selected or created X variables. See NLTRANS.PDF for a list of all available transformations.

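As explained in NLLS.PDF, the transformations involved in building the CES production function are

X(3) = b(1)X(2) = b(1)
X(4) = b(2)X(2) = b(2)
X(5) = b(3)X(2) = b(3)
X(6) = X(1)X(4)
X(7) = EXP(X(6))
X(8) = X(7) - X(2)
X(9) = X(5)X(8)
X(10) = LOG(X(9) + 1)/X(9)
X(11) = (EXP(X(6)) - 1)/X(6)
X(12) = X(1)X(5)X(10)X(11)
X(13) = X(3) - X(12)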

Then the nonlinear regression function g(x,b) is equal to X(13).
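To see that the recursive build-up really yields the CES regression function, the steps X(3) through X(13) from the output listing can be mirrored in Python and compared with the closed form ln(gamma) - (1/rho) ln(1 + alpha[(L/K)^rho - 1]). The function names are hypothetical, and the sketch does not handle the exact zero cases X(6) = 0 or X(9) = 0; how EasyReg treats those is not shown in this tour.

```python
import math

def g_recursive(x1, b1, b2, b3):
    """Build g(x,b) step by step, mirroring X(3)..X(13) in the output listing."""
    x2 = 1.0
    x3 = b1 * x2                      # b(1) = ln(gamma)
    x4 = b2 * x2                      # b(2) = rho
    x5 = b3 * x2                      # b(3) = alpha
    x6 = x1 * x4
    x7 = math.exp(x6)
    x8 = x7 - x2
    x9 = x5 * x8
    x10 = math.log(x9 + 1.0) / x9     # tends to 1 as X(9) tends to 0
    x11 = (math.exp(x6) - 1.0) / x6   # tends to 1 as X(6) tends to 0
    x12 = x1 * x5 * x10 * x11
    x13 = x3 - x12
    return x13

def g_closed_form(x1, b1, b2, b3):
    """Closed form of the same function, for comparison."""
    return b1 - math.log(1.0 + b3 * (math.exp(b2 * x1) - 1.0)) / b2
```

Both ratios X(10) and X(11) have the finite limit 1 when their argument tends to zero, which is presumably why the function is split up this way: near rho = 0 the recursion approaches ln(gamma) - alpha ln(L/K), the Cobb-Douglas case.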

Admittedly, this is a primitive programming language, but because it is so primitive its execution is fast.

Estimating the CES production function via EasyReg

Now import file NLLSDATA.TXT into EasyReg, and choose Y = ln(Q/L) as the dependent variable and X(1) = ln(L/K) and X(2) = 1 as the independent variables. The constant X(2) = 1 is automatically included by EasyReg.

The procedure for importing data and selecting the dependent and independent variables is the same as in the OLS module. I assume that you have run an OLS regression before, so that you already know how to do that.

The first relevant window page is:

EasyReg Window 1A

In order to make X(3), double click X(2) = 1, and click the "Selection OK" button. Then the window changes to:

EasyReg Window 1B

Since this is not the final model, check the "Don't bother me anymore!" box, and click "Clear".

I recommend adding a comment to the transformation before you continue. Then the window changes to:

EasyReg Window 1C

Click "Done" and then click the "Transformation OK" button.

EasyReg Window 1D

The new X variable X(3) = b(1)X(2) = b(1) {= ln(gamma)} has now been made.

To make X(4) = b(2)X(2) = b(2), repeat this procedure: double click X(2) = 1 again, click the "Selection OK" button, select "Linear combination", add the comment "rho", and click the "Transformation OK" button. The window changes to:

EasyReg Window 2

To make X(5) = b(3)X(2) = b(3), repeat this procedure: double click X(2) = 1 again, click the "Selection OK" button, select "Linear combination", add the comment "alpha", and click the "Transformation OK" button. The window changes to:

EasyReg Window 3

To make X(6), double click X(1) and X(4)

EasyReg Window 4A

and click the "Selection OK" button. The window changes to:

EasyReg Window 4B

Double click "Multiply" and click the "Transformation OK" button. Then X(6) = X(1)X(4) is created, and the window changes back to:

EasyReg Window 5

To make X(7) = EXP(X(6)), double click X(6) and click "Selection OK":

EasyReg Window 6A

Double click "EXP(z)" and click "Transformation OK". Then X(7) = EXP(X(6)) is created:

EasyReg Window 6B

It is now pretty straightforward to complete the build up of the CES production function by creating the remaining X variables X(8) through X(13) as specified above:

EasyReg Window 6


We are now done. Click the "Select the final model" button. Then the next window appears:

EasyReg Window 7


If you want to use the same specification once more with this or another data set, it is recommended that you store the model in a template by clicking the "Store" button. Then the model will be stored as template file TEMPLATE_NLLS.002 in the current sub-folder EASYREG.DAT. (The existing template TEMPLATE_NLLS.001 corresponds to the previous version of this guided tour.)

If you run this nonlinear regression again, you will jump to the first window, and the last template file which is compatible with the initial X variables will be loaded.

If you click "Store", the window changes to:

EasyReg Window 8


Click "Model is OK". Then the following window appears:

EasyReg Window 9


Recall that b(2) should be greater than or equal to -1, b(3) should be contained in the unit interval [0,1], and b(1) is unrestricted. However, asymptotic theory of nonlinear regression requires that the parameter space is closed and bounded. Therefore, I will confine b(1) to the interval [-100,100], b(2) to the interval [-1,100], and b(3) to the interval [0,1]. After entering the lower and upper bounds involved, the window changes to:

EasyReg Window 10


Click "Bounds OK". Then the following window appears:

EasyReg Window 11

Click "Start", which starts the Nelder and Mead simplex iteration. It is strongly recommended to restart the iteration from different random start values, as well as from the last iteration result, in order to check whether you have reached the global minimum of the objective function (which is the sum of squared residuals).
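The idea of restarting the simplex search from random start values and from the last iteration result can be sketched as follows. This is a simplified Nelder-Mead variant written for illustration, with points clipped to the parameter bounds; it is not EasyReg's Visual Basic routine, and all names and tuning constants (step, iters, restarts) are assumptions.

```python
import random

def nelder_mead(f, start, lower, upper, step=0.5, iters=500):
    """Minimal Nelder-Mead simplex search; points are clipped to [lower, upper]."""
    n = len(start)

    def clip(p):
        return [min(max(v, lo), hi) for v, lo, hi in zip(p, lower, upper)]

    base = clip(start)
    simplex = [base]
    for i in range(n):
        p = list(base)
        # Step inward if stepping up would leave the box
        p[i] = p[i] + step if p[i] + step <= upper[i] else p[i] - step
        simplex.append(clip(p))
    values = [f(p) for p in simplex]

    for _ in range(iters):
        order = sorted(range(n + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        worst = simplex[-1]
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]

        def along(t):
            # Point centroid + t * (centroid - worst), clipped to the bounds
            return clip([c + t * (c - w) for c, w in zip(centroid, worst)])

        refl = along(1.0)
        f_refl = f(refl)
        if f_refl < values[0]:
            expd = along(2.0)                      # try an expansion
            f_expd = f(expd)
            if f_expd < f_refl:
                simplex[-1], values[-1] = expd, f_expd
            else:
                simplex[-1], values[-1] = refl, f_refl
        elif f_refl < values[-2]:
            simplex[-1], values[-1] = refl, f_refl
        else:
            contr = along(-0.5)                    # contract toward the centroid
            f_contr = f(contr)
            if f_contr < values[-1]:
                simplex[-1], values[-1] = contr, f_contr
            else:
                best = simplex[0]                  # shrink toward the best vertex
                simplex = [best] + [[(a + b) / 2.0 for a, b in zip(best, p)]
                                    for p in simplex[1:]]
                values = [values[0]] + [f(p) for p in simplex[1:]]

    best_i = min(range(n + 1), key=lambda i: values[i])
    return simplex[best_i], values[best_i]

def multi_start(f, lower, upper, restarts=5, seed=0):
    """Run the search from random start values, each followed by one restart."""
    rng = random.Random(seed)
    best_p, best_v = None, float("inf")
    for _ in range(restarts):
        start = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        p, v = nelder_mead(f, start, lower, upper)
        p, v = nelder_mead(f, p, lower, upper)     # restart from the last result
        if v < best_v:
            best_p, best_v = p, v
    return best_p, best_v
```

In the CES example, f would be the sum of squared residuals as a function of (b(1), b(2), b(3)), with lower = [-100, -1, 0] and upper = [100, 100, 1]. If different start values lead to visibly different minima, the smallest one is the candidate for the global minimum.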

EasyReg Window 12

Once you are confident that the global minimum has been reached, click "Done". Then the window changes to:

EasyReg Window 13

Finally, click "Continue". Then the estimation results appear.

EasyReg Window 13

Options

Recall that the true parameter values are b(1) = 0, b(2) = 0, and b(3) = 0.25. Except for b(2), the NLLS estimates are close to these values, and the estimate of b(2) is not significantly different from zero.
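The bracketed p-values in the output listing are stated to be two-sided and based on the normal approximation. A minimal sketch of that calculation, assuming p = 2(1 - Phi(|t|)) with Phi the standard normal distribution function (the function name is hypothetical):

```python
import math

def two_sided_p_normal(t):
    """Two-sided p-value of a t-value under the standard normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2.0))

# t-values of b(1) and b(2) as reported in the output listing
p_b1 = two_sided_p_normal(-0.291)
p_b2 = two_sided_p_normal(-1.440)
```

Up to rounding of the t-values, these agree with the reported p-values 0.77099 and 0.14991, both well above the usual 5% and 10% significance levels.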

In order to test the joint hypothesis b(1) = b(2) = 0 and the joint hypothesis b(1) = b(2) = 0, b(3) = 0.25, click the "Options" button, which opens the "Options" menu:

EasyReg Window 13B

and click menu item "Wald test of linear parameter restrictions".

EasyReg Window 14


In order to test the joint hypothesis b(1)= b(2) = 0, double click b(1) and b(2), and click "Test joint significance".

EasyReg Window 15


As expected, the joint hypothesis b(1)= b(2) = 0 is not rejected.

In order to test the joint hypothesis b(1)= b(2) = 0, b(3) = 0.25, click "More tests", double click b(1), b(2) and b(3), and click "Test linear restrictions".

EasyReg Window 16


The null hypothesis involved takes the form of three linear equations:

1.b(1) + 0.b(2) + 0.b(3) = 0
0.b(1) + 1.b(2) + 0.b(3) = 0
0.b(1) + 0.b(2) + 1.b(3) = 0.25

You have to enter the coefficients involved for each equation. Next, click "No more restrictions". Then the test results appear.
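In matrix form these restrictions read Rb = c, as in the output listing. The tour does not spell out the test statistic; assuming the textbook Wald statistic, with V-hat denoting the estimated variance matrix of the NLLS estimator (standard or heteroskedasticity consistent) and m the number of restrictions, it can be written as:

```latex
H_0:\; R b = c, \qquad
R = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad
c = \begin{pmatrix} 0 \\ 0 \\ 0.25 \end{pmatrix}

W = (R\hat{b} - c)' \left[ R \hat{V} R' \right]^{-1} (R\hat{b} - c)
  \;\xrightarrow{d}\; \chi^2_m \quad \text{under } H_0, \qquad m = 3 .
```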

EasyReg Window 17


Again as expected, the null hypothesis involved is not rejected.
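The p-values reported for the Wald tests can be checked by hand, because the upper tail of the chi-square distribution has a closed form for 2 and 3 degrees of freedom. The sketch below covers only those two cases; the function name is hypothetical.

```python
import math

def chi2_upper_tail(w, df):
    """Upper-tail probability P[chi-square(df) > w], for df = 2 or 3 only."""
    if df == 2:
        return math.exp(-w / 2.0)
    if df == 3:
        return (math.erfc(math.sqrt(w / 2.0))
                + math.sqrt(2.0 * w / math.pi) * math.exp(-w / 2.0))
    raise ValueError("this sketch only covers df = 2 and df = 3")

# Wald statistics as reported in the output listing
p_joint_2 = chi2_upper_tail(2.39, 2)   # b(1) = b(2) = 0
p_joint_3 = chi2_upper_tail(3.94, 3)   # b(1) = b(2) = 0, b(3) = 0.25
```

Up to rounding of the test statistics, the results match the reported p-values 0.30228 and 0.26828, so neither null hypothesis is rejected at the 5% or 10% level.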

The "Back" button brings you back to the "What to do next?" window:

EasyReg Window 18


Note that the Wald test results have been appended to the output. When you click menu item "Done" while leaving the check box "Write output to EASYREG.DAT\OUTPUT.TXT when done" checked, you will return to the EasyReg main window, and the output will be appended to file OUTPUT.TXT in the current sub-folder EASYREG.DAT.

The menu item "Write residuals to the input file" is useful if you want to analyse the NLLS residuals further.

If you want to conduct the ICM test of the correctness of the functional form of the model, you have to read and understand the key papers involved first, which you can find here. Because these papers are technically demanding, a demonstration of how to conduct the ICM test is beyond the scope of this guided tour.

To conclude this guided tour, let us see what happens if you click menu item "Compute and plot the kernel estimate of the error density".

EasyReg Window 19


Recall that the error terms U have been drawn from the standard normal distribution. The picture confirms this.
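For readers who want to reproduce such a plot outside EasyReg, a kernel density estimate of the residuals can be sketched as follows. The standard normal kernel and the rule-of-thumb bandwidth are assumptions of this sketch; the kernel and bandwidth that EasyReg uses are not described in this tour.

```python
import math

def gaussian_kde(x, sample, bandwidth):
    """Kernel density estimate at x, using a standard normal kernel."""
    total = sum(math.exp(-0.5 * ((x - r) / bandwidth) ** 2) for r in sample)
    return total / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))

def rule_of_thumb_bandwidth(sample):
    """Normal reference bandwidth 1.06 * s * n^(-1/5) (an assumption here)."""
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((r - mean) ** 2 for r in sample) / (n - 1))
    return 1.06 * s * n ** (-0.2)
```

Evaluating gaussian_kde on a grid of points, with the NLLS residuals as the sample, gives a curve that can be compared with the standard normal density.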

Output


Dependent variable:
Y = ln(Q/L)

Characteristics:
ln(Q/L)
  First observation = 1
  Last observation  = 500
  Number of usable observations: 500
  Minimum value: -3.4043000E+000
  Maximum value:  3.4399900E+000
  Sample mean:    8.1536220E-002

X variables:
X(1) = ln(L/K)
X(2) = 1


NLLS model:
Model: y = g[x,b]  + u, where
X(1)=ln(L/K)
X(2)=1
X(3)=b(1) {= ln(gamma)}
X(4)=b(2) {= rho}
X(5)=b(3) {= alpha}
X(6)=X(1).X(4)
X(7)=EXP[X(6)]
X(8)=X(7)-X(2)
X(9)=X(5).X(8)
X(10)=LOG[X(9)+1]/X(9)
X(11)=(EXP[X(6)]-1)/X(6)
X(12)=X(1).X(5).X(10).X(11)
X(13)=X(3)-X(12)
g[x,b] = X(13), where x = (X(1),..,X(13))' and b = (b(1),..,b(3))'

-100 <= b(1) <= 100
-1 <= b(2) <= 100
0 <= b(3) <= 1

The objective function (RSS) has been minimized using the simplex method
of Nelder and Mead. The algorithm involved is a Visual Basic translation
of the Fortran algorithm involved in:
W.H.Press, B.P.Flannery, S.A.Teukolsky and W.T.Vetterling, 'Numerical
Recipes', Cambridge University Press, 1986, pp. 292-293

Estimation results:
Parameters  Estimate   t-value H.C. t-value(*)
                     [p-value]  [H.C. p-value]
b(1)       -0.015548    -0.291          -0.287
                     [0.77099]       [0.77418]
b(2)       -0.236400    -1.440          -1.431
                     [0.14991]       [0.15248]
b(3)        0.265879     7.744           8.134
                     [0.00000]       [0.00000]

(*) Based on White's heteroskedasticity consistent variance matrix.
[The two-sided p-values are based on the normal approximation]

RSS:                490.212216
s.e.                  0.993148
R-square:               0.1303
n:                         500



Wald test:

b(1)       -0.015548    -0.291          -0.287(*)
b(2)       -0.236400    -1.440          -1.431(*)
b(3)        0.265879     7.744           8.134
(*): Parameters to be tested

Null hypothesis:
b(1) = b(2) = 0

Wald test:                              2.39
Asymptotic null distribution:  Chi-square(2)
  p-value = 0.30228
  Significance levels:        10%         5%
  Critical values:           4.61       5.99
  Conclusions:             accept     accept

Test result on the basis of the heteroskedasticity consistent variance
matrix:
Wald test:                              2.45
Asymptotic null distribution:  Chi-square(2)
  p-value = 0.29400
  Significance levels:        10%         5%
  Critical values:           4.61       5.99
  Conclusions:             accept     accept

Wald test:

b(1)       -0.015548    -0.291          -0.287(*)
b(2)       -0.236400    -1.440          -1.431(*)
b(3)        0.265879     7.744           8.134(*)
(*): Parameters to be tested

Null hypothesis:
1.b(1)+0.b(2)+0.b(3) = 0.
0.b(1)+1.b(2)+0.b(3) = 0.
0.b(1)+0.b(2)+1.b(3) = 0.25

Null hypothesis in matrix form: Rb = c, where
R =
 1. 0. 0.
 0. 1. 0.
 0. 0. 1.
and c =
   0.
   0.
 0.25
Wald test on the basis of the standard variance matrix:
Wald test statistic:                    3.94
Asymptotic null distribution:  Chi-square(3)
  p-value = 0.26828
  Significance levels:        10%         5%
  Critical values:           6.25       7.81
  Conclusions:             accept     accept
Wald test on the basis of White's heteroskedasticity consistent 
variance matrix:
Wald test statistic:                    3.65
Asymptotic null distribution:  Chi-square(3)
  p-value = 0.30232
  Significance levels:        10%         5%
  Critical values:           6.25       7.81
  Conclusions:             accept     accept

Estimating a Logit model by nonlinear least squares

Consider the Logit model P[Y = 1|X] = F(a + bX), where F(x) = 1/(1 + exp(-x)) is the logistic distribution function. The best way to estimate this model is by maximum likelihood (ML), but since E[Y|X] = F(a + bX), we can also estimate this model by nonlinear least squares (NLLS), although the NLLS estimates of a and b are less efficient than the ML estimates.
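The NLLS objective for this model is the sum of squared differences between Y and F(a + bX). A minimal sketch in Python, with hypothetical function names:

```python
import math

def logistic(z):
    """Logistic distribution function F(z) = 1/(1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def logit_rss(a, b, ys, xs):
    """NLLS objective: sum of squared residuals Y - F(a + b*X)."""
    return sum((y - logistic(a + b * x)) ** 2 for y, x in zip(ys, xs))
```

Minimizing logit_rss over (a, b) gives the NLLS estimates, whereas ML maximizes the Logit log-likelihood instead. Note that math.exp overflows for very large negative arguments of F; with parameters and regressors of moderate size, as in this example, that does not arise.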

The Logit NLLS regression only requires two transformations, the linear combination transformation and the logistic transformation, and therefore it is feasible to display all the steps.

The data for Y and X have been generated for parameter values a = b = 1, where the values of X have been drawn from the standard normal distribution. The sample size is 500. The data file involved, in EasyReg space delimited text format, is available as LOGITDATA.TXT.

After importing this data file into EasyReg, the NLLS module opens with:

EasyReg NLLS Logit window 1


Double-click both variables, and then click "Selection OK". Then the following window appears.

EasyReg NLLS Logit window 2


I am not going to use a subsample. Thus, click "No" and then "Continue".

EasyReg NLLS Logit window 3


Double-click the dependent variable Y, and click "Continue".

EasyReg NLLS Logit window 4


This window is just for your information. Click "Continue".

EasyReg NLLS Logit window 5


By default, EasyReg automatically selects all the other variables as regressors. Click "Selection OK".

EasyReg NLLS Logit window 6


EasyReg automatically adds the constant 1 to the data, which you may need in the construction of the nonlinear regression model. Click "Continue".

EasyReg NLLS Logit window 7


This window is just for your information. Click "Continue".

EasyReg NLLS Logit window 8


The first transformation is the linear transformation of 1 and X. Thus, double click 1 (= X(2)) and X (=X(1)) in that order, and then click "Selection OK".

EasyReg NLLS Logit window 9


Double-click "Linear transformation". Then the window changes to:

EasyReg NLLS Logit window 10


To get rid of the annoying window "About the last transformation", check "Don't bother me anymore" and click "Clear".

EasyReg NLLS Logit window 11


Note that b(1) = a (the intercept) and b(2) = b (the slope coefficient of X) in the Logit model above. Click "Transformation OK".

EasyReg NLLS Logit window 12


Since the new variable X(3) depends on parameters, it is a potential candidate for your model. However, if you select X(3) as the final model then you actually specify a linear probability model, which in general is bad econometrics despite the attention that this model gets in most undergraduate econometrics textbooks. What is needed here is to transform X(3) by the logistic transformation. Thus, double-click X(3) and then click "Selection OK".

EasyReg NLLS Logit window 13


Double-click Logit and then click "Transformation OK".

EasyReg NLLS Logit window 14


Now X(4) is the Logit model. Click "Select the final model". Then the last variable is automatically selected as your nonlinear regression model.

EasyReg NLLS Logit window 15


Click "Model is OK".

EasyReg NLLS Logit window 16


In this window you have to specify the parameter space. I will choose [-10,10] for both parameters.

EasyReg NLLS Logit window 17


Click "Bounds OK".

EasyReg NLLS Logit window 18


Click "Start".

EasyReg NLLS Logit window 19


I recommend restarting the simplex iteration with "Auto restart ..." checked, because you may not yet have reached the minimum of the objective function. Once the parameter estimates do not change anymore, uncheck "Auto restart ...", or click the button "Interrupt simplex iteration" (the latter is only shown during the simplex iteration with Auto restart on). Then click "Done".

EasyReg NLLS Logit window 20


Click "Continue". Then the estimation results appear.

EasyReg NLLS Logit window 21


Finally, note that since the Logit regression model has heteroskedastic errors, you should only look at the Heteroskedasticity Consistent (HC) t and p values.
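The reason for the heteroskedasticity can be made explicit. Since Y only takes the values 0 and 1, the conditional variance of the error follows directly from the Logit model stated at the start of this section (a and b denote its intercept and slope):

```latex
U = Y - F(a + bX), \qquad E[U \mid X] = 0,

\operatorname{Var}(U \mid X) = \operatorname{Var}(Y \mid X)
  = F(a + bX)\,\bigl[ 1 - F(a + bX) \bigr] .
```

This variance depends on X unless b = 0, so the standard t-values, which presume a constant error variance, are not reliable here.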

Output


Dependent variable:
Y = Y

Characteristics:
Y
  First observation = 1
  Last observation  = 500
  Number of usable observations: 500
  Minimum value: 0.0000000E+000
  Maximum value: 1.0000000E+000
  Sample mean:   6.8000000E-001
  This variable is a zero-one dummy variable. 
  A discrete dependent variable model (Probit/Logit) is more suitable!

X variables:
X(1) = X
X(2) = 1


NLLS model:
Model: y = g[x,b]  + u, where
X(1)=X
X(2)=1
X(3)=b(1)+b(2).X(1)
X(4)=Logit[X(3)]
g[x,b] = X(4), where x = (X(1),..,X(4))' and b = (b(1),b(2))'

-10 <= b(1) <= 10
-10 <= b(2) <= 10

The objective function (RSS) has been minimized using the simplex method
of Nelder and Mead. The algorithm involved is a Visual Basic translation
of the Fortran algorithm involved in:
W.H.Press, B.P.Flannery, S.A.Teukolsky and W.T.Vetterling, 'Numerical
Recipes', Cambridge University Press, 1986, pp. 292-293

Estimation results:
Parameters Estimate   t-value H.C. t-value(*)
                    [p-value]  [H.C. p-value]
b(1)       0.886834     7.972           7.932
                    [0.00000]       [0.00000]
b(2)       0.887818     7.058           7.325
                    [0.00000]       [0.00000]

(*) Based on White's heteroskedasticity consistent variance matrix.
[The two-sided p-values are based on the normal approximation]

RSS:                 93.125207
s.e.                  0.432433
R-square:               0.1441
n:                         500
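As a consistency check on both output listings, the reported s.e. appears to equal the square root of RSS/(n - k), with k the number of parameters. This is inferred from the reported numbers, not from EasyReg documentation, and the function name is hypothetical.

```python
import math

def regression_se(rss, n, k):
    """Standard error of the residuals with a degrees-of-freedom correction."""
    return math.sqrt(rss / (n - k))

se_ces = regression_se(490.212216, 500, 3)    # CES model, 3 parameters
se_logit = regression_se(93.125207, 500, 2)   # Logit model, 2 parameters
```

Both values agree with the reported 0.993148 and 0.432433 to the printed precision.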


This is the end of the guided tour on nonlinear least squares estimation