
The Tobit model is based on the following latent variable model:

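*Y*^{*} = β'*X* + *U*,

where *X* is the vector of regressors, β is the parameter vector, and *U* is an error term which, conditional on *X*, is N(0,σ^{2}) distributed. The latent variable *Y*^{*} is only observed if *Y*^{*} > 0.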
In particular, the actual dependent variable is:

*Y* = max(0,*Y*^{*})

For example, let *Y* be the amount of money that an individual spends on tobacco, given his or her characteristics *X*. Then *Y* > 0 if the individual is a smoker, and
*Y* = 0 if not. The Tobit model is a convenient way of modeling this type of data.

For the technical details of the Tobit model, see TOBIT.PDF. In this guided tour I will mainly focus on how to estimate a Tobit model with EasyReg.

As has been explained in TOBIT.PDF, if you ignore the fact that *Y* comes from a truncated regression model and regress *Y* on *X* using the positive observations on *Y* only, then the OLS estimator of β will be biased, due to the fact that

E[*Y* | *Y* > 0, *X*] = β'*X* + σ*f*(β'*X*/σ)/*F*(β'*X*/σ),

where σ is the standard error of *U*, and *F* and *f* are the distribution function and the density, respectively, of the standard normal distribution.
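The source of the bias can be checked numerically. The sketch below is only an illustration under simplifying assumptions: β'*X* is held fixed at a single value `m`, the function names are hypothetical, and only the Python standard library is used. It compares the simulated mean of the positive observations with the standard truncated-normal mean *m* + σ*f*(*m*/σ)/*F*(*m*/σ).

```python
import random
from statistics import NormalDist, fmean

def truncated_mean(m, sigma=1.0):
    """E[Y | Y > 0] when Y* = m + U with U ~ N(0, sigma^2)."""
    std = NormalDist()  # standard normal: cdf is F, pdf is f
    return m + sigma * std.pdf(m / sigma) / std.cdf(m / sigma)

def simulated_positive_mean(m, sigma=1.0, n=100_000, seed=12345):
    """Average of the positive draws of m + U only (seed is arbitrary)."""
    rng = random.Random(seed)
    draws = (m + rng.gauss(0.0, sigma) for _ in range(n))
    return fmean(y for y in draws if y > 0)

m = 0.5
print("formula   :", round(truncated_mean(m), 4))
print("simulation:", round(simulated_positive_mean(m), 4))
```

Both numbers exceed `m`, and the excess depends on `m` itself. This is why a regression on the positive observations alone does not recover β.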

The appropriate method to estimate the Tobit model is maximum likelihood. See TOBIT.PDF for details.
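For readers who want to see what is being maximized, here is a minimal sketch of the standard Tobit log-likelihood for censoring at zero. It is not EasyReg's code: the function names, the seed and the simulated data are my own assumptions. An observation with *Y* = 0 contributes ln *F*(-β'*X*/σ), and an observation with *Y* > 0 contributes the log of the normal density of *Y*.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution

def tobit_loglik(beta, sigma, xs, ys):
    """Tobit log-likelihood for Y = max(0, beta'X + U), U ~ N(0, sigma^2)."""
    total = 0.0
    for x, y in zip(xs, ys):
        index = sum(b * xi for b, xi in zip(beta, x))  # beta'x
        if y > 0:
            # Uncensored: log of the N(beta'x, sigma^2) density at y
            z = (y - index) / sigma
            total += -0.5 * z * z - 0.5 * math.log(2.0 * math.pi) - math.log(sigma)
        else:
            # Censored: log P[Y* <= 0 | x] = log F(-beta'x / sigma)
            total += math.log(STD_NORMAL.cdf(-index / sigma))
    return total

def simulate(n=500, seed=1):
    """Hypothetical data: Y = max(0, X1 + X2 + U), all standard normal."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x1, x2, u = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        xs.append((x1, x2, 1.0))  # last element is the intercept
        ys.append(max(0.0, x1 + x2 + u))
    return xs, ys

xs, ys = simulate()
ll_true = tobit_loglik((1.0, 1.0, 0.0), 1.0, xs, ys)
ll_zero = tobit_loglik((0.0, 0.0, 0.0), 1.0, xs, ys)
print("log-likelihood at the true parameters:", round(ll_true, 2))
print("log-likelihood at beta = 0           :", round(ll_zero, 2))
```

The log-likelihood at the true parameters is much higher than at β = 0. A Newton iteration, as used by EasyReg, searches for the parameter values at which this function is maximal.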

The data has been generated artificially as follows.
The independent variables *X*_{1,j} and *X*_{2,j}
and the error *U*_{j} for *j* = 1,...,*n* = 500 have been
drawn independently from the standard normal distribution, and *Y*_{j} has been generated as:

*Y*_{j} = max(0,*X*_{1,j} + *X*_{2,j} + *U*_{j}).

Thus, if an intercept is included in the model, so that the vectors of regressors are

*X*_{j} =
(*X*_{1,j},*X*_{2,j},1)',

then the true parameter vector is β = (β_{1},β_{2},β_{3})', with

- β_{1} = 1
- β_{2} = 1
- β_{3} = 0

and the standard error of *U*_{j} is

- σ = 1
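A data set of this type can be generated along the following lines. This is a sketch only: the random number generator and the seed are my assumptions, so it will not reproduce the actual data used in this guided tour, and the function name is hypothetical.

```python
import random

def generate_tobit_sample(n=500, seed=2024):
    """Return n rows (y, x1, x2) with y = max(0, x1 + x2 + u)."""
    rng = random.Random(seed)  # arbitrary seed, for reproducibility only
    rows = []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rng.gauss(0.0, 1.0)
        u = rng.gauss(0.0, 1.0)
        y = max(0.0, x1 + x2 + u)  # censoring at zero
        rows.append((y, x1, x2))
    return rows

sample = generate_tobit_sample()
zero_share = sum(1 for y, _, _ in sample if y == 0.0) / len(sample)
print(f"share of zero values of Y: {zero_share:.3f}")
```

Since *X*_{1,j} + *X*_{2,j} + *U*_{j} is symmetrically distributed around zero, about half of the generated values of *Y* should be zero.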

Now open "Menu > Single equation models > Tobit models" in the EasyReg main window, select the variables Y, X1 and X2, and keep the default intercept, similar to running an OLS regression with intercept, until you arrive at the following window.

In general there is no need to adjust the stopping rules of the Newton iteration which is used to maximize the likelihood function. Thus, click "Tobit analysis". Then after a few seconds the maximum likelihood estimation results appear:

If you click "Continue", the module NEXTMENU will be activated:

You have seen this window before after running an OLS regression, so no further explanation is necessary.

The output is listed below. Note that I have used the option "Wald test of linear parameter restrictions" to test the joint null hypothesis:

- β_{1} = 1
- β_{2} = 1
- β_{3} = 0

    Tobit model: y = y* if y* > 0, y = 0 if y* <= 0, where
    y* = b'x + u with x the vector of regressors, b the parameter vector,
    and u a N(0,s^2) distributed error term.

    Dependent variable:
    Y = Y

    Characteristics:
    Y
      First observation = 1
      Last observation  = 500
      Number of usable observations: 500
      Minimum value: 0.0000000E+000
      Maximum value: 5.4575438E+000
      Sample mean:   7.2127526E-001
      This variable is nonnegative, with 244 zero values.
      A Tobit model is therefore suitable

    X variables:
    X(1) = X1
    X(2) = X2
    X(3) = 1

    Frequency of Y = 0: 48.80%
    (244 out of 500)

    Newton iteration succesfully completed after 5 iterations
    Last absolute parameter change = 0.0001
    Last percentage change of the likelihood = 0.0603

    Tobit model: Y = max(Y*,0), with
    Y* = b(1)X(1) + b(2)X(2) + b(3)X(3) + u,
    where u is distributed N(0,s^2), conditional on the X variables.

    Maximum likelihood estimation results:
    Variable              ML estimates      (t-value)  [p-value]
    x(1)=X1               b(1)=  1.0547731  (17.0084)  [0.00000]
    x(2)=X2               b(2)=  0.9905518  (15.2253)  [0.00000]
    x(3)=1                b(3)= -0.0243418  (-0.3450)  [0.73011]
    standard error of u   s=     1.0635295  (21.9209)  [0.00000]
    [The p-values are two-sided and based on the normal approximation]

    Log likelihood: -4.74065017126E+002
    Pseudo R^2:      0.60984
    Sample size (n): 500

    Information criteria:
    Akaike:       1.912260069
    Hannan-Quinn: 1.925490511
    Schwarz:      1.945976933

    If the model is correctly specified then the maximum likelihood
    parameter estimators b(1),..,b(3), minus their true values, times the
    square root of the sample size n, are (asymptotically) jointly normally
    distributed with zero mean vector and variance matrix:
     1.92290870E+00   6.77554263E-01  -9.38221607E-01
     5.37455447E-01   2.11638376E+00  -9.79444588E-01
    -9.81136382E-01  -1.09217153E+00   2.48931672E+00

    Wald test:
    x(1)=X1               b(1)=  1.0547731  (17.0084)(*)
    x(2)=X2               b(2)=  0.9905518  (15.2253)(*)
    x(3)=1                b(3)= -0.0243418  (-0.3450)(*)
    (*): Parameters to be tested

    Null hypothesis:
    1.x(1)+0.x(2)+0.x(3) = 1.
    0.x(1)+1.x(2)+0.x(3) = 1.
    0.x(1)+0.x(2)+1.x(3) = 0.

    Null hypothesis in matrix form: Rb = c, where
    R =
    1. 0. 0.
    0. 1. 0.
    0. 0. 1.
    and c =
    1.
    1.
    0.

    Wald test statistic: 0.98
    Asymptotic null distribution: Chi-square(3)
    p-value = 0.80630
    Significance levels:  10%     5%
    Critical values:      6.25    7.81
    Conclusions:          accept  accept
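The p-values and critical values in this output can be checked by hand. The sketch below does not recompute the Wald statistic itself; it only takes the reported statistic 0.98 and the reported t-value -0.3450 as given. The function name is hypothetical, and the closed-form expression used for the chi-square distribution function is valid for 3 degrees of freedom only.

```python
import math
from statistics import NormalDist

def chi2_3_sf(x):
    """P[W > x] for W chi-square distributed with 3 degrees of freedom."""
    # Closed form for 3 d.o.f.: cdf(x) = erf(sqrt(x/2)) - sqrt(2x/pi)*exp(-x/2)
    cdf = math.erf(math.sqrt(x / 2.0)) - math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)
    return 1.0 - cdf

# p-value of the reported Wald statistic (rounded to 0.98 in the output)
wald_p = chi2_3_sf(0.98)

# Tail probabilities at the reported 10% and 5% critical values
tail_10 = chi2_3_sf(6.25)
tail_5 = chi2_3_sf(7.81)

# Two-sided p-value of the intercept, based on the normal approximation
intercept_p = 2.0 * (1.0 - NormalDist().cdf(abs(-0.3450)))

print(f"Wald p-value       : {wald_p:.4f}")
print(f"tail prob. at 6.25 : {tail_10:.4f}")
print(f"tail prob. at 7.81 : {tail_5:.4f}")
print(f"intercept p-value  : {intercept_p:.4f}")
```

Since 0.98 is far below both critical values, the joint null hypothesis is not rejected at either significance level, in line with the "accept" conclusions in the output.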

(*) See TOBIT.PDF for the definition of pseudo R-square.
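The three information criteria in the output can be reproduced from the reported log-likelihood. The formulas below are my assumption about what EasyReg computes, namely -2 times the log-likelihood plus a penalty, divided by the sample size, with *k* = 4 estimated parameters (the three coefficients and the standard error of u). The function name is hypothetical.

```python
import math

def information_criteria(loglik, k, n):
    """Return (Akaike, Hannan-Quinn, Schwarz), each divided by n."""
    akaike = (-2.0 * loglik + 2.0 * k) / n
    hannan_quinn = (-2.0 * loglik + 2.0 * k * math.log(math.log(n))) / n
    schwarz = (-2.0 * loglik + k * math.log(n)) / n
    return akaike, hannan_quinn, schwarz

# Values taken from the output: log likelihood and sample size
aic, hq, bic = information_criteria(loglik=-4.74065017126e2, k=4, n=500)
print(f"Akaike:       {aic:.9f}")
print(f"Hannan-Quinn: {hq:.9f}")
print(f"Schwarz:      {bic:.9f}")
```

Under these assumptions the computed values agree with the reported ones up to rounding.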

As an example of a case for which EasyReg refuses to conduct Tobit analysis, select the variables Z, X1, X2 and the constant 1 for the intercept, and declare Z the dependent variable. Then you will get stuck here:

The problem is that Z is discrete, because I have generated it as

Z = Int(100*Y)

where the "Int" function truncates its argument to an integer, by cutting off all the digits after the decimal symbol (a dot "." in the US, a comma "," in Europe). But the Tobit model assumes that Z has a continuous distribution, conditional on Z > 0 and X1 and X2, so that the assumptions of the Tobit model do not hold. Therefore, in order to prevent you from doing bad econometrics, EasyReg will not allow you to continue.
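To see why Z is discrete, here is a small sketch of the same transformation in Python. The helper name `int_100` is hypothetical, and the input values are merely illustrative. For nonnegative arguments Python's `int()` cuts off the digits after the decimal point, like the "Int" function described above.

```python
def int_100(y):
    """Z = Int(100*Y): truncate 100*y to an integer (y assumed nonnegative)."""
    return int(100 * y)

values = [0.0, 0.7212, 5.4575438]
print([int_100(v) for v in values])  # prints [0, 72, 545]
```

Whatever the value of Y, the result is an integer, so that conditional on Z > 0 the distribution of Z cannot be continuous.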

In view of the queries I have gotten about this issue, the message in this window may not be clear enough. If so, click the "Yes" button, which opens a PDF file:

However, the same explanation, and more, can be found in TOBIT.PDF.

If the observed dependent variable *Y* is confined to an interval (*a*,*b*], where
-∞ < *a* < *b* < ∞, with
*P*[*Y* = *b*] > 0, then you can transform *Y* to a new dependent variable
*Z*, say, such that *Z* ∈ [0,∞) and
*P*[*Z* = 0] = *P*[*Y* = *b*] > 0, namely *Z* =
-ln[(*Y* - *a*)/(*b* - *a*)]. The Tobit model for *Z* is then *Z* = max(0,*Z*^{*}), where
*Z*^{*} = β'*X* + *U*.
Then

*Y* = *a* + (*b* - *a*)exp(-*Z*) = *a* + (*b* - *a*)exp(-max(0, β'*X* + *U*)).

To create this variable *Z*,
open Menu > Input > Transform variables, and conduct the following transformations:

1. Click the "Constant = 1" button. Then a new variable "1" is created, which has the value 1 for all observations.
2. Click the "Linear combination of variables" button, select "1" and use the value of *a* as coefficient. Then a new variable with name "*a*x1" is created, which has the value *a* for all observations. I will assume that you have renamed the variable "*a*x1" as variable *A*.
3. Click the "Linear combination of variables" button, select "1" and use the value of *b* as coefficient. Then a new variable with name "*b*x1" is created, which has the value *b* for all observations. I will assume that you have renamed the variable "*b*x1" as variable *B*.
4. Click the "Linear combination of variables" button, select the variables *Y* and *A*, and create the linear combination *Y* - *A*. I will assume that you have renamed *Y* - *A* as *YminA*. Note that now *YminA* ∈ (0, *b* - *a*].
5. Click the "Linear combination of variables" button, select the variables *B* and *A*, and create the linear combination *B* - *A*. I will assume that you have renamed *B* - *A* as *BminA*. Note that *BminA* is a constant with value *b* - *a* for all observations.
6. Click the "Multiplicative transformation of variables" button, select the variables *YminA* and *BminA* and use the powers 1 and -1, respectively, to create the new variable "*YminA*x*BminA*^-1". I will assume that you have renamed this new variable as *YminA*/*BminA*. Note that *YminA*/*BminA* ∈ (0,1].
7. Click the "LOG transformation: x -> ln(x)" button, and select the variable *YminA*/*BminA*. Then the new variable LN[*YminA*/*BminA*] will be created. Note that LN[*YminA*/*BminA*] ∈ (-∞,0].
8. Click the "Linear combination of variables" button, select the variable LN[*YminA*/*BminA*], and use the coefficient -1 to create the variable -LN[*YminA*/*BminA*]. I will assume that you have renamed this variable as *Z*. Thus, *Z* = -LN[*YminA*/*BminA*]. Now *Z* ∈ [0,∞), and *P*[*Z* = 0] = *P*[*Y* = *b*] > 0.
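Outside EasyReg, the same chain of transformations can be sketched in a few lines of Python. The function names and the example values *a* = 2, *b* = 10 are hypothetical, and the comments refer to the corresponding steps above.

```python
import math

def to_tobit_scale(y, a, b):
    """Map y in (a, b] to z = -ln[(y - a)/(b - a)] in [0, infinity)."""
    if not (a < y <= b):
        raise ValueError("y must lie in the interval (a, b]")
    y_min_a = y - a              # YminA, in (0, b - a]
    b_min_a = b - a              # BminA, a constant
    ratio = y_min_a / b_min_a    # YminA/BminA, in (0, 1]
    log_ratio = math.log(ratio)  # LN[YminA/BminA], in (-infinity, 0]
    return -log_ratio            # Z, in [0, infinity)

def from_tobit_scale(z, a, b):
    """Inverse transformation: y = a + (b - a)*exp(-z)."""
    return a + (b - a) * math.exp(-z)

a, b = 2.0, 10.0  # hypothetical bounds
for y in (3.0, 6.0, 10.0):
    print(y, "->", round(to_tobit_scale(y, a, b), 4))
```

Observations at the upper bound *b* are mapped to zero, and the closer *Y* is to *a*, the larger *Z* becomes.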

The new variable *Z* in step 8 can now be used as dependent variable in a Tobit model. However, keep in mind
that in this case a negative coefficient of an X variable implies a positive effect on the original dependent
variable *Y*, because
∂*Z*/∂*Y* = -1/(*Y* - *a*) < 0, hence ∂*Y*/∂*Z* < 0.

Although needless to say (but I will say it anyhow), if *a* = 0 and *b* = 1 then you can skip the
steps 1 to 6, and use *Y* instead of *YminA*/*BminA* in step 7.

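If *Y* ∈ [*a*,*b*), where
-∞ < *a* < *b* < ∞, with
*P*[*Y* = *a*] > 0, then let *Z* =
-ln[(*b* - *Y*)/(*b* - *a*)], so that *Z* ∈ [0,∞) and *P*[*Z* = 0] = *P*[*Y* = *a*] > 0. This variable *Z* can be created similarly to the previous steps 1 to 8, and can be used as the new
dependent variable in a Tobit model. Since now
∂*Y*/∂*Z* > 0, a positive coefficient of an X variable implies a positive effect on *Y*.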
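A short sketch of this second transformation, *Z* = -ln[(*b* - *Y*)/(*b* - *a*)], again with a hypothetical function name and hypothetical bounds *a* = 2 and *b* = 10:

```python
import math

def to_tobit_scale_lower(y, a, b):
    """Map y in [a, b) to z = -ln[(b - y)/(b - a)] in [0, infinity)."""
    if not (a <= y < b):
        raise ValueError("y must lie in the interval [a, b)")
    return -math.log((b - y) / (b - a))

a, b = 2.0, 10.0  # hypothetical bounds
for y in (2.0, 6.0, 9.0):
    print(y, "->", round(to_tobit_scale_lower(y, a, b), 4))
```

Here observations at the lower bound *a* are mapped to zero and *Z* increases with *Y*, so that the signs of the coefficients keep their usual interpretation.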

Note that now we model the conditional distribution of *Y* by

*Y* = *b* - (*b* - *a*)exp(-max(0, β'*X* + *U*)).

This case cannot be handled by standard Tobit analysis.