Guided tour on user-defined maximum likelihood

In this guided tour I will explain how to conduct user-defined maximum likelihood analysis, on the basis of the data and model in

• Bierens and Carvalho (2011), "Job Search, Conditional Treatment and Recidivism: The Employment Services for Ex-Offenders Program Reconsidered", The B.E. Journal of Economic Analysis & Policy 11, Issue 1 (Topics), Article 5.
The cross-section data involved are available as MLDATA.TXT, in EasyReg space-delimited text format.

The ESEO program

During the period 1980-1985, the National Institute of Justice sponsored a controlled experiment to evaluate the impact of reemployment programs for recently released prisoners. Three well-established programs were chosen, in Boston, San Diego and Chicago, to participate in the Employment Services for Ex-Offenders Program, henceforth ESEO. A total of 2,045 prisoners who voluntarily agreed to participate were randomly assigned to either a treatment group (G=1) or a control group (G=0). Those in the first group received, besides the normal services (orientation, screening, evaluation, support services, job development seminar, and job search coaching), special services which consisted of an assignment to a follow-up specialist who provided support during the job search and the 180 days following job placement. The control group received only normal services. The inclusion of special services was a major response to the increasing belief that some past employment programs had failed because ex-inmates lost contact with their original programs.

Model variables

The dependent variables involved are:

• T_s: the duration of the job search in months, i.e., the time between the date of release and the date of placement in the first job after release;
• T_c: recidivism time in months, i.e., the time between the date of release and the date of the first arrest after release.
However, both durations are interval-censored: we only observe the events I(T_s in (0,1]), I(T_s in (1,6]), I(T_s in (6,12]), I(T_s > 12), I(T_c in (0,1]), I(T_c in (1,6]), I(T_c in (6,12]), I(T_c > 12), where I(.) is the indicator function: I(true) = 1, I(false) = 0. In the data set these indicators are combined in six dummy variables:
• I(T_s in (0,1]) x I(T_c in (1,6])
• I(T_s in (0,1]) x I(T_c in (6,12])
• I(T_s in (0,1]) x I(T_c > 12)
• I(T_s in (1,6]) x I(T_c in (6,12])
• I(T_s in (1,6]) x I(T_c > 12)
• I(T_s in (6,12]) x I(T_c > 12)
because this is the way these indicators enter the log-likelihood function.
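To make the construction of these six dummy variables concrete, here is a minimal Python sketch. It assumes hypothetical raw durations t_s and t_c in months; the actual data set MLDATA.TXT contains only the combined dummies, and the function names below are mine, not part of EasyReg.

```python
# Upper edges (in months) of the intervals (0,1], (1,6], (6,12]; beyond 12 is open-ended.
UPPER_EDGES = [1.0, 6.0, 12.0]

def interval_index(t):
    """Return 0 for (0,1], 1 for (1,6], 2 for (6,12], 3 for t > 12."""
    if t <= 0:
        raise ValueError("durations must be positive")
    for k, upper in enumerate(UPPER_EDGES):
        if t <= upper:
            return k
    return 3

# (job search interval, recidivism interval) pairs, in the order of the list above.
PAIRS = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

def combined_dummies(t_s, t_c):
    """Return the six product indicators I(T_s in .) x I(T_c in .) as 0/1 values."""
    s, c = interval_index(t_s), interval_index(t_c)
    return [int((s, c) == pair) for pair in PAIRS]

# Hypothetical example: job found after 4 months, no arrest within 12 months.
print(combined_dummies(4.0, 20.0))
```

In this sketch, any combination of intervals that is not among the six listed pairs yields six zeros.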

The covariates are

• G = 1 if selected in the treatment group, G = 0 if selected in the control group.
• AGE = age in years.
• CHICAGO = 1 for ex-inmates in the ESEO program in Chicago, CHICAGO = 0 elsewhere.
• SANDIEGO = 1 for ex-inmates in the ESEO program in San Diego, SANDIEGO = 0 elsewhere.
• BOSTON = 1 for ex-inmates in the ESEO program in Boston, BOSTON = 0 elsewhere.

The number of observations is 503.

Coding and estimating the model via EasyReg

The model involved and its EasyReg code are explained in detail in ML.PDF. Open "Menu > User-defined nonlinear models > Maximum likelihood", and select the variables in the above order. Then the first relevant EasyReg window is

Click "Continue":

This is the list of initial variables. EasyReg has automatically added the constant 1 to the list.

In maximum likelihood, there is no sharp distinction between dependent variables and covariates. Therefore, all the variables are labeled by Z(.).

Now build up the EasyReg code according to the steps in ML.PDF, similar to the coding of a nonlinear regression model. The number of code lines, including the initial variables, is 75, and therefore it takes too much space and effort to explain every step in detail. The EasyReg code can be viewed here.

It is advisable to store the code in a template. Thus, click "Store". Then the model will be stored in template TEMPLATE_ML.001 in the EASYREG.DAT folder.

Click "Model is O.K.".

You need to indicate the lower and upper bounds of the parameters. The bounds for b(1) are specified above, and for the other parameters below:

Click "Bounds O.K.".

You get this window because the model involves integrals that have to be computed numerically. To speed up the computation, I have initially reduced the number of grid points to 100:

Thus, choose m = 100.
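The trade-off between the number of grid points and the accuracy of a numerical integral can be illustrated with a small Python sketch. It uses the trapezoid rule on a toy integrand with a known integral; both the integration rule and the integrand are assumptions for illustration only, and are not taken from EasyReg or from the model in ML.PDF.

```python
import math

def trapezoid(f, a, b, m):
    """Approximate the integral of f over [a, b] using m equally spaced grid points."""
    h = (b - a) / (m - 1)
    total = 0.5 * (f(a) + f(b))
    for i in range(1, m - 1):
        total += f(a + i * h)
    return total * h

def toy_integrand(x):
    # Hypothetical stand-in for the integrands in the likelihood function.
    return math.exp(-x)

exact = 1.0 - math.exp(-5.0)  # integral of exp(-x) over [0, 5]
error_100 = abs(trapezoid(toy_integrand, 0.0, 5.0, 100) - exact)
error_500 = abs(trapezoid(toy_integrand, 0.0, 5.0, 500) - exact)
print("m = 100:", error_100)
print("m = 500:", error_500)
```

The coarser grid needs one fifth of the function evaluations, at the price of a larger approximation error.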

The initial start values have been drawn randomly. However, here I will follow the same procedure as in Bierens and Carvalho (2011) and use start values based on preliminary estimation results, namely

• b(1) = 0.884122
• b(2) = -1.228695
• b(3) = -0.379488
• b(4) = 0.179888
• b(5) = 0.041529
• b(6) = 0.5
• b(7) = -0.040221
• b(8) = 1.707907

Click "Start". I have restarted the simplex iteration until the parameters no longer changed (which took about half an hour).
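The idea of restarting the simplex iteration until the parameters stop changing can be sketched in Python as follows. This is a simplified textbook variant of the Nelder-Mead simplex method applied to a toy objective, not EasyReg's implementation and not the actual log-likelihood; the function names, the tolerance, and the step size are all assumptions.

```python
def nelder_mead(f, x0, step=0.5, max_iter=200):
    """Minimize f from x0 with a simplified Nelder-Mead simplex search."""
    n = len(x0)
    # Initial simplex: x0 plus one vertex shifted along each coordinate.
    simplex = [list(x0)]
    for i in range(n):
        vertex = list(x0)
        vertex[i] += step
        simplex.append(vertex)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]
        reflect = [c + (c - w) for c, w in zip(centroid, worst)]
        if f(reflect) < f(best):
            expand = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            simplex[-1] = expand if f(expand) < f(reflect) else reflect
        elif f(reflect) < f(simplex[-2]):
            simplex[-1] = reflect
        else:
            contract = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if f(contract) < f(worst):
                simplex[-1] = contract
            else:
                # Shrink all vertices halfway toward the best vertex.
                simplex = [best] + [
                    [b + 0.5 * (vi - b) for b, vi in zip(best, v)]
                    for v in simplex[1:]
                ]
    simplex.sort(key=f)
    return simplex[0]

def restart_until_stable(f, x0, tol=1e-6, max_restarts=50):
    """Rerun the simplex search from its own solution until the parameters settle."""
    x = list(x0)
    for run in range(1, max_restarts + 1):
        x_new = nelder_mead(f, x)
        change = max(abs(a - b) for a, b in zip(x_new, x))
        x = x_new
        if change < tol:
            return x, run
    return x, max_restarts

def toy_objective(b):
    # Hypothetical stand-in for minus the log-likelihood; minimum at (1, -2).
    return (b[0] - 1.0) ** 2 + 10.0 * (b[1] + 2.0) ** 2

solution, runs = restart_until_stable(toy_objective, [0.0, 0.0])
print(solution, runs)
```

Each restart builds a fresh simplex around the previous solution, which helps the search escape a simplex that has collapsed prematurely.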

However, this may not be the optimal solution, because the computation of the integrals was done with only 100 grid points. Therefore, I will use this solution as the new start values. Also, we can now narrow down the parameter bounds. Thus, copy the solution, and click "Redo model/bounds". Then TEMPLATE_ML.001 will be loaded, and EasyReg will ask for the parameter bounds. We can now choose these bounds much narrower. For example, choose:

and

Click "Bounds O.K."
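One possible way to derive narrower bounds from a previous solution is sketched below in Python. The rule used here (a fixed half-width around each estimate, clipped to the old bounds) is a hypothetical illustration and not necessarily how the bounds in this guided tour were chosen; the numbers in the example are made up.

```python
def narrow_bounds(solution, old_bounds, half_width=0.5):
    """Return bounds of width 2 * half_width around each estimate, within the old bounds."""
    new_bounds = []
    for value, (lower, upper) in zip(solution, old_bounds):
        new_lower = max(lower, value - half_width)
        new_upper = min(upper, value + half_width)
        new_bounds.append((new_lower, new_upper))
    return new_bounds

# Hypothetical example with two parameters.
print(narrow_bounds([1.0, -1.25], [(-5.0, 5.0), (-1.5, 5.0)]))
```

Clipping to the old bounds ensures that the narrowed interval never extends beyond the region that was searched in the first run.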

Now choose the default m = 500.

Paste the previous solution in the start values box, and click "Start".

At this point I have accepted the solution. Click "Done".

Click "Continue". Then the estimation results appear:

These results can be translated into a more readable form as:

These results differ slightly from the results in Bierens and Carvalho (2011). The reason is that the results in that paper were obtained by running the computer overnight, using the "Auto restart until manual interruption" option. Indeed, the log-likelihood value in the paper is slightly higher than in this guided tour.

The output of the last iteration run can be viewed here.