This guided tour contains mathematical formulas and Greek symbols and is therefore best viewed with Internet Explorer, Opera or Google Chrome, because other web browsers (in particular Firefox) may not display the "Symbol" fonts involved. For example, "b" should be displayed as the Greek letter "beta" rather than the Roman "b". If it is not, reload this guided tour in Internet Explorer, Opera or Google Chrome, or make one of them your default web browser.
SMINK stands for: Sample Moments Integrating Normal Kernel. The SMINK density estimator is a variant of the normal kernel density estimator with the following additional properties:
The SMINK regression function estimator is derived from the SMINK estimator of the joint density of the dependent and independent variables, together with the SMINK estimator of the marginal density of the independent variables, in the same way as for the original densities.
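As a rough illustration of this ratio-of-densities construction, the sketch below uses an ordinary normal product kernel rather than the SMINK kernel itself. With that substitution, integrating y against the estimated joint density and dividing by the estimated marginal density of the independent variables reduces to a kernel-weighted average of the observed y values. The function name `kernel_regression`, the scalar bandwidth `h` and the simulated data are assumptions made for illustration only; they are not part of SMINKREG.

```python
import numpy as np

def kernel_regression(x_eval, X, y, h):
    """Normal-kernel regression estimate at the rows of x_eval.

    X: (n, k) array of regressors, y: (n,) responses, h: scalar bandwidth.
    This is a plain kernel estimator, not the SMINK estimator.
    """
    X = np.asarray(X, dtype=float)
    x_eval = np.atleast_2d(x_eval)
    # Squared distances between each evaluation point and each observation.
    d2 = ((x_eval[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-0.5 * d2 / h**2)  # unnormalized normal kernel weights
    # Numerator plays the role of the integral of y times the joint density;
    # the denominator plays the role of the marginal density of x.
    return (w @ y) / w.sum(axis=1)

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 2))
y = X[:, 0] ** 2 + rng.standard_normal(500)  # hypothetical nonlinear design
fit = kernel_regression(np.array([[0.0, 0.0], [1.0, 0.0]]), X, y, h=0.4)
print(fit)
```

The normalizing constants of the kernel cancel in the ratio, which is why the weights can be left unnormalized.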
The basic properties of SMINK density and regression estimators are summarized here. This summary is also provided by the SMINK regression estimation module (SMINKREG) itself. Before you use the SMINK regression module, please read at least the summary, but I strongly recommend reading the original paper as well:
Bierens, H.J. (1983): "Sample Moments Integrating Normal Kernel Estimators of Density and Regression Functions", Sankhya 45, Series B, 160-192.
In order to demonstrate how SMINK regression works, I have generated n = 500 independent standard normally distributed random variables X1, X2, and U, and combined them into a dependent variable
This module does not allow you to select more than two explanatory variables, because only univariate and bivariate regression functions can be plotted.
I will demonstrate the bivariate case first.
Open "Menu > Single equation models > Bierens' nonparametric SMINK regression", and select Y, X1 and X2 as the data in the usual way, with Y the dependent variable, and X1 and X2 the independent variables. Then the first SMINK regression window is:
The SMINK regression procedure requires the specification of two window width parameters: a window width γ2,n for the SMINK estimator of the joint density of (X1,X2), and a window width γ1,n for the SMINK regression estimator. Both have to be contained in the interval
with k the number of X variables (k = 2 in our case), and
Click "'alpha' OK". Then the window changes to:
If you leave the option "Optimize gamma" checked and click "Continue" then
will be optimized by grid search over the interval
Choose the number of grid points, and click "Grid OK". Then after a few minutes the following window appears.
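The idea of optimizing a window width by grid search can be sketched as follows. The tour does not state which criterion the module minimizes, so this sketch assumes leave-one-out cross-validation of the squared prediction error, and it uses a generic normal kernel bandwidth `h` in place of the SMINK window width. The grid bounds, the function names and the simulated data are hypothetical.

```python
import numpy as np

def loo_cv_score(X, y, h):
    """Leave-one-out mean squared prediction error for bandwidth h."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-0.5 * d2 / h**2)
    np.fill_diagonal(w, 0.0)  # each point is predicted without itself
    pred = (w @ y) / w.sum(axis=1)
    return np.mean((y - pred) ** 2)

def grid_search(X, y, lo, hi, n_grid):
    """Evaluate the criterion on an equally spaced grid and keep the minimizer."""
    grid = np.linspace(lo, hi, n_grid)
    scores = np.array([loo_cv_score(X, y, h) for h in grid])
    return grid[scores.argmin()], scores

rng = np.random.default_rng(1)
X = rng.standard_normal((500, 2))
# Hypothetical nonlinear design, not the one used in this tour.
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.standard_normal(500)
best_h, scores = grid_search(X, y, lo=0.1, hi=2.0, n_grid=20)
print(best_h)
```

More grid points give a finer search at a proportionally higher computing cost, which is the trade-off behind the choice of the number of grid points.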
Since X1 and X2 are normally distributed, the optimal SMINK density estimator is the maximum likelihood estimator, which corresponds to γ2,n = 1. But the regression function is nonlinear, so that γ1,n should be less than 1.
If a nonparametric regression estimator is computed for values of the X variables for which the density is close to zero, the estimate will be unreliable. Therefore, the plot range of the X variables should not be too wide. Since X1 and X2 are normally distributed, I have chosen the plot range [-2,2] for both X1 and X2.
The number of grid points sets the resolution of the 3-dimensional plot in the direction of the X variable involved. The default value 29 usually gives the best picture.
Once the plot range and grid points have been specified, the plot data is computed, which takes a few minutes in this case, and when done the module PLOT3DIM is activated. This module opens with a blank picture window. Once you click the "Start" button, the picture is displayed:
Note that at the corners of the plot area
In this example the plot area can easily be determined from the design, but in general you do not know the actual distribution of the X variables. In that case I recommend opening "Menu > Data analysis > Summary statistics", selecting the X variables involved, and then using the 10% and 90% quantile values as lower and upper bounds of the plot range. In our case we have

      10% quantile   90% quantile
X1    -1.36670       1.30098
X2    -1.17452       1.32714
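Outside the module, the same quantile-based plot range can be computed directly. The sketch below uses freshly simulated standard normal draws as a stand-in for X1 and X2, so its numbers will differ from the values reported above; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
x1 = rng.standard_normal(500)  # simulated stand-in for X1
x2 = rng.standard_normal(500)  # simulated stand-in for X2

# 10% and 90% sample quantiles as lower and upper plot bounds.
lo1, hi1 = np.quantile(x1, [0.10, 0.90])
lo2, hi2 = np.quantile(x2, [0.10, 0.90])

# Share of observations that fall inside the resulting range for X1.
inside1 = np.mean((x1 >= lo1) & (x1 <= hi1))
print(f"X1 range: [{lo1:.5f}, {hi1:.5f}], share inside: {inside1:.3f}")
print(f"X2 range: [{lo2:.5f}, {hi2:.5f}]")
```

By construction about 80% of the observations of each variable lie inside its range, so the plot stays within the region where the data are reasonably dense.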
If we choose these quantiles as the plot range, the result indeed looks much better:
Finally, just as a warning, let me show you what happens in the latter case if you do not adjust the plot range, but just accept the minimum and maximum values:
Close to the borders of the plot area there is hardly any data to support the SMINK regression function estimator, which therefore yields spurious results. This problem is not specific to SMINK regression, but applies to nonparametric kernel regression in general.
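The lack of support near the borders can be made concrete by summing the kernel weights at a point, which gives a rough count of the observations that effectively contribute to an estimate there. The sketch below again uses a generic normal kernel with a hypothetical bandwidth `h` and simulated data, not the SMINK kernel or the data of this tour.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((500, 2))  # simulated stand-in for (X1, X2)
h = 0.4                            # hypothetical bandwidth

def kernel_mass(point, X, h):
    """Sum of normal kernel weights at `point`: a rough count of the
    observations that effectively support an estimate there."""
    d2 = ((X - point) ** 2).sum(axis=1)
    return np.exp(-0.5 * d2 / h**2).sum()

center = np.zeros(2)
corner = X.max(axis=0)  # upper corner of the minimum/maximum plot range
mass_center = kernel_mass(center, X, h)
mass_corner = kernel_mass(corner, X, h)
print(f"center: {mass_center:.1f}, corner: {mass_corner:.3g}")
```

At the center many observations carry weight, whereas at the corner of the minimum/maximum range the total weight is close to zero, so the ratio defining the regression estimate is driven by a handful of distant points.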
Select Y and X1 as the data in the usual way, with Y the dependent variable and X1 the independent variable.
Now proceed in the same way as before, i.e., choose
You now have the option to compare the SMINK regression curve with the linear regression line, but I will not choose this option. Then the plot result is:
Recall that the true regression function is