Guided tour on kernel density estimation

The basic properties of kernel density estimators are summarized in the PDF file KDENSITY.PDF. Please read that file before you use the kernel density estimation module (KDENSITY).

Note that nonparametric kernel density estimation is an advanced feature of EasyReg International. If you are a novice econometrician, you should not use it.

Data

The data for this guided tour come in two versions: a space-delimited EasyReg data file, DENSITYDATA.TXT, and an Excel file in CSV format, DENSITYDATA.CSV. Each contains 1000 independent observations on two independent random variables, X and Y. This data set should therefore be declared as cross-section data.

The random variables X and Y are generated as mixtures of independent N(-2,1) and N(2,1) distributed random variables U and V, respectively, with mixture probability 0.5. That is, let D be a dummy variable which is independent of U and V and satisfies P[D = 1] = P[D = 0] = 1/2. Then

X = D*U + (1-D)*V,

and similarly for Y, with its own independent draws of D, U and V. Consequently, the common marginal density of X and Y is bimodal, with modes at (approximately) -2 and 2.
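The construction above can be sketched in Python with NumPy. This is only an illustration of the mixture design, not the code that produced DENSITYDATA.TXT; the function names and the seed are my own choices.

```python
import numpy as np

def simulate_mixture(n, rng):
    """Draw n observations of D*U + (1-D)*V with U ~ N(-2,1), V ~ N(2,1), P[D=1] = 1/2."""
    d = rng.integers(0, 2, size=n)       # dummy D in {0, 1}, independent of U and V
    u = rng.normal(-2.0, 1.0, size=n)    # U ~ N(-2, 1)
    v = rng.normal(2.0, 1.0, size=n)     # V ~ N(2, 1)
    return d * u + (1 - d) * v

def mixture_density(x):
    """Marginal density implied by the design: 0.5*phi(x+2) + 0.5*phi(x-2)."""
    x = np.asarray(x, dtype=float)
    phi = lambda z: np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return 0.5 * phi(x + 2.0) + 0.5 * phi(x - 2.0)

rng = np.random.default_rng(12345)       # arbitrary seed, for reproducibility only
x = simulate_mixture(1000, rng)          # X and Y use separate, independent draws
y = simulate_mixture(1000, rng)

print("share of X above zero:", np.mean(x > 0))
print("density at -2, 0, 2:", mixture_density([-2.0, 0.0, 2.0]))
```

The density is much lower at 0 than at -2 and 2, which is the bimodal shape the kernel estimator is expected to recover.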

Kernel estimation of the density of Y

Open Menu > Data analysis > Normal kernel density estimation, and select the variable Y in the usual way, without using a subsample. Then the following window appears.

KDENSITY regression Window 1

The default display option is to compare the kernel density estimator with the estimated normal density, but that comparison makes no sense here because the true density is bimodal. Therefore, I have set the display option to "None of the above". The default range for the plot of the kernel density estimator of Y runs from the minimum to the maximum value of Y. I have set this range to [-5,5].

The next step is to select the constant c of the bandwidth. The default value is c = 1, which, however, is optimal only if the true density is normal. I recommend adopting the default value 1 in the first instance.

KDENSITY regression Window 2
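To see what the constant c does, here is a minimal sketch of a normal-kernel density estimator in Python. The bandwidth rule h = c * 1.06 * s * n^(-1/5) (with s the sample standard deviation) is the usual normal-reference rule and is an assumption on my part; the exact formula used by KDENSITY is documented in KDENSITY.PDF and may differ. The data are simulated stand-ins for Y, not the contents of DENSITYDATA.TXT.

```python
import numpy as np

def normal_kde(data, grid, c=1.0):
    """Normal-kernel density estimate on grid; returns (density values, bandwidth)."""
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = data.size
    s = data.std(ddof=1)
    h = c * 1.06 * s * n ** (-0.2)           # assumed rule, not taken from KDENSITY.PDF
    z = (grid[:, None] - data[None, :]) / h  # standardized distances, shape (grid, n)
    dens = np.exp(-0.5 * z**2).sum(axis=1) / (n * h * np.sqrt(2.0 * np.pi))
    return dens, h

# Simulated stand-in for Y (same mixture design as the tour's data)
rng = np.random.default_rng(0)
d = rng.integers(0, 2, size=1000)
y = d * rng.normal(-2.0, 1.0, 1000) + (1 - d) * rng.normal(2.0, 1.0, 1000)

grid = np.linspace(-5.0, 5.0, 201)           # plot range [-5,5]
for c in (0.5, 1.0, 2.0):
    dens, h = normal_kde(y, grid, c)
    print(f"c = {c}: h = {h:.3f}, peak height = {dens.max():.3f}")
```

A smaller c gives a narrower bandwidth and a rougher estimate, while a larger c smooths more and flattens the peaks.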

Click "c OK" and then "Options OK". The kernel density estimate then appears.

KDENSITY regression Window 3

The result is as expected: the estimated density of Y is bimodal, with modes near -2 and 2.

Kernel estimation of the joint density of X and Y

Next, select X and Y without using a subsample. Then the following window appears.

KDENSITY regression Window 4

I have adjusted the plot range to [-5,5] x [-5,5], and again chosen the default value 1 for c:

KDENSITY regression Window 5

Once you click "c OK" and then "Options OK", the bivariate kernel density data are calculated, and module PLOT3DIM is activated:

KDENSITY regression Window 6

Click "Start". Then the kernel density plot appears:

KDENSITY regression Window 7

Since X and Y are independent, their joint density f(x,y) is the product of their marginal densities, so f(x,y) now has four modes, at (-2,-2), (-2,2), (2,-2) and (2,2). The kernel estimator of f(x,y) captures this shape quite well.
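The four-mode shape can also be checked numerically. The sketch below uses a product normal kernel with bandwidths h = c * s * n^(-1/6) per variable (a Scott-type rule). Both choices are my assumptions and are not necessarily what KDENSITY computes. The data are again simulated stand-ins for X and Y.

```python
import numpy as np

def normal_kde_2d(x, y, gx, gy, c=1.0):
    """Product normal-kernel estimate of f; entry [i, j] estimates f(gx[i], gy[j])."""
    n = x.size
    hx = c * x.std(ddof=1) * n ** (-1.0 / 6.0)   # assumed Scott-type bandwidths
    hy = c * y.std(ddof=1) * n ** (-1.0 / 6.0)
    kx = np.exp(-0.5 * ((gx[:, None] - x[None, :]) / hx) ** 2) / (hx * np.sqrt(2.0 * np.pi))
    ky = np.exp(-0.5 * ((gy[:, None] - y[None, :]) / hy) ** 2) / (hy * np.sqrt(2.0 * np.pi))
    return kx @ ky.T / n                         # average of kernel products over observations

def draw(n, rng):
    """One mixture variable: D*U + (1-D)*V with U ~ N(-2,1), V ~ N(2,1)."""
    d = rng.integers(0, 2, size=n)
    return d * rng.normal(-2.0, 1.0, n) + (1 - d) * rng.normal(2.0, 1.0, n)

rng = np.random.default_rng(2024)                # arbitrary seed
x, y = draw(1000, rng), draw(1000, rng)          # independent X and Y

gx = gy = np.linspace(-5.0, 5.0, 101)            # plot range [-5,5] x [-5,5]
f_hat = normal_kde_2d(x, y, gx, gy)

def at(a, b):
    """Estimated density at the grid point closest to (a, b)."""
    return f_hat[np.argmin(np.abs(gx - a)), np.argmin(np.abs(gy - b))]

for point in [(-2, -2), (-2, 2), (2, -2), (2, 2), (0, 0), (2, 0)]:
    print(point, round(float(at(*point)), 4))
```

The estimate should be visibly higher at the four points (-2,-2), (-2,2), (2,-2) and (2,2) than at the origin or at a point such as (2,0) between two modes.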

This is the end of the guided tour on kernel density estimation.