Guided tour on importing Excel files in CSV format

Excel window

What you see here is a typical example of an Excel worksheet, in a format that can be converted to an EasyReg data file. The first row contains text (the variable names), and the next rows contain the data matrix. The data matrix should only contain numbers! Non-numerical data are not allowed from the second row onwards!

The file EXCELDATA.XLS displayed here contains cross-section data. To convert this file to CSV (Comma Separated Values) text format, open (in Excel) File > Save as .., and save the file as a CSV text file. In this example the Excel worksheet will be saved as file EXCELDATA.CSV.

USA style CSV format

In the USA and some other countries, for example the United Kingdom, Windows uses a dot (.) as decimal delimiter. For example, 0.5 = 1/2. In these countries, the data entries in a CSV file are separated by commas (,).

If a variable name contains a comma, Excel will automatically enclose it between quotation marks ("...") when the file is saved as a CSV file. Older versions of Excel do that too if the variable name contain blanks. Do not add quotation marks around variable names yourself in Excel, because then EasyReg cannot read the CSV file! There is no need to put quotation marks around variable names: Excel will do that itself if necessary. But if you insist to do it yourself, do it only after importing the CSV file in Notepad or Wordpad.

If there are empty cells in the data matrix, EasyReg will interpret them as missing values, otherwise you have to fill them with numbers. Empty cells in the first row containing the variable names will be converted to a question mark (?). The first variable name should be present, as otherwise EasyReg cannot determine the size of the data matrix.

To see how a CSV file looks like, and to check whether the file is suitable for EasyReg, import the file EXCELDATA.CSV into Notepad or Wordpad, and set the view option to "No wrap". The first record should contain all the variable names, as shown here:

Wordpad window 1

Scroll down to check whether the file contain text below the first record. If so, the file is not suitable for EasyReg!.

You may find rows containing two or more adjacent commas (,,). These pairs of commas represent empty cells. Also, a comma as last character in a row represents a missing value. EasyReg will fill them with a missing value code.

European style CSV format

In continental Europe and some countries elsewhere Windows uses a comma (,) as decimal delimiter. For example, 0,5 = 1/2. In these countries, the data entries in a CSV file are separated by semi-colons (;). If you live in a country where a comma is used as decimal delimiter the CSV file will look like this:

Wordpad window 2

The commas separating the data entries are replaced by semi-colons (;), and the decimal dots, if any, are replaced by decimal commas. EasyReg will check automatically whether a comma is used as decimal delimiter, and interpret the CSV file accordingly.

Importing a CSV file in EasyReg

In the EasyReg main window, open "File > Get data > Choose an Excel data file in CSV format". Then the following window appears.

DATACSV window 1

Click "Continue". Then the following window appears.

DATACSV window 2

If you know where the CSV file can be found, leave the default option "Manual" checked, click "Option OK", and then select the drive and folder were the CSV file is located.

If you check one of the other two search options and click "Option OK", the current drive or all your drives will be scanned for CSV files. The results will be listed in the list box of frame "Files found".

Now suppose that you have saved the CSV file as d:\EXCELDATA.CSV. Since you know the location of the file, leave the option "Manual" checked, click "Option OK", and change the drive to d:

DATACSV window 3

If you check "Double click to view file via Notepad (or Wordpad if too big)", and double click "d:\EXCELDATA.CSV", EasyReg will be minimized, and the CSV file will be imported in Notepad or Wordpad. However, let us continue, by double clicking "d:\EXCELDATA.CSV". Then the following window appears.

DATACSV window 4

You have to inform EasyReg what the type of the data is. Check "Cross-section data". Then a text box appears in which you can enter any information about the data. This information will be displayed if you view the current input file (via File > Current data > Show current input file).

DATACSV window 5

After entering text (if any), click "Continue". Then the following window appears.

DATACSV window 6

If the CSV file contains missing values, EasyReg needs to replace these missing values with a missing value code. This missing value code should be such that none of the actual data entries are equal to this missing value code, even after transformations. Do not enter zero as missing value code, because missing value code zero indicates no missing values! It is strongly recommended to adopt the default missing value code (-99999.99), even if you know that there are no missing values. Thus, click "Code OK". Then EasyReg will read and check the CSV file, and if the file is OK the following window appears.

DATACSV window 7

EasyReg has now analyzed the CSV file, and listed the variable names and the first 10 observations on these variables. Click "Continue". Then the following window appears.

DATACSV window 8

The data has now been converted and copied to two random access files, INPUT1.RAN and INPUT2.RAN. This window enables you to check whether the conversion has been done correctly, by comparing data entries in the converted data file with the corresponding data entries in the CSV file.

As a double-check you may wish to import the CSV file in Excel again, via the "Open Excel" button, and compare the converted data with the data in the CSV file. Of course, this only works if you have installed Microsoft Excel. Note that EasyReg works directly with the CSV text file rather than with Excel itself. Therefore, having Microsoft Excel installed on your computer is handy but not essential.

After checking the data, click "Continue". Then the following window appears.

DATACSV window 9

After reading this information about the folder EASYREG.DAT, click "Continue". Then the following window appears.

DATACSV window 10

We are going to start up EasyReg in a new folder, d:\Test, as follows. Change the drive to d:, enter the folder name Test, and click "OK", or hit the enter key. (Note that the text on the "OK" button is underlined. This indicates that this button doubles with the enter key.) Then the following window appears.

DATACSV window 11

Now click "Start EASYREG in current folder". Then the EasyReg main window appears, and you will be ready to use this data.

Trouble shooting

Importing a European style CSV file in EasyReg if Windows uses the US number setting

If you import a European style CSV file in EasyReg while Windows uses the US number setting you will end up with the following window:

DATACSV window 12

However, if you click "Try again, with a semi-colon (;) as data entry separator", EasyReg will jump back to the previous window, and then import the file correctly. The same applies if the CSV file uses a semi-colon (;) as data entry separator but a dot (.) as decimal delimiter (some versions of Excel save CSV files in this way).

Importing a US style CSV file in EasyReg if Windows uses the European number setting

Similarly, if you import a US style CSV file in EasyReg while Windows uses the European number setting you will end up with the following window:

DATACSV window 13

Again, if you click "Try again, with a comma (,) as data entry separator", EasyReg will jump back to the previous window, and then import the file correctly.

Dates

Most problems with importing CSV files in EasyReg are due to the representation of dates. For example, consider the following data file of daily stock returns:

Daily returns 1

In Wordpad the file looks like this:

Daily returns 2

If you import this file in EasyReg, you will get stuck:

Daily returns 3

The reason of course is the slash / in the date. Recall that only numerical data are allowed beyond record 1, but the date 1/3/2001 is not a number.

The problem is easy to fix: In Wordpad, replace "/" with "," and "Date" with "Month,Day,Year":

Daily returns 4

The file can now be imported in EasyReg without problems.

EasyReg time series data are either annual, quarterly, monthly or other. Other time series are time series with another frequency than annual, quartely or monthly time series. These time series contain two or more observations per year, for example bi-annual data. The option of daily and weekly data is not available. To import daily data, you need to trick EasyReg by declaring the time series as annual, starting from "year" 1, for example:

Daily returns 5

Finally, under US number setting, quarterly dates should be represented numerically in the form "year.q", where "q" is the one-digit quarter number. For example, the quarters 1 through 4 of year 2001 should be represented by 2001.1, 2001.2, 2001.3, 2001.4, respectively. Monthly dates should be represented by "year.mm", where "mm" is the month number, in two digits. For example, months 1 through 12 of year 2001 should be represented by 2001.01, 2001.02, ...., 2001.12, respectively. Of course, under European number setting the decimal dot in the dates should be replaced with a decimal comma.

This is the end of the guided tour on importing Excel files in CSV format

Back to the guided tour on how to import data.