Importing Excel Data into SQL Server
Recently I had to import a huge load of surveys gathered in Excel sheets to SQL Server for further processing and analysis. Don’t ask why they were not entered on the first place to a database via some interface…
Importing an Excel sheet to SQL Server seems like a trivial task. Here is an example:
SELECT * INTO BenchmarkSurvey
FROM OPENDATASOURCE('Microsoft.Jet.OLEDB.4.0',
'Data Source=C:Benchmark SurveyReceivedSurvey1009.xls;
Extended Properties="Excel 8.0"')...[Sheet1$];
On a first look seemed the data was imported OK and I could start cleaning and processing. However, making a few sample tests to match responses from selected Excel surveys proved that a lot of the answers in Excel are imported as NULLs. Digging deeper in to the source data showed that those spreadsheet cells were merged and had mixed numbers and text values. Apparently that was causing the data to be imported as NULLs.
One solution that came to mind was to write a macro to parse all Excel files and to change the merged cells data to text. But that just didn’t seem right.
Sure enough, there was a more elegant solution. Looking closer at the OLE DB provider properties I found the IMEX property (short for Import/Export Mode). Changing the IMEX value tells the provider how to convert/format the data to fit the column type. Possible values for IMEX are 0, 1, and 2. In my case setting IMEX to 1 did the trick (IMEX=1 tells the provider to read those mixed data columns as text). Here is the final statement that imported the data from all merged/mixed cells correctly:
SELECT * INTO BenchmarkSurvey
FROM OPENDATASOURCE('Microsoft.Jet.OLEDB.4.0',
'Data Source=C:Benchmark SurveyReceivedSurvey1009.xls;
Extended Properties="Excel 8.0;IMEX=1"')...[Sheet1$];
Another interesting setting is HDR. Settings HDR=Yes indicates that the first row contains column names, not data.