13

I will use the following sql to read data from excel, but sometimes I need to skip first several rows. e.g the real data begins from line 5, so I need to skip the first 4 rows, is that doable?

 SELECT *
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0;HDR=YES;Database=c:\daniel\test.xls',
    'SELECT * FROM [sheet1$]');
Daniel Wu
  • 5,853
  • 12
  • 42
  • 93

2 Answers2

19

Use a range [sheet1$A5:Z] instead of the entire sheet [sheet1$]

SELECT *
FROM OPENROWSET(
    'Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0;HDR=YES;Database=c:\daniel\test.xls',
    'SELECT * FROM [sheet1$A5:Z]'
);
Anon
  • 10,660
  • 1
  • 29
  • 31
  • Can this be done to exclude the last five rows where the total number of rows will change each time? My Excel spreadsheet has footer text that always starts one row after the last result row and goes for five rows. But the number of rows will always be different. Currently if you try to import the footer text breaks the import because it does conform to the data types etc. – TravisPUK Jan 26 '17 at 12:28
5

This will number the rows being obtained, with no specific order (as luck would have it):

SELECT *
FROM (
  SELECT *, ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS rownum
  FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0;HDR=YES;Database=c:\daniel\test.xls',
    'SELECT * FROM [sheet1$]')
) s
WHERE rownum > 4;

You may want to specify some order, if you see fit, by changing the rownum definition like this:

ROW_NUMBER() OVER (ORDER BY specific_column_list) AS rownum
Andriy M
  • 76,112
  • 17
  • 94
  • 154
  • that will slow down the ETL process to do row_number(). And I am not sure whether sql server can load the data in parallel, if so the rownum can't reflex the position in the file, such as thread 1 load the first 100 rows, thread 2 load the second 100 rows, then it's possible the second 100 rows were loaded first into the database – Daniel Wu Feb 08 '11 at 23:39
  • @Daniel Wu: So what have we got? With an ordered ROW_NUMBER the process is very likely to slow down, yes. And the unordered one is very likely to assign row numbers differently from the actual order of rows in the excel sheet. And so my answer becomes almost completely rubbish in this situation. Sorry, seems like I've run out of options for you for the time being. If only this: could you possibly number the rows in the original excel document? That is, add a column with values representing the actual row numbers, so you could filter on it in SQL. – Andriy M Feb 09 '11 at 00:05