0

I have a csv file that I am trying to import using BULK INSERT. The problem is that there is a field in the file that will be quoted (with double quotes) if a comma exists within the text (not quoted if no comma exists). The existence of the extra comma is causing SQL Server to throw errors because of an incorrect number of columns during the insert.

Here is a sample data set:

928 Riata Dr,Magnolia,TX,77354,4/15/2014
22 Roberts Ave.,McKinney,TX,75069,4/15/2014
"5531 Trinity Place, #22",San Antonio,TX,78212,4/15/2014

As you can see, the third row contains a comma within the address field, thus the address field is quoted. Since the BULK INSERT command is throwing errors because of this, I'm assuming I will need to scrub the file contents before attempting to load it.

Unless someone has a better solution

To scrub the file contents I will need to open the file (with SQL), read in the contents, and do a conditional replacement of the internal comma (found within the quotes). Since that comma doesn't really need to exist, I can just replace it with '' (blank).

Then, I can handle the quotes separately after the data gets imported with an update statement to replace any other characters I don't want.

I think the logic is sound, the problem is the syntax. I can't seem to find any syntax related to REGEX in SQL Server (Booo Microsoft). Which means I would need some other way to determine if the comma appears within quotes, and replace it if so.

Any thoughts, Suggestions, Code, etc.?

Thanks in advance.

sadmicrowave
  • 39,964
  • 34
  • 108
  • 180

1 Answers1

0

This sounds too simple on the face of it, but if you can just replace the commas, can you open the csv in, say, Excel or OpenOffice Calc, and then do a find replace (commas with nothing)? I just tried with a csv of mine and it worked fine. The csv remains properly delimited.

Maybe I am missing something that prevents this, such as Excel opening this with extra cells due to the comma, in which case my answer is stupid. But it would make more sense to handle this in a spreadsheet app rather than after opening with SQL.

You may have to try delimiting with something other than commas, such as tabs or etc. I've had to do this with SQL imports before. In many cases you can save as a tab delimited txt file and upload to SQL.

Note that using Excel for this type of thing can be its own problem. For help with Excel and tab delimited SQL imports, see my answer here.

Community
  • 1
  • 1
JackArbiter
  • 5,705
  • 2
  • 27
  • 34
  • thanks for your quick reply. I am trying to do this programmatically for the purposes of automation. This will be a daily task that I do not want to execute manually. Your solution would work; however, I do not want to manually touch each file everyday. – sadmicrowave Aug 29 '14 at 18:54
  • Ah, I see, I totally understand. The most recent answer on a post like this in SO is that there is no answer other than, for instance, tab delimited importing, etc. http://stackoverflow.com/questions/21226107/sql-server-bulk-insert-csv-with-data-having-comma because SQL does not directly support CSV files :( That said, there may be some other answer out there. If for some reason down the line you decide to implement some .Net solution to this, I have had recent success with LumenWorks CSV reader. Other than that, I'm not sure what help I can be. – JackArbiter Aug 29 '14 at 19:08