28

I'm using BULK INSERT to import a CSV file. One of the columns in the CSV file contains some values that contain fractions (e.g. 1m½f).

I don't need to do any mathematical operations on the fractions, as the values will just be used for display purposes, so I have set the column as nvarchar. The BULK INSERT works but when I view the records within SQL the fraction has been replaced with a cent symbol (¢) so the displayed text is 1m¢f.

I'm interested to understand why this is happening and any thoughts on how to resolve the issue. The BULK INSERT command is:

BULK INSERT dbo.temp FROM 'C:\Temp\file.csv' 
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n' );
Aaron Bertrand
  • 272,866
  • 37
  • 466
  • 490
MrMatt
  • 305
  • 1
  • 4
  • 6
  • Does the data literally contain the one-half character? – Melanie Dec 21 '12 at 20:44
  • Yes, it contains the ½ character as opposed to 1/2. – MrMatt Dec 21 '12 at 20:50
  • @AaronBertrand, the Bulk Insert is: `code` BULK INSERT dbo.temp FROM 'C:\Temp\file.csv' WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n' ); The Collation is Latin1_General_CI_AS. How do I check if the file is marked as Unicode? – MrMatt Dec 21 '12 at 21:06
  • You can check if the file is Unicode by, for example, opening it in Notepad and then going to File > Save As. Depending on the version of Windows you should see an "Encoding" drop-down somewhere with options like ANSI, UTF-8, Unicode, Unicode big endian... – Aaron Bertrand Dec 21 '12 at 21:11
  • @AaronBertrand, thanks - it shows ANSI. – MrMatt Dec 21 '12 at 21:15

1 Answers1

53

You need to BULK INSERT using the CODEPAGE = 'ACP', which converts string data from Windows codepage 1252 to SQL Server codepage.

BULK INSERT dbo.temp FROM 'C:\Temp\file.csv' 
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', CODEPAGE = 'ACP');

If you are bringing in UTF-8 data on a new enough version of SQL Server:

[...] , CODEPAGE = '65001');

You may also need to specify DATAFILETYPE = 'char|native|widechar|widenative'.

Aaron Bertrand
  • 272,866
  • 37
  • 466
  • 490
  • This was also what I needed to correctly bulk insert ansii text with accented characters, such as è, in it. – monty Oct 16 '17 at 01:18
  • 2
    Thank you @aaron bertrand. It worked partially for me. I am facing issues while inserting accented characters. For example: ä is shown as ä – user1417294 May 02 '18 at 13:02
  • 1
    @user1417294 What is the encoding of the file, the data type and collation of the column, and the collation of the database and instance? – Aaron Bertrand May 02 '18 at 13:25