0

Its regarding 'Bulk Insert' sql command where I having problem regarding some French Accents. The 'à' or 'a acute' is showing up as n++ in database table after I insert them from a text file. And this is the only character that is having problem with current conversion. I tried to modify the command using 'WITH' clause and assigning value such as 'RAW' or '1252' or 'ACP' but none of them were able to retain this particular character. The database I am using is Sql Server 2008. Any help is greatly appreciated. Thanks.

FYI, I aint using any encode or decode technique. Just trying to put whatever on text file back into database.


Thanks for the response. However, changing the column type to 'nvarchar' from 'varchar' didnt actually solve the issue. Its now appearing as '�' in the database table column. Any idea people ?? Thanks a lot in advance

Robert Harvey
  • 178,213
  • 47
  • 333
  • 501
Sanjay
  • 289
  • 2
  • 5
  • 14
  • There's a character encoding problem going on here - ensure you're using UTF-8. – m.edmondson Jun 14 '12 at 21:03
  • What datatype is the column that you're inserting the data into? It'll need to be nvarchar or ntext (although that's depreciated) in order to store unicode data. – Bridge Jun 14 '12 at 21:05
  • 2
    There isn't any way you can store the french accent characters then. You'll have to change the column to nvarchar if you want to store them. – Bridge Jun 14 '12 at 21:12
  • Hi, Thanks for the response. However, changing the column type to 'nvarchar' from 'varchar' didnt actually solve the issue. Its now appearing as '∩┐╜' in the database table column. Any idea. – Sanjay Jul 16 '12 at 21:58

2 Answers2

1

As Bridge says, you'll need NVARCHAR. You'll also need to convert the file to UTF-16, as BULK INSERT does not play nice with UTF-8 even when you're using NVARCHAR.

See this question for more information.

There are several questions here regarding converting file encodings, but for the record this worked for me:

public static void  ConvertUtf8ToUtf16(string path, int bufferSize = 4096)
    {
    string  tempPath = path + ".tmp";

    char[]  buffer = new char[bufferSize];
    using (var input = File.OpenText(path))
    using (var output = new StreamWriter(tempPath, false, Encoding.Unicode))
        {
        int  index = 0;
        int  chars;
        while ((chars = input.Read(buffer, index, bufferSize)) > 0)
            output.Write(buffer, 0, chars);
        }

    // Swap the files
    File.Delete(path);
    File.Move(tempPath, path);
    }
Community
  • 1
  • 1
  • Thanks for the response. However, changing the column type to 'nvarchar' from 'varchar' didnt actually solve the issue. Its now appearing as '∩┐╜' in the database table column. Any idea – Sanjay Jul 17 '12 at 13:52
  • That's not the whole solution. What is the character encoding of your file? `BULK INSERT` is touchy. –  Jul 17 '12 at 19:08
0

Change the column type to nvarchar.

Jirka Hanika
  • 13,301
  • 3
  • 46
  • 75
  • Hi, Thanks for the response. However, changing the column type to 'nvarchar' from 'varchar' didnt actually solve the issue. Its now appearing as '∩┐╜' in the database table column. Any idea. – Sanjay Jul 16 '12 at 21:58
  • Addional information : I am inserting those French characters from a text file to database table. I am using Bulk Insert to do that. thanks – Sanjay Jul 16 '12 at 22:00
  • @Sanjay - that bulk insert file is not in UTF-16 and that is your problem. – Jirka Hanika Jul 17 '12 at 16:51
  • So that means I need to convert the text file to UTF-16 before performing bulk insert, rite ?? – Sanjay Jul 17 '12 at 19:14
  • byte[] utf8Bytes = Encoding.UTF8.GetBytes("this is test string"); byte[] unicodeBytes = Encoding.Convert(Encoding.UTF8, Encoding.Unicode, utf8Bytes); Encoding.Unicode.GetString(unicodeBytes); \n\nThis is what I am doing before the bulk insert. Output of Encoding.Unicode.GetString(params) are being used to write another text file. Still getting some garbage output. Could anyone tell me if I am doing anything wrong while converting to UTF16 ... thanks – Sanjay Jul 23 '12 at 21:49