
I have a string-to-date conversion problem using SqlBulkCopy in ASP.NET 3.5 with C#.

I read a large CSV file (with a CSV reader). One of the strings read should be loaded into a SQL Server 2008 Date column.

If the text file contains, for example, the string '2010-12-31', SqlBulkCopy loads it into the Date column without any problems.

However, if the string is '20101231', I get an error:
    The given value of type String from the data source cannot be converted to type date of the specified target column.

The file contains 80 million records, so I cannot create a DataTable...

The SqlBulkCopy ColumnMappings etc. are all OK. Changing the column to DateTime does not help either.

I tried

SET DATEFORMAT ymd;

But that does not help.

Any ideas how to tell SQL Server to accept this format? Otherwise I will create a custom fix in the CSV reader, but I would prefer something on the SQL side.

Update: Following up on the two answers, I am using SqlBulkCopy like this (as proposed in another question on Stack Overflow):

The CSV reader (see the link above on CodeProject) returns string values (not strongly typed). The CsvReader implements System.Data.IDataReader, so I can do something like this:

    using (CsvReader reader = new CsvReader(path))
    using (SqlBulkCopy bcp = new SqlBulkCopy(CONNECTION_STRING))
    {
        bcp.DestinationTableName = "SomeTable";
        // column mappings
        bcp.WriteToServer(reader);
    }
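
The elided mappings are plain source-to-destination name pairs, for example (the CSV column name here is invented for illustration):

    bcp.ColumnMappings.Add("csvDateColumn", "somedatefield");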

All the fields coming from the IDataReader are strings, so I cannot use the C# approach unless I change quite a bit in the CSV reader.

My question is therefore not about how to fix it in C#; I can do that, but I want to avoid it.

It is strange, because if you run something like this in SQL

    UPDATE SomeTable SET [somedatefield] = '20101231'

it also works, just not with bulk copy.

Any idea why?

Thanks for any advice, Pleun


3 Answers


Older issue, but wanted to add an alternative approach.

I had the same issue: SqlBulkCopy does not allow DataType/culture specifications for columns when streaming from an IDataReader.

To avoid the speed overhead of constructing DataRows locally, and instead have the parsing occur on the target, a simple method I used was to temporarily set the thread culture to the culture that defines the format in use, in this case for US-format dates.

For my problem, en-US dates in the input (in PowerShell):

    [System.Threading.Thread]::CurrentThread.CurrentCulture = 'en-US'
    # <call SqlBulkCopy>

For your problem you could do the same, but since the date format is not culture-specific, create a modified default culture object (untested):

    CultureInfo newCulture = (CultureInfo)System.Threading.Thread.CurrentThread.CurrentCulture.Clone();
    newCulture.DateTimeFormat.ShortDatePattern = "yyyyMMdd";
    Thread.CurrentThread.CurrentCulture = newCulture;
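
Putting that together, an equally untested C# sketch (bcp and reader are the objects from the question's snippet; the try/finally restore is my addition):

    // requires: using System.Globalization; using System.Threading;
    CultureInfo original = Thread.CurrentThread.CurrentCulture;
    CultureInfo newCulture = (CultureInfo)original.Clone();
    newCulture.DateTimeFormat.ShortDatePattern = "yyyyMMdd";
    Thread.CurrentThread.CurrentCulture = newCulture;
    try
    {
        bcp.WriteToServer(reader); // conversion now uses the yyyyMMdd pattern
    }
    finally
    {
        Thread.CurrentThread.CurrentCulture = original; // restore the real culture
    }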

I found that allowing the database server to perform the type conversions, once the values have gotten through the SqlBulkCopy interface, is considerably faster than parsing locally, particularly in a scripting language.

Rob Knights

If you can handle it in C# itself, then this code will help you get the date in the string as a DateTime object, which you can pass on directly:

    // dateString is the string read from the CSV
    DateTime thedate = DateTime.ParseExact(dateString, "yyyyMMdd", null);

If you want it formatted as a string then:

    string thedate = DateTime.ParseExact(dateString, "yyyyMMdd", null).ToString("yyyy-MM-dd");
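
If some rows might contain malformed dates, a non-throwing variant of the same idea (my addition, using the standard DateTime.TryParseExact overload) would be:

    // requires: using System.Globalization;
    DateTime thedate;
    bool ok = DateTime.TryParseExact(dateString, "yyyyMMdd",
        CultureInfo.InvariantCulture, DateTimeStyles.None, out thedate);
    // ok is false for any row whose date string does not match the pattern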

Good luck.

Update

In your scenario I don't know why the date is not converted automatically, but from C# you need to step in and interfere in the process of passing the data to the WriteToServer() method. The best you can do, I think (keeping performance in mind), is to buffer DataRow items and pass them to the WriteToServer() method in batches. I will just write the sample code in a minute...

    // A sample; polish it before implementation.
    // Assumes 'bcp' and 'reader' from the question's snippet, plus a DataTable
    // whose columns match the target table.
    DataTable table = new DataTable(); // define columns to match "SomeTable"
    while (reader.Read())
    {
        DataRow row = table.NewRow();
        // Copy the reader's fields into the row here,
        // formatting the date column properly (e.g. with DateTime.ParseExact).
        table.Rows.Add(row);
        // We take rows in a buffer of 50 records
        if (table.Rows.Count == 50)
        {
            bcp.WriteToServer(table); // write the buffered rows
            table.Clear();
        }
    }
    if (table.Rows.Count > 0)
        bcp.WriteToServer(table); // flush the final partial batch

It's not the full code, but I think you can work it out...

Shekhar_Pro
  • Thanks for your help, but as shown above I cannot do it in C# without a major update to the CSV reader. – Pleun Jan 20 '11 at 22:28
  • Now that's weird behavior, but as @Jon said, you need to bite the bullet and get into the process of passing the data to the WriteToServer method. – Shekhar_Pro Jan 21 '11 at 03:35

It's not entirely clear how you're using SqlBulkCopy, but ideally you shouldn't be uploading the data to SQL Server in string format at all: parse it to a DateTime or DateTimeOffset in your CSV reader (or on the output of your CSV reader), and upload it that way. Then you don't need to worry about string formats.
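
To make that concrete, here is a rough sketch of the "on the output of your CSV reader" idea: a decorator around the CSV reader's IDataReader that parses one column from 'yyyyMMdd' strings into DateTime values before SqlBulkCopy sees them. This is an illustration, not code from the answer; the class name and the dateOrdinal parameter are invented, and it assumes the date column is never null or empty.

    using System;
    using System.Data;
    using System.Globalization;

    // Hypothetical decorator: wraps the CSV reader and converts the configured
    // column from "yyyyMMdd" strings to DateTime, so SqlBulkCopy streams typed values.
    public sealed class DateParsingReader : IDataReader
    {
        private readonly IDataReader _inner;
        private readonly int _dateOrdinal; // ordinal of the date column in the CSV

        public DateParsingReader(IDataReader inner, int dateOrdinal)
        {
            _inner = inner;
            _dateOrdinal = dateOrdinal;
        }

        // SqlBulkCopy pulls each field through GetValue; intercept the date column.
        public object GetValue(int i)
        {
            if (i == _dateOrdinal)
                return DateTime.ParseExact(_inner.GetString(i), "yyyyMMdd",
                                           CultureInfo.InvariantCulture);
            return _inner.GetValue(i);
        }

        public int GetValues(object[] values)
        {
            int n = _inner.GetValues(values);
            if (_dateOrdinal < n && values[_dateOrdinal] is string)
                values[_dateOrdinal] = DateTime.ParseExact((string)values[_dateOrdinal],
                    "yyyyMMdd", CultureInfo.InvariantCulture);
            return n;
        }

        // Report the converted type for the date column.
        public Type GetFieldType(int i) { return i == _dateOrdinal ? typeof(DateTime) : _inner.GetFieldType(i); }

        // Everything else simply delegates to the wrapped reader.
        public bool Read() { return _inner.Read(); }
        public int FieldCount { get { return _inner.FieldCount; } }
        public string GetName(int i) { return _inner.GetName(i); }
        public int GetOrdinal(string name) { return _inner.GetOrdinal(name); }
        public object this[int i] { get { return GetValue(i); } }
        public object this[string name] { get { return GetValue(GetOrdinal(name)); } }
        public bool IsDBNull(int i) { return _inner.IsDBNull(i); }
        public int Depth { get { return _inner.Depth; } }
        public bool IsClosed { get { return _inner.IsClosed; } }
        public int RecordsAffected { get { return _inner.RecordsAffected; } }
        public bool NextResult() { return _inner.NextResult(); }
        public DataTable GetSchemaTable() { return _inner.GetSchemaTable(); }
        public void Close() { _inner.Close(); }
        public void Dispose() { _inner.Dispose(); }
        public bool GetBoolean(int i) { return _inner.GetBoolean(i); }
        public byte GetByte(int i) { return _inner.GetByte(i); }
        public long GetBytes(int i, long fo, byte[] buf, int bo, int len) { return _inner.GetBytes(i, fo, buf, bo, len); }
        public char GetChar(int i) { return _inner.GetChar(i); }
        public long GetChars(int i, long fo, char[] buf, int bo, int len) { return _inner.GetChars(i, fo, buf, bo, len); }
        public IDataReader GetData(int i) { return _inner.GetData(i); }
        public string GetDataTypeName(int i) { return _inner.GetDataTypeName(i); }
        public DateTime GetDateTime(int i) { return _inner.GetDateTime(i); }
        public decimal GetDecimal(int i) { return _inner.GetDecimal(i); }
        public double GetDouble(int i) { return _inner.GetDouble(i); }
        public float GetFloat(int i) { return _inner.GetFloat(i); }
        public Guid GetGuid(int i) { return _inner.GetGuid(i); }
        public short GetInt16(int i) { return _inner.GetInt16(i); }
        public int GetInt32(int i) { return _inner.GetInt32(i); }
        public long GetInt64(int i) { return _inner.GetInt64(i); }
        public string GetString(int i) { return _inner.GetString(i); }
    }

Usage would then be bcp.WriteToServer(new DateParsingReader(reader, dateOrdinal)), with the rest of the question's snippet unchanged.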

Jon Skeet
  • I have updated the question with a short explanation on how I use SqlBulkCopy. I know it is not ideal but the CSV reader only returns strings (and with 80 million rows it would be nice if SQL Server would accept them directly). It is strange, because if you do a sql update set [somedatefield] = '20101231' it also works, just not with bulkcopy. – Pleun Jan 20 '11 at 22:31
  • 3
    @Pleun: I dare say there's a way you can mangle it - but if I were you I would bite the bullet and perform the conversion yourself. It's simply a nicer solution to have each value in the appropriate type - and it means you can perform validation early, at the client level. – Jon Skeet Jan 20 '11 at 22:45
  • 1
    Thanks to both of you - I will proceed as you suggested and bite the bullet. – Pleun Jan 21 '11 at 09:26
  • ha ha.. and ya bite it hard ;-) – Shekhar_Pro Jan 21 '11 at 11:47