3

I'm building a system that reads 5 CSV files each month. These files are supposed to follow a certain format and ordering. I have one master table and 5 temporary tables. Each CSV file is read first and then bulk inserted into its corresponding temporary table. After bulk inserting the 5 csv files into their respective temporary tables I once again insert all the records from the temporary table to the master table. This makes sure that all files are uploaded first before inserting the data to the master table.

I built this system using ASP.net and during debugging and testing everything went fine. The problem occurs whenever I deploy the application to a production server. After I deployed the application I used the same csv files I uploaded during development and testing and the system shows a data conversion error from string to date time format.

I tried many things to fix this but it seems the problem still persist. I tried changing the collation of the production database to the same one I used during development. I also tried changing some regional settings in the production server but it still doesn't work.

I thought maybe I can handle this programmatically and instead of bulk inserting from the temporary tables to the master table I would write some kind of a for loop that would insert each record manually to the master table, but then I suppose it would create a performance issue since I'll be inserting around 100,000 records each time.

I wonder if anyone has faced a similar issue during deployment. It still seems weird to me that the behaviour of the application changed after deployment.

following is a portion of the code where it uploads the inventory.csv file to the server and then bulk inserting the csv into a temporary table TB_TEMP_INVENTORY then inserting the records from temp to the master table TB_CATTLE. this is done to 4 other files and is almost identical to this.

        OleDbConnection conn = new     OleDbConnection(ConfigurationManager.AppSettings["LivestockConnectionString"]);
        OleDbCommand comm;
        OleDbDataAdapter adapter;
        DataTable table = new DataTable();
        string file = string.Empty;
        string content = string.Empty;
        StreamReader reader;
        StreamWriter writer;
        string month = monthDropDownList.SelectedValue;
        string year = yearDropDownList.SelectedItem.Text;

        // upload inventory file
        file = System.IO.Path.GetFileName(inventoryFileUpload.PostedFile.FileName);
        inventoryFileUpload.PostedFile.SaveAs("C://LivestockCSV//" + file);

        // clean inventory file
        file = "C://LivestockCSV//" + file;
        reader = new StreamReader(file);
        content = reader.ReadToEnd();
        reader.Close();
        writer = new StreamWriter(file);
        writer.Write(content.Replace("\"", ""));        // remove quotation
        writer.Close();
        writer = new StreamWriter(file);
        writer.Write(content.Replace(",NULL,", ",,"));  // remove NULL
        writer.Close();
        writer = new StreamWriter(file);
        writer.Write(content.Replace(",0,", ",,"));     // remove 0 dates
        writer.Close();
        writer = new StreamWriter(file);
        writer.Write(content.Replace(",0", ","));       // remove 0 dates at eol
        writer.Close();

        try
        {
            conn.Open();
            comm = new OleDbCommand("TRUNCATE TABLE TB_TEMP_INVENTORY", conn);    // clear temp table
            comm.ExecuteNonQuery();

            // bulk insert from csv to temp table
            comm = new OleDbCommand(@"SET DATEFORMAT DMY;
                                    BULK INSERT TB_TEMP_INVENTORY
                                    FROM '" + file + "'" +
                                    @" WITH
                                    (
                                        FIELDTERMINATOR = ',',
                                        ROWTERMINATOR = '\n'
                                    )", conn);
            comm.ExecuteNonQuery();

            // check if data for same month exists in cattle table
            comm = new OleDbCommand(@"SELECT *
                                    FROM TB_CATTLE
                                    WHERE Report='Inventory' AND Month=" + month + " AND Year=" + year, conn);

            if (comm.ExecuteScalar() != null)
            {
                comm = new OleDbCommand(@"DELETE
                                        FROM TB_CATTLE
                                        WHERE Report='Inventory' AND Month=" + month + " AND Year=" + year, conn);
                comm.ExecuteNonQuery();
            }


            // insert into master cattle table
            comm = new OleDbCommand(@"SET DATEFORMAT MDY;
                                    INSERT INTO TB_CATTLE(ID, Sex, BirthDate, FirstCalveDate, CurrentUnit, OriginalCost, AccumulatedDepreciation, WrittenDownValue, NetRealizableValue, CapitalGainLoss, Month, Year, Report, Locked, UploadedBy, UploadedAt)
                                    SELECT DISTINCT ID, Sex, BirthDate, FirstCalveDate, CurrentUnit, 0, 0, 0, 0, 0, " + month + ", " + year + @", 'Inventory', 0, 'Admin', '" + DateTime.Now + @"'
                                    FROM TB_TEMP_INVENTORY", conn);
            comm.ExecuteNonQuery();
            conn.Close();
        }
        catch (Exception ex)
        {
            ClientScript.RegisterStartupScript(typeof(string), "key", "<script>alert('" + ex.Message + "');</script>");
            return;
        }
moeabdol
  • 4,779
  • 6
  • 44
  • 43
  • 2
    Your environment has changed and so the behaviour of you application changed. What is the SQL error - try supplying some sample data i.e. an example of the time field that worked in dev/test but failed in production. – Shaun Wilde Sep 03 '11 at 07:38
  • excellent post @Siva . SSIS is defiantly a tool i want to learn. i went through the post and created a small test project and the insertion went on smoothly. is there a way to include the functionality of SSIS in my already finished application. is there a way i can instantiate or call an ssis object to trigger reading the csvs into my database. thanks for pointing me to this – moeabdol Sep 03 '11 at 20:40
  • @moeabdol: I think is better to encapsulate all SQL code into a stored procedure with two parameters `CREATE PROCEDURE pImportLiveStock (@pMonth TINYINT, @pYear SMALLINT) AS TRUNCATE TABLE TB_TEMP_INVENTORY;...`. – Bogdan Sahlean Sep 05 '11 at 15:45

2 Answers2

2

You don't specify how you are doing the insert, but a reasonable option here would be something like SqlBulkCopy, which can take either a DataTable or an IDataReader as input; this would give you ample opportunity to massage the data - either in-memory (DataTable), or via the streaming API (IDataReader), while still using an efficient import. CsvReader is a good option for loading the CSV.

The other option is to use a very basic insert into the staging table, and massage the data via TSQL code.

Re why has it changed between dev/production; the most likely answers are:

  • the data you used in dev was not representative
  • there is an environmental/configuration difference between the two
Marc Gravell
  • 1,026,079
  • 266
  • 2,566
  • 2,900
  • i'm using oledb provider from within my application and it doesn't support bulk insert – moeabdol Sep 03 '11 at 09:22
  • @moeabdol if you are talking to sql server from .NET, why not use SqlClient? – Marc Gravell Sep 03 '11 at 10:12
  • the oledb provider is more generic when it comes to back-end databases. if we changed the database from sql server to oracle or mysql anytime in the future oledb will handle them without any changes to the code. on the other hand, if u use sqlclient then you are tied to use ms sql server only – moeabdol Sep 03 '11 at 10:16
  • 1
    @moeabdol I suppose you've missed the recent announcements - you might consider deprecation when thinking oledb is a good, generic, long-term choice: http://blogs.msdn.com/b/sqlnativeclient/archive/2011/08/29/microsoft-is-aligning-with-odbc-for-native-relational-data-access.aspx Also see the "deprecated MDAC/WDAC components" section here: http://msdn.microsoft.com/en-us/library/ms810810.aspx – Aaron Bertrand Sep 03 '11 at 12:56
  • 1
    @moeabdol unless you actively test for those platforms, chances are it already won't be a trivial change - IMO it is a false concern. – Marc Gravell Sep 03 '11 at 13:01
  • @Marc Gravell so what you mean is that the code is correct but i need to fix an environment problem? – moeabdol Sep 03 '11 at 13:48
  • @moeabdol I can only offer suggestions, since I have neither your code nor your environment nor your data – Marc Gravell Sep 03 '11 at 15:23
  • @Marc Gravell theoretically the system is complete and practically the code/sql does the job during development. its when i move the code into production that it starts showing the conversion error. i'm guessing as todda.speot.is has mentioned its a locale setting maybe. i try to think of other scenarios but i fail. thanks a lot the code is on the original post above – moeabdol Sep 03 '11 at 16:26
2

1) Check SQL Server LANGUAGE and DATEFORMAT settings for dev/testing & production env.:

DBCC USEROPTIONS

2) What date format is used in CSV files (source) ?

3) What data type is used for date/time field (destination) ?

DECLARE @v VARCHAR(10) = '2010-08-23';

SET DATEFORMAT mdy;
SELECT CAST(@v AS DATETIME)
        ,CAST(@v AS DATE)
        ,YEAR(CAST(@v AS DATETIME))
        ,MONTH(CAST(@v AS DATETIME))
        ,DAY(CAST(@v AS DATETIME));

SET DATEFORMAT dmy;
SELECT CAST(@v AS DATETIME)
        ,CAST(@v AS DATE);
Bogdan Sahlean
  • 19,233
  • 3
  • 42
  • 57
  • i executed dbcc useroptions on both sql servers and they show the same result set. the date format in sql server is mdy. the date format in csv files is dmy. – moeabdol Sep 03 '11 at 08:19
  • 1
    What about the server itself? If your ASP .NET application is loading the CSV itself then perhaps the system locale is coming into effect. Dates will be parsed by the application according to the locale of the service account, unless specified otherwise. – ta.speot.is Sep 03 '11 at 08:23
  • What date format is used in CSV files (source) ? What data type is used for date/time field (destination) ? Take a look at my example to see how DATEFORMAT option influences conversion from VARCHAR to DATETIME and DATE data type. – Bogdan Sahlean Sep 03 '11 at 08:25
  • todda.speot.is i tried to match locale settings, but still it doesn't seem to work. i tried installing my development environment on the production server for this reason and still during debugging it will accept the csv files and upload the data to temporary and then to master. but when i use the deployed version on the same machine it will only upload to temporary tables and then show me conversion error during data insertion to master table – moeabdol Sep 03 '11 at 08:28
  • 1
    I assume that the application is running in the context of a different user account then. (From memory) if you're using Visual Studio and running your ASP .NET application it will run in a container process belonging to your user account (`Locale Settings A`), and most likely the deployed application runs in an IIS worker thread in the IIS service account (`Local Settings B`). Try dumping the current locale to a temporary file in both scenarios. – ta.speot.is Sep 03 '11 at 08:34
  • @@sahlean this is an example of one record in the csv 324944,C,16/01/2008,26/03/2010,80 the date format is dmy and in the sql server destination the datatype is date time. your example is very clear. i don't know where am i getting it wrong – moeabdol Sep 03 '11 at 08:36
  • todda.speot.is how can i find the current locale settings during runtime? thanks – moeabdol Sep 03 '11 at 08:44
  • 1
    (1) Using mdy DATEFORMAT option (testing environment), the conversion from '16/01/2008' to datetime data type will not work. (2) Try to use a safe date format (safe for DATETIME fields) in CSV files: YYYYMMDD (no separator). This is format 112 used by [CONVERT function](http://msdn.microsoft.com/en-us/library/ms187928%28v=SQL.100%29.aspx). – Bogdan Sahlean Sep 03 '11 at 08:49
  • unfortunately i can't manipulate the csv files as they are outputted by another system which i have no control over. moreover, i explicitly specify dateformat dmy during upload and date format mdy during insertion. i think this takes care of time formats during upload – moeabdol Sep 03 '11 at 09:09
  • (1) Create a new SQL Profiler trace by selecting SP:StmtStarting and SQL:StmtStarting events. Please make sure is started. (2) Take a small amount of data with problems and try to import. (3) Check SQL Profiler to see INSERT statements executed. What is the format for date values ? – Bogdan Sahlean Sep 03 '11 at 09:35
  • in my code i first use a bulk insert sql statement and i define the csv file's path and the temporary table where i want to bulk insert followed by another sql which inserts all records from temp table to master table. the sql profiler shows me those statements only and i can't see individual records with their date values – moeabdol Sep 03 '11 at 10:02
  • I am still not convinced that DATEFORMAT setting is the same both environments. For dev. & production env., please execute this query `SELECT * FROM sys.dm_exec_sessions` during import and see what session_id has the connection used for import (for ex. session_id 54). Then execute the following query (both env.) `SELECT s.date_format FROM sys.dm_exec_sessions s WHERE s.session_id = 54`. What are the results ? – Bogdan Sahlean Sep 03 '11 at 10:24
  • Where is the error ? At BULK INSERT or at INSERT ... SELECT ... ? – Bogdan Sahlean Sep 03 '11 at 10:31
  • @@sahlean the error is at the insert statement after the bulk insert. so during the bulk insert to the temporary tables its ok. but inserting from the temporary to the master is where the error occurs. – moeabdol Sep 03 '11 at 10:35
  • @@sahlean the error msg is "conversion failed when converting date and/or time from character string." – moeabdol Sep 03 '11 at 11:09
  • Can you publish the second INSERT statement ? Also, check the structure of "temp" and master tables in both environments. Be attention at date/time fields (data type, nullability, default value, etc.): SELECT c.* FROM INFORMATION_SCHEMA.COLUMNS c WHERE c.TABLE_SCHEMA = 'dbo' AND c.TABLE_NAME = 'Order'. – Bogdan Sahlean Sep 03 '11 at 11:20
  • SET DATEFORMAT MDY; INSERT INTO TB_CATTLE(ID, Sex, BirthDate, FirstCalveDate, CurrentUnit, OriginalCost, AccumulatedDepreciation, WrittenDownValue, NetRealizableValue, CapitalGainLoss, Month, Year, Report, Locked, UploadedBy, UploadedAt) SELECT DISTINCT ID, Sex, BirthDate, FirstCalveDate, CurrentUnit, 0, 0, 0, 0, 0, " + month + ", " + year + @", 'Inventory', 0, 'Admin', '" + DateTime.Now + @"' FROM TB_TEMP_INVENTORY – moeabdol Sep 03 '11 at 11:31
  • 2
    Try (1) replacing `... 'Admin', '" + DateTime.Now + @" ...` with `... 'Admin', GETDATE(), " + @" ...` or (2) format `DateTime.Now` using 126 style (T-SQL CONVERT function): `String.Format(" '{0:yyyy-MM-ddThh:mm:ss}' ", DateTime.Now)` or (3) add `GETDATE()` as default expression to UploadedAt field and remove UploadedAt from INSERT. – Bogdan Sahlean Sep 03 '11 at 12:00