34

What is the best way to upload a large CSV data file into SQL Server using C#? The file contains about 30,000 rows and 25 columns.

MaxZoom
user3083221
  • Why use .NET to upload the file when SQL Server itself supports importing CSV files? – Furqan Hameedi Dec 24 '13 at 10:36
  • SQL Server doesn't have any facility to upload files to - you'll need to use some other means (e.g. a web service running on IIS or something like that). SQL Server can **load a file** from a disk it can reach - so you'll need to find a way to put that file where SQL Server can read it from. – marc_s Dec 24 '13 at 10:36
  • The requirement is such that we have an application through which we loop and upload files one by one into SQL Server, so I need to use C# code to upload the file. – user3083221 Dec 24 '13 at 10:39

8 Answers

59

First off, you don't need any code at all: you can import CSV files directly into a SQL Server database with the SQL Server management tools. However, if you really need to do it through code, read on.

Personally, I think this approach is the easiest and most efficient way to do it in code.

In general, you can achieve it in two steps:

1st step: read the CSV file and hold the records in a DataTable.
2nd step: store the retrieved DataTable in the SQL database table as a bulk entry.

This function reads the CSV file and returns its data as a DataTable. Call it, keep the result in memory, and you can do whatever you want with it.

// Requires: using System;, using System.Data; and using Microsoft.VisualBasic.FileIO;
// (add a reference to the Microsoft.VisualBasic assembly for TextFieldParser)
private static DataTable GetDataTabletFromCSVFile(string csv_file_path)
{
    DataTable csvData = new DataTable();
    try
    {
        using (TextFieldParser csvReader = new TextFieldParser(csv_file_path))
        {
            csvReader.SetDelimiters(new string[] { "," });
            csvReader.HasFieldsEnclosedInQuotes = true;

            // The first row contains the column names
            string[] colFields = csvReader.ReadFields();
            foreach (string column in colFields)
            {
                DataColumn datecolumn = new DataColumn(column);
                datecolumn.AllowDBNull = true;
                csvData.Columns.Add(datecolumn);
            }

            while (!csvReader.EndOfData)
            {
                string[] fieldData = csvReader.ReadFields();
                // Treat empty values as null
                for (int i = 0; i < fieldData.Length; i++)
                {
                    if (fieldData[i] == "")
                    {
                        fieldData[i] = null;
                    }
                }
                csvData.Rows.Add(fieldData);
            }
        }
    }
    catch (Exception)
    {
        return null;
    }
    return csvData;
}

SqlBulkCopy - use this function to insert the retrieved DataTable into the SQL table:

// Requires: using System.Data; and using System.Data.SqlClient;
static void InsertDataIntoSQLServerUsingSQLBulkCopy(DataTable csvFileData)
{
    using (SqlConnection dbConnection = new SqlConnection("Data Source=ProductHost;Initial Catalog=yourDB;Integrated Security=SSPI;"))
    {
        dbConnection.Open();
        using (SqlBulkCopy s = new SqlBulkCopy(dbConnection))
        {
            s.DestinationTableName = "Your table name";
            // Map columns by name so the CSV column order does not have to match the table
            foreach (var column in csvFileData.Columns)
                s.ColumnMappings.Add(column.ToString(), column.ToString());
            s.WriteToServer(csvFileData);
        }
    }
}

Source
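For completeness, a minimal sketch of wiring the two helpers above together; the CSV path is a hypothetical placeholder, and the connection string and destination table name are the ones hard-coded inside InsertDataIntoSQLServerUsingSQLBulkCopy.

static void Main()
{
    // Hypothetical CSV path; adjust to your environment
    DataTable csvFileData = GetDataTabletFromCSVFile(@"C:\data\input.csv");
    if (csvFileData != null)
    {
        InsertDataIntoSQLServerUsingSQLBulkCopy(csvFileData);
    }
}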

Contango
Kirk
  • It should be noted that this will read the whole file into the computer's memory before even starting to upload it to the server. This might be an issue if you are uploading lots of big files. A better approach might be to create an array of rows for chunks of the file and call the public 'void WriteToServer(DataRow[] rows)' overload (a sketch of that approach follows these comments). – Martin Brown Oct 27 '14 at 12:52
  • @MartinBrown I think that would cause unnecessarily high traffic. Doing it in one go as a bulk operation is easier, and rolling back when there is an error is also easier. – Kirk Apr 07 '16 at 19:55
  • If you see `Microsoft.VisualBasic.FileIO does not exist`, click [this link](http://stackoverflow.com/a/17146200/283650). – Reinstate Monica - Goodbye SE Jun 15 '16 at 14:12
  • Loading millions of rows into a DataTable would lead to an out-of-memory exception. Is there any other way to do this task? – I Love Stackoverflow Nov 14 '16 at 06:47
  • @Learning If you notice, there is a foreach loop in the code; you can set your own limit at 10,000 rows or something similar. – Kirk Nov 14 '16 at 06:51
  • Actually I too have a similar requirement: I need to import data from another server, which can be Oracle, MySQL or Excel, through code. Is it possible to use SSIS functionality from the backend via C# code? – I Love Stackoverflow Nov 14 '16 at 06:57
  • Does this solution assume that the receiving SQL table has columns that are only something like VARCHAR or NVARCHAR? I tried this solution, but got an error of "given ColumnMapping does not match up with any column in the source or destination". My table's columns have data types like INT, REAL, etc. – Rod Mar 15 '18 at 21:30
  • Check that the columns match in number. Automatic (implicit) conversion can handle the different data types; otherwise you will get an error saying exactly what the problem is. – Kirk Mar 16 '18 at 06:20
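Following up on the chunked-upload suggestion in the comments above, here is a hedged sketch (not part of the original answer; the method name and batch size are illustrative, and the destination table is assumed to have the same column order as the CSV). Rows are buffered into batches and sent with the WriteToServer(DataRow[]) overload, so the whole file never has to sit in memory at once.

// Requires: using System.Collections.Generic; using System.Data; using System.Data.SqlClient;
// and using Microsoft.VisualBasic.FileIO;
static void BulkCopyCsvInChunks(string csvFilePath, string connectionString, string tableName, int batchSize = 10000)
{
    using (var csvReader = new TextFieldParser(csvFilePath))
    using (var dbConnection = new SqlConnection(connectionString))
    {
        csvReader.SetDelimiters(",");
        csvReader.HasFieldsEnclosedInQuotes = true;

        // Build an empty DataTable whose columns come from the CSV header row
        var table = new DataTable();
        foreach (string column in csvReader.ReadFields())
            table.Columns.Add(column);

        dbConnection.Open();
        using (var bulkCopy = new SqlBulkCopy(dbConnection) { DestinationTableName = tableName })
        {
            var batch = new List<DataRow>(batchSize);
            while (!csvReader.EndOfData)
            {
                batch.Add(table.Rows.Add(csvReader.ReadFields()));
                if (batch.Count == batchSize)
                {
                    bulkCopy.WriteToServer(batch.ToArray()); // send this chunk only
                    batch.Clear();
                    table.Clear();                           // drop the rows we just sent
                }
            }
            if (batch.Count > 0)
                bulkCopy.WriteToServer(batch.ToArray());     // final partial chunk
        }
    }
}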
5

Here is another way of uploading a CSV to the database programmatically. Cinchoo ETL, an open-source library, can do the job of uploading a CSV file to the database using a DataReader, which is a streaming approach with little to no memory overhead.

Here is a sample of how to do it:

string connectionstring = @"#YOUR DB ConnectionString#";
using (SqlBulkCopy bcp = new SqlBulkCopy(connectionstring))
{
    using (var p = new ChoCSVReader("#YOUR CSV FILE#").WithFirstLineHeader())
    {
        bcp.DestinationTableName = "#TABLENAME#";
        bcp.EnableStreaming = true;
        bcp.BatchSize = 10000;
        bcp.BulkCopyTimeout = 0;
        bcp.NotifyAfter = 100;
        bcp.SqlRowsCopied += delegate (object sender, SqlRowsCopiedEventArgs e)
        {
            Console.WriteLine(e.RowsCopied.ToString("#,##0") + " rows copied.");
        };
        bcp.WriteToServer(p.AsDataReader());
    }
}
Cinchoo
  • How can I get an IEnumerable or IList from this so I can load it into a grid? I have two steps _at runtime_: `load CSV with new headers dynamically to DB, create a new table with all the headers from the CSV`. The code above is helping with this, but in the same call I also want to get back an IList from the table, so in step 2 how can I get `newTableName.toIList();`? It would be great if we could load that as well. – aggie Nov 28 '18 at 06:11
  • Use AsDataTable() on ChoCSVReader, then use it both to bind to the grid and to write to the table (a sketch follows these comments). – Cinchoo Nov 28 '18 at 13:59
  • Extended documentation can be found on code project: https://www.codeproject.com/Articles/1145337/Cinchoo-ETL-CSV-Reader – DtechNet Jan 14 '19 at 21:22
  • This certainly does the trick. Tip: in the CSV file, let the first column be blank (with a header) for the database's auto-increment column. – alvinchesaro Jun 26 '19 at 10:08
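A hedged sketch of the AsDataTable() route mentioned in the comments (the file path, connection string, and table name are the same placeholders as in the answer): the CSV is loaded into a DataTable that can be bound to a grid and then written to the table with SqlBulkCopy.

using (var reader = new ChoCSVReader("#YOUR CSV FILE#").WithFirstLineHeader())
{
    DataTable dt = reader.AsDataTable();            // bind this to your grid

    using (var bcp = new SqlBulkCopy("#YOUR DB ConnectionString#"))
    {
        bcp.DestinationTableName = "#TABLENAME#";
        bcp.WriteToServer(dt);                      // and write the same data to the table
    }
}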
5

The best way I found to import large CSV files into SQL Server is by using SqlBulkCopy along with an IDataReader implementation. The good thing about it is that you're not reading the entire file into memory (which is the case with the DataTable approach), and you can control the size of the batch which gets sent to SQL Server. The bad thing about it is that you have to implement IDataReader, which is one of the longest MS interfaces I've seen.

I wrote a NuGet package that does the trick for you. It uses the awesome CsvHelper package, so there's very little config required. The simplest scenario would look like this:

//Instantiate the reader, providing the list of columns which matches the data table structure 1-to-1.
var dataReader = new CsvDataReader(filePath,
    new List<TypeCode>(5)
    {
        TypeCode.String,
        TypeCode.Decimal,
        TypeCode.String,
        TypeCode.Boolean,
        TypeCode.DateTime
    });

bulkCopyUtility.BulkCopy("TableName", dataReader);

There are also additional configuration options for more complex scenarios (flexible column mapping, additional static column values which are not present in the CSV file, value transformation). If you're interested, the project is on GitHub and available as a NuGet package.

For reference, here's how to use SqlBulkCopy with IDataReader:

public void BulkCopy(string tableName, IDataReader dataReader, Action<SqlBulkCopy>  configureSqlBulkCopy)
{
    using (SqlConnection dbConnection = new SqlConnection(connectionString))
    {
        dbConnection.Open();

        using (SqlBulkCopy bulkCopy = new SqlBulkCopy(dbConnection))
        {
            bulkCopy.BatchSize = 3000; //Data will be sent to SQL Server in batches of this size
            bulkCopy.EnableStreaming = true;
            bulkCopy.DestinationTableName = tableName;

            //This will ensure mapping based on names rather than column position.
            //GetSchemaTable() returns one row per source column; its "ColumnName" field holds that column's name.
            foreach (DataRow schemaRow in dataReader.GetSchemaTable().Rows)
            {
                string columnName = (string)schemaRow["ColumnName"];
                bulkCopy.ColumnMappings.Add(columnName, columnName);
            }

            //If additional, custom configuration is required, invoke the action
            configureSqlBulkCopy?.Invoke(bulkCopy);

            try
            {
                // Write from the source to the destination.
                bulkCopy.WriteToServer(dataReader);
            }
            finally
            {
                dataReader.Close();
            }
        }
    }
}
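For illustration, a hedged usage sketch of the reference method above; the table name, type codes, and the extra settings passed through configureSqlBulkCopy are placeholders, not part of the original answer.

var dataReader = new CsvDataReader(filePath,
    new List<TypeCode>(5)
    {
        TypeCode.String,
        TypeCode.Decimal,
        TypeCode.String,
        TypeCode.Boolean,
        TypeCode.DateTime
    });

bulkCopyUtility.BulkCopy("TableName", dataReader, bulkCopy =>
{
    bulkCopy.BatchSize = 5000;      // override the default batch size, for example
    bulkCopy.BulkCopyTimeout = 0;   // no timeout, for very large files
});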
zwcloud
2

Use the System.Data.SqlClient.SqlBulkCopy class to insert data into SQL tables. To use that class you also need to convert the CSV data to a DataTable; see here for one of the ways.
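A minimal sketch of that idea (the connection string and table name are placeholders), assuming the DataTable has already been built from the CSV, for example with the TextFieldParser approach shown in the accepted answer:

// Requires: using System.Data; and using System.Data.SqlClient;
using (var connection = new SqlConnection("<your connection string>"))
{
    connection.Open();
    using (var bulkCopy = new SqlBulkCopy(connection))
    {
        bulkCopy.DestinationTableName = "dbo.MyTable";  // hypothetical destination table
        bulkCopy.WriteToServer(csvDataTable);           // csvDataTable: DataTable built from the CSV
    }
}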

pakeha_by
1

This sounds like a perfect job for SSIS. It's a free part of SQL Server, can loop through all the CSV files in a folder, is very fast, and has excellent error handling and logging.
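If the package should be kicked off from C# rather than by hand, a hedged sketch using the SSIS runtime API is below; the package name and path are hypothetical, and a reference to the Microsoft.SqlServer.ManagedDTS assembly is assumed.

using Microsoft.SqlServer.Dts.Runtime;

class SsisPackageRunner
{
    static void Main()
    {
        // Load and execute an existing .dtsx package that imports the CSV files
        var app = new Application();
        Package package = app.LoadPackage(@"C:\packages\LoadCsvFiles.dtsx", null);  // hypothetical path
        DTSExecResult result = package.Execute();
        System.Console.WriteLine("Package execution result: " + result);
    }
}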

benjamin moskovits
1

This technique uses the SqlBulkCopy facility, but does not read the entire file into memory.

The trick is that it implements an IDataReader class to read the .csv file.

https://www.codeproject.com/Tips/1029831/Fast-and-Simple-IDataReader-Implementation-to-Read

Be Kind To New Users
1

You can also use BULK INSERT:

' VB.NET: builds the T-SQL BULK INSERT statement (requires Imports System.Text; execute the result afterwards, e.g. via a SqlCommand)
Public Shared Function bulkQuery() As String

        Dim query As StringBuilder = New StringBuilder

        query.Append("USE Import_DB; BULK INSERT dbo.[Insert_Table] FROM")
        query.Append(" 'C:\Insert_Table.csv' ")
        query.Append("WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')")

        Return query.ToString

End Function

Be careful here though, since the table name and the CSV file name have to be identical, and the column count in the CSV has to be the same as in the pre-defined table.
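A hedged C# sketch of executing such a statement against the server (the connection string, table, and file path are placeholders, and the file must be readable by the SQL Server service account):

// Requires: using System.Data.SqlClient;
using (var connection = new SqlConnection("<your connection string>"))
{
    connection.Open();
    string sql = @"BULK INSERT dbo.[Insert_Table]
                   FROM 'C:\Insert_Table.csv'
                   WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2)"; // FIRSTROW = 2 skips a header row, if present
    using (var command = new SqlCommand(sql, connection))
    {
        command.ExecuteNonQuery();
    }
}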

ruedi
0
    // Requires: using System; using System.Data; using System.Data.SqlClient; using System.IO;
    // using Microsoft.VisualBasic.FileIO; and Microsoft.Extensions.Configuration
    private void GetDataTabletFromCSVFile(string fileName)
    {
        DataTable dt = new DataTable();

        try
        {
            using (TextFieldParser csvReader = new TextFieldParser(fileName))
            {
                csvReader.SetDelimiters(new string[] { "," });
                csvReader.HasFieldsEnclosedInQuotes = true;

                // Read and discard the header row; the columns are defined explicitly below
                csvReader.ReadFields();
                dt.Columns.AddRange(new DataColumn[8] {
                    new DataColumn("Symbol", typeof(string)),
                    new DataColumn("ISIN", typeof(string)),
                    new DataColumn("Company", typeof(string)),
                    new DataColumn("FirstListingDate", typeof(string)),
                    new DataColumn("FaceValue", typeof(string)),
                    new DataColumn("PaidUpValue", typeof(string)),
                    new DataColumn("MarketLot", typeof(string)),
                    new DataColumn("industry", typeof(string))
                });
                while (!csvReader.EndOfData)
                {
                    string[] fieldData = csvReader.ReadFields();
                    //Making empty value as null
                    for (int i = 0; i < fieldData.Length; i++)
                    {
                        if (fieldData[i] == "")
                        {
                            fieldData[i] = null;
                        }
                    }
                    dt.Rows.Add(fieldData);
                }
                var builder = new ConfigurationBuilder()
                    .SetBasePath(Path.GetDirectoryName(System.Reflection.Assembly.GetExecutingAssembly().Location))
                    .AddJsonFile("appsettings.json");

                var configuration = builder.Build();
                string DBconnection = configuration.GetSection("ConnectionString").Value;
                using (SqlConnection dbConnection = new SqlConnection(DBconnection))
                {
                    dbConnection.Open();
                    using (SqlBulkCopy s = new SqlBulkCopy(dbConnection))
                    {
                        s.DestinationTableName = "Static.dbo.Securitiesinfo";
                        foreach (var column in dt.Columns)
                            s.ColumnMappings.Add(column.ToString(), column.ToString());
                        s.WriteToServer(dt);
                    }
                }

            }
        }
        catch (Exception ex)
        {
            // The exception is swallowed here; consider logging it or rethrowing instead
            var x = ex;
        }

    }
vijay