
I am trying to export a SQL Server table with 1 million rows and 45 columns to a .csv file for the user to download via the web interface, but it takes so long that I eventually have to stop the process manually.

I use a SqlDataReader and write into the file as the reader reads, to avoid memory problems. The code works for small tables (fewer than 3k rows), but the large one keeps running and the destination file stays at 0 KB.

using (spContentConn)
{
    using (var sdr = sqlcmd.ExecuteReader())
    using (CsvfileWriter)
    {
        DataTable Tablecolumns = new DataTable();

        for (int i = 0; i < sdr.FieldCount; i++)
        {
            Tablecolumns.Columns.Add(sdr.GetName(i));
        }

        CsvfileWriter.WriteLine(string.Join("~", Tablecolumns.Columns.Cast<DataColumn>().Select(csvfile => csvfile.ColumnName)));

        while (sdr.Read())
        {
            for (int j = Tablecolumns.Columns.Count; j > 0; j--)
            {
                if (j == 1)
                    CsvfileWriter.WriteLine("");
                else
                    CsvfileWriter.Write(sdr[Tablecolumns.Columns.Count - j].ToString() + "~");
            }
        }
    }
}

I used the same approach recommended in this thread, but it still doesn't work: export large datatable data to .csv file in c# windows applications. Please help.

  • https://joshclose.github.io/CsvHelper/ – VDWWD Jul 28 '19 at 16:58
  • Thanks. Should I use a DataTable first before writing into the file with CsvHelper? I don't expect a list to be the right choice for large data, but wouldn't a DataTable take just as much time? – Saphiros Jul 28 '19 at 18:46
  • refer https://stackoverflow.com/questions/20759302/upload-csv-file-to-sql-server/51580615#51580615 – Cinchoo Jul 29 '19 at 20:08

1 Answer


It is not clear from the .NET documentation whether FileWriter does efficient buffering, so I always use a BufferedStream when I need to read or write large volumes of data. With a stream you have to write byte data instead of strings, but that requires only a minor adaptation of your code.
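A minimal sketch of that idea (the file path, buffer size, and sample line below are placeholders of my own, not taken from your code):

using System;
using System.IO;
using System.Text;

using (var fileStream = new FileStream(@"C:\temp\export.csv", FileMode.Create, FileAccess.Write))
using (var bufferedStream = new BufferedStream(fileStream, 1 << 16))   // 64 KB buffer
{
    // Encode each line to bytes and write it to the buffered stream.
    byte[] line = Encoding.UTF8.GetBytes("col1~col2~col3" + Environment.NewLine);
    bufferedStream.Write(line, 0, line.Length);
}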

It also looks like you are reading and writing the columns of a DataTable in a loop, which hurts performance. Since the number and order of the columns do not change during an export operation, consider using the positional index to access the column values instead. It would also be better to write one row at a time instead of one column at a time, as sketched below.
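Putting the two suggestions together, a rough sketch (sdr stands for your open SqlDataReader; the path, buffer size, and delimiter handling are my assumptions):

using System;
using System.IO;
using System.Linq;
using System.Text;

// Column names are read once, by position, before the row loop starts.
var columnNames = Enumerable.Range(0, sdr.FieldCount).Select(sdr.GetName);

using (var fileStream = new FileStream(@"C:\temp\export.csv", FileMode.Create, FileAccess.Write))
using (var bufferedStream = new BufferedStream(fileStream, 1 << 16))
{
    // Header row, written once.
    byte[] header = Encoding.UTF8.GetBytes(string.Join("~", columnNames) + Environment.NewLine);
    bufferedStream.Write(header, 0, header.Length);

    // Reuse one array and fetch every column of the current row by position in a single call.
    var values = new object[sdr.FieldCount];
    while (sdr.Read())
    {
        sdr.GetValues(values);
        byte[] row = Encoding.UTF8.GetBytes(string.Join("~", values) + Environment.NewLine);
        bufferedStream.Write(row, 0, row.Length);
    }
}

Note that this sketch does no quoting or escaping of values that contain the delimiter or line breaks; a library such as CsvHelper (mentioned in the comments) handles that for you.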

Finally, you are using a data-reader, so that should provide the best throughput of data from your SQL Server (limited by your server and bandwidth, obviously). This would also suggest that the performance bottleneck is in the way that your data is being written to file.

For comparison, I just wrote 1,000,000 rows of 45 columns to a text file in under 60 seconds. Granted, my code does not read from a database, but that should still give you a reasonable baseline.

  • Thanks for replying. I'm only using the DataTable for the headers; for the rows I use a SqlDataReader. I can't use indexes on column names because the number of columns varies based on which table the user wants to export. Based on your feedback, I commented out the section that writes to the file and kept only the reader part; it turns out that part takes a lot of time too. – Saphiros Jul 28 '19 at 18:52
  • I assumed your query was already ruled out as being slow, but it seems that might not be the case. What is your query like? It could be inefficient because your tables lack proper indexes. Do you know how to analyse query performance with SQL Server execution plans (assuming you are using SQL Server)? – RWRkeSBZ Jul 28 '19 at 19:45
  • Yes, it's SQL Server. It's a simple SELECT * FROM the table, but you're right, it is taking a lot of time to run in SSMS: 10 min for 200k rows. I don't know my way around SQL query performance; would indexing be enough? I do have one varbinary column in the table. – Saphiros Jul 28 '19 at 20:27
  • Look at the client statistics in SSMS to see how much data is being moved. With 45 columns and a VARBINARY column, I suspect there is a lot of data. If that is not the case, look at the execution plan, then look at the server settings. But, I think this is now beyond the scope of this question. – RWRkeSBZ Jul 28 '19 at 20:33
  • Thanks. I'll look into it, but I can't keep exporting the table the way it is now. Thanks a lot. – Saphiros Jul 28 '19 at 20:39