
I am new to databases; apologies if my question is not clear.

My requirement is to export a SQL table to a CSV file. I used cmd.ExecuteReader in C# and exported tables of up to 1 GB successfully.

But we have a couple of other tables that are up to 60 GB in size.

Is there an efficient way to export a large table's data to multiple CSV files?

I would appreciate any suggestions.

Srini
  • Performance on this topic depends on the RDBMS... Is it Oracle, SQL Server, or IBM Db2? – SQLpro Oct 20 '22 at 12:38
  • Welcome to StackOverflow. Please show us what you have so far for smaller files. Note this is *not* a script factory. You will need to write the script yourself. If you run into a problem that isn't yet covered by the existing questions and answers, you might post your [mcve] here so that we might be able to help you further. See also: [How to ask a good question](https://stackoverflow.com/help/how-to-ask). As a general answer to creating batches, see: [Read CSV sheet in batches using Powershell](https://stackoverflow.com/a/73774733/1701026) – iRon Oct 20 '22 at 12:58

2 Answers


You can export your table data in a paginated way, using OFFSET ... FETCH NEXT in your query.
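
For example, a minimal sketch (assuming SQL Server; dbo.MyTable, the unique Id ordering column, and the output folder are placeholders, and connection is an open DbConnection — OFFSET/FETCH needs a deterministic ORDER BY, otherwise pages may overlap or skip rows):

    //Sketch: page through the table with OFFSET/FETCH, writing one CSV file per page
    const int pageSize = 100000;
    int page = 0;
    while (true)
    {
        long rows = 0;
        using (var command = connection.CreateCommand())
        {
            command.CommandText =
                "SELECT * FROM dbo.MyTable ORDER BY Id " +
                "OFFSET @offset ROWS FETCH NEXT @pageSize ROWS ONLY";
            var pOffset = command.CreateParameter();
            pOffset.ParameterName = "@offset";
            pOffset.Value = (long)page * pageSize;
            command.Parameters.Add(pOffset);
            var pSize = command.CreateParameter();
            pSize.ParameterName = "@pageSize";
            pSize.Value = pageSize;
            command.Parameters.Add(pSize);

            using (var reader = command.ExecuteReader())
            using (var writer = new System.IO.StreamWriter($@"c:\export\MyTable_{page:00000}.csv"))
            {
                var values = new object[reader.FieldCount];
                while (reader.Read())
                {
                    reader.GetValues(values);
                    writer.WriteLine(string.Join(";", values)); //No quoting/escaping here, sketch only
                    rows++;
                }
            }
        }
        if (rows == 0)
            break; //Past the last page
        page++;
    }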

  • Thanks, this helps for tables that have a primary key. Any recommendation for others that don't have any key? – Srini Oct 20 '22 at 17:40

For your comment:

Any recommendation for other that have not any key

You could simply count the number of records, stop when you reach your limit, and start a new file:

    /// <summary>
    /// Exports a full Table to CSV file(s). Each file will contain at most the number of records specified in the "maxNbOfRows" parameter
    /// </summary>
    /// <param name="connection">Database connection to be used</param>
    /// <param name="tableName">Name of the table</param>
    /// <param name="outputFilepath">Filme path (ie: "c:\Export_Table{0:00000}.csv")</param>
    /// <param name="outputFileEncoding"></param>
    /// <param name="maxNbOfRows"></param>
    /// <param name="addHeaderRow"></param>
    /// <param name="separator"></param>
    /// <returns></returns>
    public static async Task<List<string>> ExportToSplittedCsv(System.Data.Common.DbConnection connection, string tableName, string outputFilepath, Encoding outputFileEncoding, long maxNbOfRows, bool addHeaderRow = true, string separator = ";")
    {
        var command = connection.CreateCommand();
        //Note: tableName is concatenated into the SQL text, so only pass trusted identifiers
        command.CommandText = $"SELECT * FROM {tableName}";
        long totalRecords = 0;
        long fileCount = 0;
        List<string> filesCreated = new List<string>();
        string filePattern = calculateFilePattern(outputFilepath);
        using (var reader = await command.ExecuteReaderAsync())
        {
            if (reader == null || !reader.HasRows)
                return filesCreated;//Table is empty
            while (reader.HasRows)
            {
                string curFileName = string.Format(filePattern, ++fileCount);
                long rowsWritten;
                using (var writer = new System.IO.StreamWriter(curFileName, false, outputFileEncoding))
                {
                    rowsWritten = await _exportToSplittedCsv(reader, writer, maxNbOfRows, addHeaderRow, separator);
                }
                if (rowsWritten == 0)
                {
                    //Some providers keep HasRows set to true even after the last row has been
                    //read; delete the empty file and stop instead of looping forever
                    System.IO.File.Delete(curFileName);
                    break;
                }
                totalRecords += rowsWritten;
                filesCreated.Add(curFileName);//Add the actual file name, not the pattern
            }
        }
        //You can return totalRecords, the list of files created, or even the fileCount if you prefer
        return filesCreated;
    }

    /// <summary>
    /// Checks if the given output file path already contains a placeholder for the counter. If not, adds "{0:00000}" to the path so the file counter can be injected into the final file path...
    /// </summary>
    /// <param name="path">Original path</param>
    /// <returns></returns>
    private static string calculateFilePattern(string path)
    {
        //The path already contains the Counter placeHolder on it
        if (path.Contains("{0"))
            return path;

        int extIndex = path.LastIndexOf('.');
        if (extIndex == -1)
            return path + "{0:00000}";
        else
            return path.Substring(0, extIndex) + "{0:00000}" + path.Substring(extIndex);
    }

    /// <summary>
    /// Exports rows until the reader has no more records or until maxNbOfRows is reached
    /// </summary>
    /// <param name="reader">System.Data.Common.DbDataReader used to read data from database</param>
    /// <param name="writer">Writer that will write CSV content in the current file</param>
    /// <param name="maxNbOfRows">Maximum number of rows to write on the output file</param>
    /// <param name="addHeaderRow">When true: first row in the CSV file will contain the Column names</param>
    /// <param name="separator">CSV field separator</param>
    /// <returns></returns>
    private static async Task<long> _exportToSplittedCsv(System.Data.Common.DbDataReader reader, System.IO.StreamWriter writer, long maxNbOfRows, bool addHeaderRow, string separator)
    {
        long rowCount = 0;
        if (addHeaderRow)
            await writer.WriteLineAsync(getColNames(reader, separator));//WriteLineAsync, so the header does not merge with the first data row
        //Check the row limit BEFORE reading: the original order (read first, then check)
        //consumed one row at each file boundary without ever writing it
        while ((maxNbOfRows < 1 || rowCount < maxNbOfRows) && await reader.ReadAsync())
        {
            await writer.WriteLineAsync(buildCsvRow(reader, separator));
            rowCount++;
        }
        await writer.FlushAsync();//The caller's using block disposes the writer
        return rowCount;
    }

    /// <summary>
    ///  Builds CSV row containing the column names
    /// </summary>
    /// <param name="reader">System.Data.Common.DbDataReader used to read data from database</param>
    /// <param name="separator">CSV field separator</param>
    /// <returns></returns>
    private static string getColNames(System.Data.Common.DbDataReader reader, string separator)
    {
        var rowBuilder = new StringBuilder();
        for (int i = 0; i < reader.FieldCount; i++)
        {
            rowBuilder.Append(separator).Append(reader.GetName(i));
        }
        //We remove the extra separator from the beginning...
        return rowBuilder.Remove(0, 1).ToString();
    }

    /// <summary>
    /// Builds a CSV row containing a database row
    /// </summary>
    /// <param name="reader">System.Data.Common.DbDataReader used to read data from database</param>
    /// <param name="separator">CSV field separator</param>
    /// <returns></returns>
    private static string buildCsvRow(System.Data.Common.DbDataReader reader, string separator)
    {
        var rowBuilder = new StringBuilder();
        for (int i = 0; i < reader.FieldCount; i++)
        {
            //To format specific types (e.g. DateTime) differently, inspect
            //reader.GetFieldType(i) or reader.GetDataTypeName(i) and branch here
            string value = reader.IsDBNull(i) ? "" : reader.GetValue(i).ToString();
            //Quote fields containing the separator, quotes or line breaks so the CSV stays parseable
            if (value.Contains(separator) || value.Contains("\"") || value.Contains("\n"))
                value = "\"" + value.Replace("\"", "\"\"") + "\"";
            rowBuilder.Append(separator).Append(value);
        }
        //We remove the extra separator from the beginning...
        return rowBuilder.Remove(0, 1).ToString();
    }

How to use it:

    var files = await ExportToSplittedCsv(dbConnection,
        "customer_orders",
        "c:\\ExportedData\\customerOrders\\export_customerOrders{0:00000}.csv",
        System.Text.Encoding.Default, 60000, true, ";");
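
This answer could be further improved with cancellation support. For example (a sketch only, reusing buildCsvRow from above; ReadAsync and ExecuteReaderAsync both have overloads accepting a CancellationToken, and the token is also checked between rows):

    //Sketch: a cancellable variant of the inner export loop
    private static async Task<long> _exportToSplittedCsvCancellable(
        System.Data.Common.DbDataReader reader,
        System.IO.StreamWriter writer,
        long maxNbOfRows,
        string separator,
        System.Threading.CancellationToken token)
    {
        long rowCount = 0;
        while ((maxNbOfRows < 1 || rowCount < maxNbOfRows) && await reader.ReadAsync(token))
        {
            token.ThrowIfCancellationRequested();//Stop promptly between rows
            await writer.WriteLineAsync(buildCsvRow(reader, separator));
            rowCount++;
        }
        await writer.FlushAsync();
        return rowCount;
    }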
TwiXter
  • DbDataReader is the most efficient approach: it doesn't load all the records into memory like a List<...> would, which is why you advance to the next record with reader.Read. There are also database managers that can do exports, depending on your database; this can often be done directly with the database's scripting language as well. – TwiXter Oct 20 '22 at 19:35
  • This program works fine, but I am not getting all the data into the files. I tried `SELECT TOP 10 * FROM table` with the maximum rows set to 5. The first file correctly had 5 rows, but the second one had only 3. Tried debugging but still no idea. – Srini Oct 21 '22 at 22:28
  • Strange... are you using async/await? Maybe you are not waiting for the task to finish? The Flush is also important to finish writing buffered data to the file... Other than that I can't see any other issue. You can try without async/await. Unfortunately I'm out until Monday PM, so I can't help further. Of course, I assume your table has more than 10 records? – TwiXter Oct 23 '22 at 07:35
  • I am still stuck on this. I created another question with my program; please see the link below and suggest: https://stackoverflow.com/questions/74184056/c-sharp-program-to-read-sql-to-csv-files – Srini Oct 24 '22 at 16:23
  • Please @Srini mark any of the answers accepted since both questions are apparently fixed. – TwiXter Nov 15 '22 at 10:21