1

All, I have a utility to export DataTables to Excel (into both .xls and .xlsx file types). When I reach a column or row limit on the DataTable (being pulled from SQL Server), I want to split the DataTable into children, each obaying the limits (for .xls 256 columns and 65,536 rows and .xlsx 16,384 columns and 1,048,576 rows).

So far, I have written the following method to do what I want

public static List<DataTable> SplitDataTable(DataTable mother, int nColLimit)
{
    List<int[]> rangeList = new List<int[]>();
    int primaryCols = mother.Columns.Count;
    int nSplitCount = Convert.ToInt32(primaryCols / nColLimit);
    int max = -1;

    // Get the child ranges.
    int tmpSup = 0;
    for (int splits = 0; splits < nSplitCount; splits++)
    {
        if (rangeList.Count == 0)
            tmpSup = (splits + 1) * (nColLimit - 1);
        else
            tmpSup = rangeList[splits - 1][1] + nColLimit;
        rangeList.Add(new int[2] { splits * nColLimit, tmpSup });
        if (max < tmpSup)
            max = tmpSup;
    }
    rangeList.Add(new int[2] { ++max, primaryCols });

    // Build child DataTables.
    List<DataTable> childList = new List<DataTable>();
    int childIndex = 0;
    foreach (int[] range in rangeList)
    {
        childList.Add(new DataTable());
        for (int i = range[0]; i < range[1]; i++)
            for (int j = 0; j < mother.Rows.Count; j++)
                childList[childIndex].Rows[j][i] = mother.Rows[j][i];
        childIndex++;
    }
    return childList;
}

This however, throws an indexOutOfRangeException with the message "There is no row at position 0.". I appreciate where the error is coming from but what is the best way to copy the complete columns from the mother to a child?

I have also tried

List<DataTable> childList = new List<DataTable>();
int childIndex = 0;
foreach (int[] range in rangeList)
{
    childList.Add(new DataTable());
    foreach(DataRow row in mother.Rows)
    {
        DataRow tmpRow = childList[childIndex].NewRow();
        for (int i = range[0]; i < range[1]; i++)
            tmpRow[i + 1] = row[i + 1];
    }
    childIndex++;
}

which give the same range exception.

Thanks for your time.

Edit: how I did this short term

foreach (int[] range in rangeList)
{
    childList.Add(new DataTable());
    string strSqlTmp =
    String.Format("declare @columns varchar(max) " +
        "select  @columns = case when @columns is null " +
        "then '' " +
        "else @columns + ', ' " +
        "end + name " +
             "from sys.columns " +
             "where object_id = object_id('{0}') and name in " +
                 "(SELECT COLUMN_NAME " +
                     "FROM [{1}].INFORMATION_SCHEMA.COLUMNS " +
                     "WHERE TABLE_NAME = N'{0}' " +
                     "AND ORDINAL_POSITION > {2} AND ORDINAL_POSITION < {3}) " +
        "declare @query varchar(max) " +
        "set @query = 'select ' + @columns + ' from {0}' " +
        "exec (@query);
    // Then get each DataTable from SQL Server and fill the child list...
MoonKnight
  • 23,214
  • 40
  • 145
  • 277

2 Answers2

1

The reason you're getting the IndexOutOfRange exception is because you're attempting to reference Columns in your new DataTable that don't exist. This line:

childList.Add(new DataTable()); 

does indeed add a new DataTable to childList, but that DataTable has no columns and rows.

Normally I might use the DataTable.Clone() method to create a new DataTable with the same structure as the DataTable the method was called on, but obviously this won't work for you. In your case you'll have to explicitly add DataColumns using the DataTable.Columns.Add method.

Something like:

for (int i = 0; i < numberOfColumns; i++) {
    dataTable.Columns.Add(string.Format("Column {0}", i));
}

The snippet above is very simplistic. Since you'll be creating the DataColumns yourself you'll need to name and type each DataColumn yourself.

Jay Riggs
  • 53,046
  • 9
  • 139
  • 151
  • Thanks very much for your time. In the mean time I skimped out and asked SQL Server to do it. I have edited my question to show how this was done if you interested... – MoonKnight May 23 '12 at 19:57
  • Glad you got it working and thanks for posting your solution. – Jay Riggs May 23 '12 at 20:12
  • can you please look into one of my [question](https://stackoverflow.com/questions/55688229/how-to-split-data-table-into-multiple-tables-with-adding-1-minute-delay) related to it? – Moeez Apr 16 '19 at 06:40
1

I just created a method for Splitting up a data table. The ".Batch" method is referenced from MoreLinq.

private static List<DataTable> SplitTable(DataTable originalTable, int batchSize)
        {
            List<DataTable> tables = new List<DataTable>();

            foreach (var rowBatch in originalTable.Rows.Cast<DataRow>().Batch(batchSize))
            {
                var batchTable = new DataTable(originalTable.TableName);

                foreach (DataColumn column in originalTable.Columns)
                    batchTable.Columns.Add(column.ColumnName, column.DataType);

                foreach (DataRow row in rowBatch)
                    batchTable.Rows.Add(row.ItemArray);

                tables.Add(batchTable);
            }
            return tables;
        }

Just in case this would be helpful to anyone :)

Stephan Møller
  • 1,247
  • 19
  • 39