
I need to handle very large DataTables (2 million rows+) that come from databases (SQL Server, Oracle, Access, MySQL, SharePoint, etc.) outside of my control. Currently I loop through every row and column building a string object, but I run out of memory at about 100k rows.
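Roughly what I'm doing now (a simplified sketch; the actual column handling and delimiters are more involved):

```csharp
using System;
using System.Data;

// Builds one big string from the whole table; every += allocates
// a new, ever-larger string, which is expensive and contributes
// to running out of memory on large tables.
static string BuildExport(DataTable table)
{
    string result = "";
    foreach (DataRow row in table.Rows)
    {
        foreach (DataColumn col in table.Columns)
            result += row[col] + "\t";
        result += Environment.NewLine;
    }
    return result;
}
```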

The only solution I can see is to break the DataTable into smaller pieces, persisting each block before starting on the next block of rows.

Since I cannot add ROW_NUMBER() or anything similar, I have to work with the already-populated DataTable.

How can I easily (keeping performance in mind) break the populated DataTable into smaller DataTables, like paging?

PS: there is no visual component to this functionality.

callisto

2 Answers


Are you using string concatenation, like string += string?

Change that to a StringBuilder and you should not have problems, at least not for 20k rows.
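For example, a minimal sketch of the same loop using a StringBuilder (the tab-delimited row format is an assumption):

```csharp
using System;
using System.Data;
using System.Text;

// Appends into a single resizable buffer instead of allocating
// a new string for every concatenation.
static string BuildExport(DataTable table)
{
    var sb = new StringBuilder();
    foreach (DataRow row in table.Rows)
    {
        foreach (DataColumn col in table.Columns)
            sb.Append(row[col]).Append('\t');
        sb.AppendLine();
    }
    return sb.ToString();
}
```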

SWeko
  • I have marked yours as the correct answer, but in truth it was not the complete answer. I ended up breaking the incoming DataTable into 500-row batches (sketched below). – callisto Aug 02 '11 at 09:25
  • Thanks. It was just my knee-jerk reaction when someone mentions strings and performance :) I'm just curious: you did not elaborate on why you would want a string representation of 100k rows. Also, if you could show some code, we might be able to help further. – SWeko Aug 02 '11 at 11:00
  • Refer to this question: http://stackoverflow.com/questions/5388682/creating-an-etl-system-data-import-and-transformation for an overview of the (now complete) project. MySQL is just a new data source being added. – callisto Aug 02 '11 at 11:24
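A minimal sketch of that 500-row batching (the method name and the persistence hook are assumptions, not from the original posts):

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Splits a populated DataTable into fixed-size pages.
// Clone() copies the schema only; ImportRow copies one row over.
static IEnumerable<DataTable> Page(DataTable source, int pageSize)
{
    for (int offset = 0; offset < source.Rows.Count; offset += pageSize)
    {
        DataTable page = source.Clone();   // schema, no data
        int end = Math.Min(offset + pageSize, source.Rows.Count);
        for (int i = offset; i < end; i++)
            page.ImportRow(source.Rows[i]);
        yield return page;                 // persist each page, then discard it
    }
}

// Usage: persist each 500-row block before moving to the next.
// foreach (DataTable batch in Page(bigTable, 500)) { Persist(batch); }
```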

If you are filling a DataTable object (which loads the entire result set into memory before processing), you will likely be better off using a DataReader for each of the mentioned providers. That way you can process each row as it is read from the database instead of holding the whole DataTable in memory.

A great answer to another question lists the pros and cons of DataReaders vs. DataTables.

If you're already using DataReaders, ignore this; but your memory problem might come from also storing the retrieved results.
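Something along these lines (a provider-agnostic sketch; the connection, query, and per-row handling are placeholders):

```csharp
using System.Data.Common;

// Streams rows one at a time instead of buffering the whole result
// set in a DataTable. Works for any ADO.NET provider (SqlClient,
// Oracle, MySQL, ...) since they all return a DbDataReader.
// The connection is assumed to be open already.
static void ProcessRows(DbConnection connection, string query)
{
    using (DbCommand command = connection.CreateCommand())
    {
        command.CommandText = query;
        using (DbDataReader reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                // Handle one row, then let it go out of scope.
                for (int i = 0; i < reader.FieldCount; i++)
                {
                    object value = reader.GetValue(i);
                    // ... write/append the value here ...
                }
            }
        }
    }
}
```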

davidsleeps