0

I have the following loops which are iterating for a long time, the queryResult has 397464 rows and each row has 15 columns, so the number of iterations will be 397464*15 = 5961960 + outer loop (397464) = 6359424 iterations.

The problem is that this is taking a very long time resulting page timeouts. Could this be written in a more efficient way?

var rowHtml = String.Empty;

foreach (DataRow row in queryResult.Rows)
{
    rowHtml += "<tr>";
    for (int i = 0; i < queryResult.Columns.Count; i++)
    {
        rowHtml += $"<td>{row[i]}</td>";
    }
    rowHtml += "</tr>";
}
AStopher
  • 4,207
  • 11
  • 50
  • 75
  • 6
    You can dramatically reduce execution time and memory consumption by using [StringBuilder](https://learn.microsoft.com/en-us/dotnet/api/system.text.stringbuilder?view=netcore-3.1) instead of concatenating strings like this. – Zohar Peled Oct 22 '20 at 10:50
  • 1
    A simple idea is to use `StringBuilder` to create your html. It may show you lower times. Also as all rows are independent you may want to process them in parallel, for example, with `Parallel.For`. – Sergey.quixoticaxis.Ivanov Oct 22 '20 at 10:50
  • You might also want to use `string.Join` (but that might prove to be less efficient than string builder). – Zohar Peled Oct 22 '20 at 10:56
  • 8
    What are you planning to do with this HTML? You are surely not going to try and display all 397464 rows to your user at the same time? – John M Oct 22 '20 at 11:00
  • _"397464*15 = 5961960 + outer loop (397464) = 6359424 iterations."_ - you mean concatenations. And then it's 5961960 + 2 * outer loop (397464) ... For the inner loop body you "only" have column*row = 15 * 397464 _iterations_. – Fildor Oct 22 '20 at 11:31
  • ... "resulting page timeouts" ... why are you constructing html like that in the first place? Is this ASP.NET (MVC)? – Fildor Oct 22 '20 at 11:35
  • Perhaps the best choice is to not use that code at all. No browser will display a 400K-row table. No application would want to read an HTML table when instead of eg a CSV or Excel file. What are you trying to do???? – Panagiotis Kanavos Oct 22 '20 at 11:35
  • Are you trying to fake an Excel file? Users hate that. It's also a lot easier to create *real* Excel files using a library like eg `EPPlus` or `ClosedXML`, without installing Excel on the server. You can create a sheet with a single `sheet.Cells.LoadFromDataTable(queryResults)` or `LoadFromCollection`, `LoadFromDataReader` – Panagiotis Kanavos Oct 22 '20 at 11:39
  • An `xlsx` file is a ZIP package containing XML files. It will be *smaller* than the HTML file you try to construct. – Panagiotis Kanavos Oct 22 '20 at 11:40

1 Answers1

-1

Building string: Consider using a StringBuilder. Every time you concatenate strings using the + operator, a new string is created on the heap. This is fine for individual uses, but can be a major slowdown in large workloads like yours. You can specify the StringBuilder's maximum and starting capacities in the constructor, giving you more control over the app's memory usage.

Parallelization: I do not know your app's exact context, but I suggest having a look at the System.Threading.Parallel class. Its For/Foreach methods allow you to iterate over a collection using a threadpool, which can greatly accelerate processing by shifting it to multiple cores.

Be careful though: If the order of elements is relevant, you should divide the workload into packages instead and build substrings for each of those.

Edit: Correction: String concatenation can only truly be parallelized in some rare cases where the exact length of each substring produced by the loop is fixed and known. In this special case, it is possible to write results directly to a large pre-allocated destination buffer. This is perfectly viable when working with char arrays or pointers, but not advisable with normal C# strings or StringBuilders.

Asynchronous Processing: It looks like you are writing some kind of web app or server backend. If your content is required on demand, and does not need to be ready the exact moment the page is loaded, consider displaying a loading bar or some notification along the lines of "please wait", while the page waits for your server to send the finished processing results.

Edit: As suggested in comments, there are better ways to solve this issue than constructing the HTML string from a table. Consider using those instead of some elaborate content loading scheme.

Railbert
  • 16
  • 1
  • Everything after the string builder is wrong and would actually *reduce* performance. All this loop does is concatenate strings. There are no asynchronous operations or anything to parallelize – Panagiotis Kanavos Oct 22 '20 at 11:36
  • @PanagiotisKanavos: May I ask why exactly parallelization would be a bad idea here? The question asked for a faster or more efficient way of doing the operation, and Parallel.For should be able to speed up the operation by dividing it into parts and processing those simultaneously. I know that there are cases (especially in very small workloads) where Parallel.For would just generate unnecessary overhead, but this does not look like that kind of situation to me. Could I ask you to point my mistake? – Railbert Oct 22 '20 at 12:49
  • Because there's nothing to parallelize. The *only* operation is string concatenation, which can't be parallelized without some tricks - like using `Aggregate` with a `StringBuilder` as both the loop and final state. Which then leads to a *lot* of buffer reallocations, so one would have to specify a capacity in advance. But doing that will *prevent* the web server from serving other requests – Panagiotis Kanavos Oct 22 '20 at 12:51
  • Besides, if you check what the OP tries to do, it looks like the real question is how to export 300K rows to Excel in a web application. Doing this properly would be both faser and have less impact on the web app than parallelization. Using a DbDataReader instead of a DataTable would be faster too, as string concatenation is *faster* than reading data from the database. The entire operation could finish right after the last row is read – Panagiotis Kanavos Oct 22 '20 at 12:55
  • You are right, that was indeed a major blunder on my side. The only sensible case where parallelization would be possible is if all substrings had a known fixed length, and if we were writing to a char array instead.I will edit my answer to fix that. – Railbert Oct 22 '20 at 12:59
  • Thanks both, I have used stringbuilder it was really fast but i have got another problem where i need to convert this stringbuilder variable "rowHtml" to a string, i just written rowHtml.ToString(), it immediately throwing a systemoutofmemoryexception due to a massive amount of data, Could you please help out of this and suggest any solution. Thanks – Sharukh Thunder Oct 22 '20 at 15:32
  • StringBuilder.ToString() is indeed what you are looking for, but your string may be too large for one single contiguous buffer. It is possible that your PC does not have enough free memory to store it all at once. In other words: Your string is too long, make it shorter or cut it into pieces. I am not too familiar with your use-case so this might be a bad idea, but consider writing the string to a file stream instead. That *may* alleviate the memory load by throwing it at the HDD instead. That sounds like a mess though, so you may want to rethink your entire approach to the problem instead, – Railbert Oct 23 '20 at 12:02