First I build a list (by reading existing files) of approximately 12,000 objects that look like this:
public class Operator
{
    public string identifier; // e.g. "7/1/2017 MN01 Day"
    public string name1;
    public string name2;
    public string id1;
    public string id2;
}
The identifier will be unique within the list.
Next I run a large query (currently about 4 million rows, but it could be as large as 10 million, with about 20 columns). Then I write all of it to a CSV line by line using a write stream. For each line I loop over the Operator list to find a match and add those columns.
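In simplified form, the write step currently looks something like this (a rough sketch; queryResults, row.Identifier, and BuildCsvLine are placeholder names, not my real code):

using System.IO;
using System.Linq;

using (var writer = new StreamWriter("report.csv"))
{
    foreach (var row in queryResults)
    {
        // linear scan of ~12,000 operators, repeated for every one of the ~4 million rows
        Operator match = operators.FirstOrDefault(o => o.identifier == row.Identifier);
        writer.WriteLine(BuildCsvLine(row, match));
    }
}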
The problem I am having is with performance. I expect this report to take a long time to run, but I've determined that the file-writing step is taking especially long (about 4 hours). I suspect the cause is looping over the Operator list 4 million times.
Is there any way I can improve the speed of this? Perhaps there is something I could do when I build the list initially (indexing or sorting, maybe) that would make searching much faster, as in the sketch below.
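For example, would something like this help? (A minimal sketch using the same placeholder names as above; since identifier is unique within the list, a Dictionary keyed on it should give O(1) average lookups instead of an O(n) scan per row.)

using System.Collections.Generic;
using System.IO;
using System.Linq;

// build the index once, up front; identifier is unique, so ToDictionary won't throw
Dictionary<string, Operator> operatorsById = operators.ToDictionary(o => o.identifier);

using (var writer = new StreamWriter("report.csv"))
{
    foreach (var row in queryResults)
    {
        // O(1) average hash lookup; match is null when no operator exists for the row
        operatorsById.TryGetValue(row.Identifier, out Operator match);
        writer.WriteLine(BuildCsvLine(row, match));
    }
}

With only ~12,000 entries the dictionary costs little memory, and the per-row work drops from scanning thousands of entries to a single hash lookup.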