
I am seeing a strange issue when I call the `Count()` method on an `IEnumerable`.

My code simply reads a CSV file using the CsvHelper package. I can loop through all the CSV records and output the information of each record, until I attempt to get the count before the loop:

StreamReader streamReader = new(_fileName, Encoding.GetEncoding("iso-8859-1"));
CsvReader csvReader = new(streamReader, csvConfig);
            
IEnumerable<WebApi.Entities.Csv.Transaction> records = csvReader.GetRecords<WebApi.Entities.Csv.Transaction>();

System.Diagnostics.Debug.WriteLine($"CSV Record Count: {records.Count()}");   //this line gets the count

foreach (var record in records)
{
    System.Diagnostics.Debug.WriteLine(record.Description);
}

When the count statement is present, the foreach loop never executes; it is as though `records` becomes empty after the count. If I remove the `Count()` statement, the loop iterates without problems.

I also found that if I run the count statement twice, the first statement returns the count, but the second returns 0:

System.Diagnostics.Debug.WriteLine($"CSV Record Count: {records.Count()}");  // returns correct count
System.Diagnostics.Debug.WriteLine($"CSV Record Count: {records.Count()}");  // returns 0

What is going on here and how can I fix it?

Musaffar Patel
  • This is probably a feature/bug of the CSV reader's `IEnumerable` implementation. What `CsvReader` is this? [This one](https://joshclose.github.io/CsvHelper/)? – Sweeper Sep 17 '22 at 11:38
  • It's a "feature". `CsvReader.GetRecords` returns an IEnumerable that reads the provided TextReader in a forward-only fashion upon iteration. Iterating this IEnumerable fully means reading the entire TextReader to its end. Once the TextReader is at its end, iterating over this IEnumerable again won't yield any more records. It's poor behavior for a feature, but it is what it is. The online documentation sucks, it's basically "_the code is the documentation_", with the code viewable on Github: https://github.com/JoshClose/CsvHelper –  Sep 17 '22 at 11:46
  • If you need the `Count` you have to enumerate the whole file anyway. If it's not too large, a viable workaround might be to simply read the records into a `List<T>` by calling `var allRecords = csvReader.GetRecords<T>().ToList()`. Then you can safely use `Count` and enumerate it afterwards. – Tim Schmelter Sep 17 '22 at 11:47
  • @TimSchmelter - Thanks, this solves the issue for me. I'm very new to C#, so am I correct in my understanding that by reading it into a List we are reading the whole file into memory, which is not the case with the IEnumerable? – Musaffar Patel Sep 17 '22 at 11:52
  • @MusaffarPatel: Yes, if you use `ToList` or `ToArray` you populate a collection with the result. `IEnumerable` is just an interface, which `List` also implements. Some methods use a feature called deferred execution, which [`yield`](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/yield)s the items. If that's the case, the `IEnumerable` is just the _query_, the instruction to retrieve items from a sequence. It gets executed at the `Count` or `ToList` or `foreach` (and many other methods). Since `GetRecords` uses a forward-only TextReader, it can only be enumerated once. – Tim Schmelter Sep 17 '22 at 11:54
  • @MusaffarPatel: Unfortunately the documentation of `GetRecords` is poor, but at least the keyword `yield` is mentioned: https://joshclose.github.io/CsvHelper/getting-started/. Whenever you see `yield` or deferred execution mentioned in a method's documentation, remember that the method is not executed immediately. Each time you enumerate the result, you trigger the whole query again. That can be a performance issue, so it is better to fill a collection if you need to consume it more than once. So even if it had worked in this case, you would have had a performance issue, because the file would have had to be processed twice. – Tim Schmelter Sep 17 '22 at 11:59
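
The behavior described in the comments can be reproduced without CsvHelper. The following is a minimal sketch, not CsvHelper's actual implementation: `ReadRecords` is a hypothetical stand-in for `GetRecords` that lazily yields lines from a forward-only `TextReader`, and `SinglePassDemo` is an illustrative name. It shows why the second `Count()` returns 0 and how `ToList()` avoids the problem.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SinglePassDemo
{
    // Hypothetical stand-in for CsvReader.GetRecords<T>(): nothing is read
    // until the sequence is enumerated, and each step pulls one line from
    // the forward-only reader.
    public static IEnumerable<string> ReadRecords(TextReader reader)
    {
        string? line;
        while ((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }

    public static void Main()
    {
        using var reader = new StringReader("alpha\nbeta\ngamma");
        IEnumerable<string> records = ReadRecords(reader);

        // Count() enumerates the sequence, which reads the reader to its end.
        int firstCount = records.Count();

        // The second enumeration runs against the same, already exhausted
        // reader, so there is nothing left to yield.
        int secondCount = records.Count();

        Console.WriteLine($"{firstCount} {secondCount}"); // prints "3 0"

        // Workaround from the comments: materialize once, then reuse the list.
        using var freshReader = new StringReader("alpha\nbeta\ngamma");
        List<string> allRecords = ReadRecords(freshReader).ToList();

        Console.WriteLine(allRecords.Count); // prints "3"
        foreach (string record in allRecords)
        {
            Console.WriteLine(record);
        }
    }
}
```

Applied to the code in the question, this would mean calling `.ToList()` on the result of `csvReader.GetRecords<WebApi.Entities.Csv.Transaction>()` once and using the list's `Count` property before the `foreach`. The trade-off is that the whole file is held in memory, which may not be acceptable for very large files.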
