I'm using CsvHelper to read a CSV file.

This is my code (pretty simple):

using (var reader = new StreamReader("example.csv"))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
    var records = csv.GetRecords<CsvData>();
    int i = 0;
    foreach (var record in records)
    {                    
        i++;
        Console.WriteLine($"Processed {i}/{records.Count()} records."); 
    }
}
Console.WriteLine("Script finished");

The problem is that my code never enters the foreach, so it prints nothing. I put a breakpoint on the i++; line, but it is never hit.

If I print records.Count(), it returns 3.

This could be an example of a CSV file:

Size,Color
8,Yellow
2,Orange
13,Blue

And this could be an example of class CsvData:

public class CsvData
{        
    public decimal? Size { get; set; }
    public string Color { get; set; }
}

How should I iterate my rows parsing into my CsvData class creating a List<CsvData> or similar?

kuhi

2 Answers

@Joel Coehoorn is correct. As soon as you call .Count(), you tell CsvHelper to read the whole CSV file in order to find out how many records it contains. You are now at the end of the data stream and there are no more records left to read. Calling .ToList() also reads the whole CSV file, but this time it saves the records in memory in the records variable. This is fine if your file is small, but you could run into memory issues with a very large file.

Per the Getting Started instructions:

The GetRecords<T> method will return an IEnumerable<T> that will yield records. What this means is that only a single record is returned at a time as you iterate the records. That also means that only a small portion of the file is read into memory. Be careful though. If you do anything that executes a LINQ projection, such as calling .ToList(), the entire file will be read into memory. CsvReader is forward only, so if you want to run any LINQ queries against your data, you'll have to pull the whole file into memory. Just know that is what you're doing.
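The forward-only behaviour can be reproduced without CsvHelper. What follows is a minimal sketch, not CsvHelper's actual implementation: ReadLines is a hypothetical iterator over a StringReader that stands in for GetRecords<T>. Calling .Count() drains the reader, so a later foreach finds nothing left to read.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ForwardOnlyDemo
{
    // Hypothetical stand-in for GetRecords<T>: lazily yields lines from a shared reader.
    public static IEnumerable<string> ReadLines(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }

    public static void Main()
    {
        using (var reader = new StringReader("8,Yellow\n2,Orange\n13,Blue"))
        {
            var records = ReadLines(reader);

            // Count() enumerates the sequence, which reads the reader to the end.
            Console.WriteLine(records.Count()); // prints 3

            // A second enumeration runs against the same, already drained reader.
            var looped = 0;
            foreach (var record in records)
            {
                looped++;
            }
            Console.WriteLine(looped); // prints 0
        }
    }
}
```

The loop body never runs, which matches the breakpoint on i++ never being hit once the count has been requested.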

Option 1

You have already discovered that you can call List<CsvData> records = csv.GetRecords<CsvData>().ToList(); and bring all the records into memory. Just understand that is what you are doing. I would also put the count into a variable, var count = records.Count;, rather than calling records.Count() on every iteration. On a List<T> both are cheap, because LINQ's Count() uses the list's stored count instead of enumerating it, but the variable makes the intent clearer.

Option 2

Don't get the count at the beginning. Just give a total at the end.
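A minimal sketch of this option, assuming the CsvHelper package is referenced and using the CsvData class from the question. The Option2Demo class and its ProcessAll helper are hypothetical names introduced only for illustration. The file is streamed once and the total is reported after the loop.

```csharp
using System;
using System.Globalization;
using System.IO;
using CsvHelper;

public class CsvData
{
    public decimal? Size { get; set; }
    public string Color { get; set; }
}

public static class Option2Demo
{
    // Hypothetical helper: streams the records once and returns how many were read.
    public static int ProcessAll(TextReader reader)
    {
        var count = 0;
        using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
        {
            // Records are yielded one at a time, so the whole file is never held in memory.
            foreach (var record in csv.GetRecords<CsvData>())
            {
                count++;
                Console.WriteLine($"Processed record {count}: {record.Size}, {record.Color}");
            }
        }
        return count;
    }

    public static void Main()
    {
        using (var reader = new StreamReader("example.csv"))
        {
            var total = ProcessAll(reader);
            Console.WriteLine($"Processed {total} records in total.");
        }
    }
}
```

The trade-off is that the progress message can no longer show "i of n", since n is only known once the stream has been read to the end.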

Option 3

Loop through the file twice. Once to get the count and the 2nd time to get the data.

void Main()
{
    var count = 0;

    using (var reader = new StreamReader("example.csv"))
    using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
    {
        count = csv.GetRecords<CsvData>().Count();
    }
    
    using (var reader = new StreamReader("example.csv"))
    using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
    {
        var records = csv.GetRecords<CsvData>();
        int i = 0;
        foreach (var record in records)
        {
            i++;
            Console.WriteLine($"Processed {i}/{count} records.");
        }
    }
}

public class CsvData
{
    public int Size { get; set; }
    public string Color { get; set; }
}
David Specht

Converting the collection to a list worked:

List<CsvData> records = csv.GetRecords<CsvData>().ToList();

Result of code (for lazy people)

using (var reader = new StreamReader("example.csv"))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
    var records = csv.GetRecords<CsvData>().ToList();
    int i = 0;
    foreach (var record in records)
    {                    
        i++;
        Console.WriteLine($"Processed {i}/{records.Count()} records."); 
    }
}
Console.WriteLine("Script finished");
kuhi
  • This forces everything to load into memory all at once, which kind of defeats the purpose of using streams, but I guess if it solves your problem.. – Joel Coehoorn Mar 24 '22 at 20:37
  • While this may have solved your problem, it doesn't explain to future readers ***how*** it solved your problem. You should put in some further explanation. – Enigmativity Mar 24 '22 at 21:40
  • @Enigmativity I've shared the full code + explained the fix with code included + example of result, what else can I do? I just added the full code result to make more obvious but I think it was already shared – kuhi Mar 25 '22 at 10:46
  • @kuhi - My point is that you haven't explained the fix. You've only shown the fix. David's answer explains why your original code failed and how the `.ToList()` fixes it. It's a better answer because it clearly articulates why this approach works. – Enigmativity Mar 25 '22 at 11:15
  • oh but that was edited after my answer... I would not post my answer in that case but it wasn't including the `.ToList()` solution at the moment I've edited it – kuhi Mar 25 '22 at 11:47