31

I have a CSV file with field headers and some of them contain two or three words separated by spaces:

Screenshot of the first few rows of a spreadsheet in a spreadsheet application, including headers that contain spaces, as described above.

You can see in the above picture the field headers that contain spaces:
"Time of Day", "Process Name", and "Image Path".

When I tried to read the CSV by calling reader.GetRecords<DataRecord>(); (where DataRecord is a class I have defined), I get the error:

Fields 'TimeOfDay' do not exist in the CSV file."*

This is because my DataRecord class can't contain members with spaces.

How can I use CsvHelper to parse the CSV file?

Mark Amery
  • 143,130
  • 81
  • 406
  • 459
E235
  • 11,560
  • 24
  • 91
  • 141

3 Answers3

43

Based on CsvHelper Documentation, there are several ways that we can achieve our desired results.

1. Ignore White Space from Headers (which I believe should solve your problem easily)

In CsvHelper 3 or later, use PrepareHeaderForMatch (documented at http://joshclose.github.io/CsvHelper/configuration#headers) to remove whitespace from headers:

csv.Configuration.PrepareHeaderForMatch =
    header => Regex.Replace(header, @"\s", string.Empty)

In CsvHelper 2, set the IgnoreHeaderWhiteSpace flag which tells the reader to ignore white space in the headers when matching the columns to the properties by name.

reader.Configuration.IgnoreHeaderWhiteSpace = true;

2. Read Manually

We can read each field manually like:

var reader = new CsvReader(sr);
do
{
    reader.Read();                   
    var record=new DataRecord();

    record.TimeOfDay=reader.GetField<string>("Time of Day");
    record.ProcessName=reader.GetField<string>("Process Name");
    record.PID=reader.GetField<string>("PID");
    record.Operation=reader.GetField<string>("Operation");
    record.Path=reader.GetField<string>("Path");
    record.Result=reader.GetField<string>("Result");
    record.Detail=reader.GetField<string>("Detail");
    record.ImagePath=reader.GetField<string>("Image Path");

} while (!reader.IsRecordEmpty());

3. Class Mapping:

We can manually map between our class's properties and the headings in the CSV file using name class mapping like this:

public sealed class DataRecordMap:CsvClassMap<DataRecord>
{
    public DataRecordMap()
    {
         Map( m => m.TimeOfDay).Name("Time Of Day");
         Map( m => m.ProcessName).Name("Process Name");
         Map( m => m.PID).Name("PID");   
         Map( m => m.Operation).Name("Operation");    
         Map( m => m.Path).Name("Path");
         Map( m => m.Result).Name("Result");
         Map( m => m.Detail).Name("Detail");
         Map( m => m.ImagePath).Name("Image Path");
     }
}

Then we should register it using:

reader.Configuration.RegisterClassMap<DataRecordMap>();
Mark Amery
  • 143,130
  • 81
  • 406
  • 459
user3473830
  • 7,165
  • 5
  • 36
  • 52
  • 2
    While it looks good, it seems Method 1 does not work (anymore). The expected function now takes two parameters. After adding one, it however seems the returned values are not made use of, the method has no effect. – kjyv Mar 08 '19 at 16:01
  • 3
    @kjyv I got it working using `csv.Configuration.PrepareHeaderForMatch = (header, index) => Regex.Replace(header, @"\s", string.Empty);` – fuzzy_logic Mar 24 '19 at 09:41
  • Just curious if anyone has written a "column name mismatch debugger" so that instead of throwing an error saying it couldnt find a match, it tried to find any columns matching any of the words in the map. For example, I have a column named "Rating Agency" and got an error nothing existed. While I want it to fail hard in development, I would also LOVE if in development it said, "Oh, hey John, the following columns partially match: "Rating Agencies, Ratings" – John Zabroski May 17 '19 at 15:09
  • Method 1 should use class PrepareHeaderForMatchArgs so example sould look like this: `csv.Configuration.PrepareHeaderForMatch = header => Regex.Replace(header.Header, @"\s", string.Empty)` – Boguslaw Aug 16 '21 at 12:02
34

The library supports attributes now. You would want to use the Name attribute.

using CsvHelper.Configuration.Attributes;

public class DataRecord
{
    [Name("Time of Day")]
    public string TimeOfDay { get; set; }

    [Name("Process Name")]
    public string ProcessName { get; set; }

    public string PID { get; set; }
    public string Operation { get; set; }
    public string Path { get; set; }
    public string Result { get; set; }
    public string Detail { get; set; }

    [Name("Image Path")]
    public string ImagePath { get; set; }

    public static IEnumerable<DataRecord> ParseDataRecords(Stream file)
    {
        using (var sr = new StreamReader(file))
        using (var csv = new CsvReader(sr))
        {
            foreach (var record in csv.GetRecords<DataRecord>())
            {
                yield return record;
            }
        }
    }
}
JPK
  • 683
  • 6
  • 7
2

After lots of struggling with this, here's what worked for me. I"m using version 3.

        var config = new CsvConfiguration(CultureInfo.InvariantCulture)
        {
            PrepareHeaderForMatch = args => args.Header.Replace(" ","")
        };

        using (var reader = new StreamReader("my_csv_file.csv"))
        using (var csv = new CsvReader(reader, config))

I believe I'm creating a configuration (config) that appends or modifies InvariantCulture and uses the Replace function in C# to remove the spaces. So far it's working for me.