Parse CSV file in C# - skip any row that does not match one of two IF conditions

Question

I am using the following C# code to parse a CSV file, after which the results will be exported to a SQL Server database.

What I am trying to do with the if statement is say: "if the value in column 1 is 1, parse that row one way; if the value in column 1 is 2, parse that row another way." What I then need to add, where I have the comment in the code below, is: "otherwise, just skip that row of the CSV file."

        public List<Entity> ParseCsvFile(List<string> entries, string urlFile)
        {
            entries.RemoveAt(entries.Count - 1);
            entries.RemoveAt(0);
            List<Entity> entities = new List<Entity>();
            foreach (string line in entries)
            {
                Entity CsvFile = new Entity();
                string[] lineParts = line.Split(',');

                if (lineParts[1] == "1")
                {
                    CsvFile.Identifier = $"{lineParts[2]}";
                    CsvFile.SourceId = $"{lineParts[3]}";
                    CsvFile.Name = $"{lineParts[5]} {lineParts[6]} {lineParts[7]} {lineParts[8]} " +
                      $"{lineParts[9]} {lineParts[10]}";
                    entities.Add(CsvFile);
                }
                else if (lineParts[1] == "2")
                {
                    CsvFile.Identifier = $"{lineParts[11]}";
                    CsvFile.SourceId = $"{lineParts[12]}";
                    CsvFile.Name = $"{lineParts[13]} {lineParts[14]} {lineParts[15]}";
                    entities.Add(CsvFile);
                }

            //Need to put code here that says "otherwise, skip this line of the CSV file."

            }
            return entities;
        }

Aren't you ignoring it already? If the first item is 1 or 2, you process it and add it to `entities`. If it isn't, you don't. I don't understand what the issue is. — , Sep 26 '19 at 13:25
You can use CsvReader and implement ShouldSkipRecord to make condition, more information [here](https://stackoverflow.com/questions/57990749/how-to-ignore-empty-rows-in-csv-when-reading/57990923#57990923) — Phat Huynh, Sep 26 '19 at 13:27
That aside, `Entity CsvFile` is a bit confusing. Wouldn't `Entity entity` be clearer? The name `CsvFile` looks like a class name and implies it represents an entire file rather than one entity from that file. — , Sep 26 '19 at 13:30
@Amy regarding your first comment, for most of the rows I want to skip (not all, but most) the value for `lineParts[1]` is just blank. It is causing an `Unhandled Exception` with the current code structure when it encounters such a row. — Stpete111, Sep 26 '19 at 13:37
@Stpete111 That's very relevant information that needs to be in the question. What is the length of `lineParts` when that happens? What specific exception type is being thrown, and from which line? — , Sep 26 '19 at 13:39

score 2 · Accepted Answer · answered Sep 26 '19 at 15:44

Based on this comment, I infer that at least part of your problem is not the syntax of the if statements, but rather that the element you're looking for in the array simply doesn't exist (e.g. if the whole row is blank, or at least has no commas). In that case `line.Split(',')` returns a single-element array, so `lineParts[1]` throws an `IndexOutOfRangeException`.

Assuming that's the case, then this approach would be more reliable (this will ignore lines that don't have a second field, as well as those where the field doesn't contain an integer value, in case that was yet another issue you might have run into at some point):

if (lineParts.Length < 2 || !int.TryParse(lineParts[1], out int recordType))
{
    continue;
}

if (recordType == 1)
{
    CsvFile.Identifier = $"{lineParts[2]}";
    CsvFile.SourceId = $"{lineParts[3]}";
    CsvFile.Name = $"{lineParts[5]} {lineParts[6]} {lineParts[7]} {lineParts[8]} " +
      $"{lineParts[9]} {lineParts[10]}";
    entities.Add(CsvFile);
}
else if (recordType == 2)
{
    CsvFile.Identifier = $"{lineParts[11]}";
    CsvFile.SourceId = $"{lineParts[12]}";
    CsvFile.Name = $"{lineParts[13]} {lineParts[14]} {lineParts[15]}";
    entities.Add(CsvFile);
}

For what it's worth, an expression like `$"{lineParts[2]}"`, where `lineParts` is already a `string[]`, is pointless and inefficient. The `string.Join()` method is helpful if all you want to do is concatenate string values using a particular separator. So your code could be simplified a bit (`Skip()` and `Take()` require `using System.Linq;`):

if (lineParts.Length < 2 || !int.TryParse(lineParts[1], out int recordType))
{
    continue;
}

if (recordType == 1)
{
    CsvFile.Identifier = lineParts[2];
    CsvFile.SourceId = lineParts[3];
    CsvFile.Name = string.Join(" ", lineParts.Skip(5).Take(6));
    entities.Add(CsvFile);
}
else if (recordType == 2)
{
    CsvFile.Identifier = lineParts[11];
    CsvFile.SourceId = lineParts[12];
    CsvFile.Name = string.Join(" ", lineParts.Skip(13).Take(3));
    entities.Add(CsvFile);
}
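
For reference, here is a minimal, self-contained sketch of the whole loop with the guard in place. Several things in it are assumptions and not taken from your code: the `Entity` class is a hypothetical stand-in with just the three properties you assign, the class name `CsvRowFilter` is made up, the unused `urlFile` parameter and the removal of the first and last entries are left out, and I have added a length check to each branch so that a row with a valid record type but too few fields is skipped as well instead of throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's Entity class, which is not shown.
public class Entity
{
    public string Identifier { get; set; } = "";
    public string SourceId { get; set; } = "";
    public string Name { get; set; } = "";
}

public static class CsvRowFilter
{
    public static List<Entity> ParseCsvFile(IEnumerable<string> entries)
    {
        var entities = new List<Entity>();
        foreach (string line in entries)
        {
            string[] lineParts = line.Split(',');

            // Skip rows that have no second field, or whose second field
            // is not an integer (this covers blank rows and blank fields).
            if (lineParts.Length < 2 || !int.TryParse(lineParts[1], out int recordType))
            {
                continue;
            }

            // Record type 1 reads fields up to index 10.
            if (recordType == 1 && lineParts.Length > 10)
            {
                entities.Add(new Entity
                {
                    Identifier = lineParts[2],
                    SourceId = lineParts[3],
                    Name = string.Join(" ", lineParts.Skip(5).Take(6)),
                });
            }
            // Record type 2 reads fields up to index 15.
            else if (recordType == 2 && lineParts.Length > 15)
            {
                entities.Add(new Entity
                {
                    Identifier = lineParts[11],
                    SourceId = lineParts[12],
                    Name = string.Join(" ", lineParts.Skip(13).Take(3)),
                });
            }
            // Any other record type, or a row that is too short, falls through
            // and is skipped without adding anything.
        }
        return entities;
    }
}
```

Creating the `Entity` only inside the branches, rather than at the top of the loop, avoids allocating an object for rows that end up being skipped.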

Finally, consider not trying to parse CSV with your own code. The logic you have implemented will work only for the simplest examples of CSV. If you have complete control over the source and can ensure that the file will never have to do things like quote commas or quotation mark characters, then it may work okay. But most CSV data comes from sources outside one's control, and it's important to make sure you can handle all the variants found in CSV. See "Parsing CSV files in C#, with header" for good information on how to do that.
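
To illustrate the quoting problem, here is a small sketch using `TextFieldParser` from the `Microsoft.VisualBasic.FileIO` namespace, which is one of several existing parsers and not necessarily the one the linked question recommends. The class name `QuotedCsvDemo` and the sample line are made up for illustration, and I am assuming your project can reference the `Microsoft.VisualBasic` assembly (it ships with .NET Framework and, as far as I know, with .NET Core 3.0 and later).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class QuotedCsvDemo
{
    // Reads every record from CSV text, treating commas inside
    // double-quoted fields as data rather than as separators.
    public static List<string[]> ReadRecords(string csvText)
    {
        var records = new List<string[]>();
        using (var parser = new TextFieldParser(new StringReader(csvText)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                // ReadFields returns the fields of the next record.
                var fields = parser.ReadFields();
                if (fields != null)
                {
                    records.Add(fields);
                }
            }
        }
        return records;
    }
}
```

For a line such as `x,1,"Smith, John",src`, `line.Split(',')` produces five parts because it splits inside the quotes, whereas `ReadRecords` returns a single record with four fields, the third being `Smith, John`. The record-type check from earlier in this answer can then be applied to each `string[]` exactly as before.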
