2

I'm surprised that I haven't seen anything about this on here (or maybe I missed it). When parsing a CSV file, if there are rows with no data, how can/should that be handled? I'm not talking about blank rows, but empty rows, for example:

ID,Name,Quantity,Price
1,Stuff,2,5
2,Things,1,2.5
,,,
,,,
,,,

I am using TextFieldParser to handle commas in data, multiple delimiters, etc. The two solutions I've thought of is to either use ReadLine instead of ReadFields, but that would remove the benefits of using the TextFieldParser, I'd assume, because then I'd have to handle commas a different way. The other option would be to iterate through the fields and drop the row if all of the fields are empty. Here's what I have:

dttExcelTable = new DataTable();

using (TextFieldParser parser = new TextFieldParser(fileName))
{
    parser.Delimiters = new string[] { ",", "|" };

    string[] fields = parser.ReadFields();
    if (fields == null)
    {
       return null;
    }
    foreach (string columnHeader in fields)
    {
       dttExcelTable.Columns.Add(columnHeader);
    }

    while (true)
    {
        DataRow importedRow = dttExcelTable.NewRow();
        fields = parser.ReadFields();
        if (fields == null)
        {
            break;
        }
        for (int i = 0; i < fields.Length; i++)
        {
            importedRow[i] = fields[i];
        }

        foreach (var field in importedRow.ItemArray)
        {
            if (!string.IsNullOrEmpty(field.ToString()))
            {
                dttExcelTable.Rows.Add(importedRow);
                break;
            }
        }

    }
}
Relevant
  • 148
  • 1
  • 15

4 Answers4

3

Without using a thirdy party CSV reader you could change your code in this way

.....
DataRow importedRow = dttExcelTable.NewRow();
for (int i = 0; i < fields.Length; i++)
    importedRow[i] = fields[i];

if(!importedRow.ItemArray.All (ia => string.IsNullOrWhiteSpace(ia.ToString())))
     dttExcelTable.Rows.Add(importedRow);

Using the All IEnumerable extension you could check every element of the ItemArray using string.IsNullOrWhiteSpace. If the return is true you have an array of empty string and you could skip the Add

Steve
  • 213,761
  • 22
  • 232
  • 286
0

You can just replace commas in the line by nothing and test this if it is null.

strTemp = s.Replace(",", "");

if (!String.IsNullOrEmpty(strTemp)) { /*code here */}

http://ideone.com/8wKOVD

Andrew Paes
  • 1,940
  • 1
  • 15
  • 20
0

Doesn't seem like there's really a better solution than the one that I provided. I will just need to loop through all of the fields and see if they are all empty before adding it to my datatable.

The only other solution I've found is Steve's answer, which is to not use TextFieldParser

Relevant
  • 148
  • 1
  • 15
0

I know this is literally years later, but I recently had this issue and was able to find a workaround similar to previous responses. You can see the whole flushed out function

        public static DataTable CSVToDataTable(IFormFile file)
        {
            DataTable dt = new DataTable();
            using (StreamReader sr = new StreamReader(file.OpenReadStream()))
            {
                string[] headers = sr.ReadLine().Split(',');
                foreach (string header in headers)
                {
                    dt.Columns.Add(header);
                }

                var txt = sr.ReadToEnd();
                var stringReader = new StringReader(txt);
                TextFieldParser parser = new TextFieldParser(stringReader);
                parser.HasFieldsEnclosedInQuotes = true;
                parser.SetDelimiters(",");

                while (!parser.EndOfData)
                {
                    
                    string[] rows = parser.ReadFields();
                    string tmpStr = string.Join("", rows);
                    if (!string.IsNullOrWhiteSpace(tmpStr))
                    {
                        DataRow dr = dt.NewRow();
                        for (int i = 0; i < headers.Length; i++)
                        {
                            dr[i] = rows[i];
                        }
                        dt.Rows.Add(dr);
                    }                   
                }
            }
            return dt;
        }

It works for me and has proven fairly reliable. The main snippet is found in the WHILE loop after calling .ReadFields()--I join the returned rows to a string and then check if its nullorempty. Hopefully this can help someone who stumbles upon this.

Andrew
  • 1