0

I am reading the contents of the CSV to a string as below:

string csvData = string.Empty;

using (var reader = new System.IO.StreamReader(file.OpenReadStream()))
            using (ExcelPackage package = new ExcelPackage())
{
  csvData = reader.ReadToEnd();
  int totalLength = csvData.TrimEnd('|').Split('|').Length;
  string[] result = null;
  result = csvData.TrimEnd('|').Split('|');

  if (String.IsNullOrEmpty(result[totalLength-1].Replace(",", "").Trim()))
  {
     result = result.Take(result.Count() - 1).ToArray();
  } 

 //do some processing to the result here.
}   

So below is the contents of my sample csvData:

123,a,b,3|456,c,d,5|111,acd,55,c1|,,,,

If you see the sample above it does contain the last empty row as it comes out from CSV. To remove the above empty row I use the above code that I have posted.

This all works fine. But the issue comes when I have more than one empty row as example below:

123,a,b,3|456,c,d,5|111,acd,55,c1|,,,,|,,,,|,,,,

With the above input my code just removes one empty row.

What I want that the result should have no empty rows as below:

123,a,b,3|456,c,d,5|111,acd,55,c1

How can I remove all the empty rows from my array.

Thanks

user1563677
  • 713
  • 3
  • 15
  • 38
  • 3
    There are many great libraries such as CSVHelper which will not only parse your file, but store it as typed data as an IEnumerable. No fumbling with arrays required – Ňɏssa Pøngjǣrdenlarp Nov 21 '18 at 18:02
  • Encapsulate your logic for detecting an empty line into a method called IsEmpty. then File.ReadAllLines(filename).Where(x => !IsEmpty(x)) will give you all non-empty lines which you can then process. –  Nov 21 '18 at 18:04
  • @Disaffected1070452 I am dealing with both excel and csv files here. I had used EPPlus for excel but then that doesnt handles csv so I used this option. The only thing I can do if I dont want to use above code is to look at the file extension and if its excel use EPPlus else use csvhelper – user1563677 Nov 21 '18 at 18:06
  • @Will sorry could you give me an example please. – user1563677 Nov 21 '18 at 18:12
  • @Disaffected1070452 I just tried using CSVHelper but the same issue is there. I used this code: https://stackoverflow.com/questions/33294738/read-all-values-from-csv-into-a-list-using-csvhelper the first answer in the above post. There also it reading the empty line and adding to result. – user1563677 Nov 21 '18 at 18:23
  • Why not use replace before you parse the string? – Tony Dong Nov 21 '18 at 18:29

3 Answers3

2

I advise using CSVHelper. CSV is not such a simple format as it seems.

With CSVHelper you can do this:

using (var csv = new CsvReader(reader))
{
    csv.Configuration.SkipEmptyRecords = true;
    var records = csv.GetRecords<Foo>().ToArray();
}
smolchanovsky
  • 1,775
  • 2
  • 15
  • 29
0

I'm almost certain the people advising you to CSVHelper or some other tool are correct, but if I were to do it by hand and wasn't absurdly concerned with performance, I'd do it like this:

    private void func()
    {
        string input = "123,a,b,3|456,c,d,5|111,acd,55,c1|,,,,|,,,,|,,,,";

        List<IEnumerable<string>> parsedLines = new List<IEnumerable<string>>();
        foreach (string line in input.TrimEnd('|').Split('|')) //foreach row
            parsedLines.Add(line.Split(',')); //add that as a list of columns
        //select rows that have at least one column with text
        var result = parsedLines.Where(line => line.Any(field => !string.IsNullOrEmpty(field)));
    }

If the goal is to throw away the empty lines BEFORE turning each into a collection of columns, this would work:

        string input = "123,a,b,3|456,c,d,5|111,acd,55,c1|,,,,|,,,,|,,,,";

        var unparsedLines = input.TrimEnd('|').Split('|');
        Regex re = new Regex(@"[^,\s]", RegexOptions.Compiled); //search for any char that is not a comma or whitespace
        var result = unparsedLines.Where(o => re.Match(o).Success);
zzxyz
  • 2,953
  • 1
  • 16
  • 31
0

also

s = ",,,|1,2,3|,,,,|,,,|4,56|,,,|,,|,,,,,";
var sprev = s; string res;
while(true)
{
    var snew = Regex.Replace(sprev, "(\\||^),{2,}(\\||$)","|");
    if(snew == sprev) 
    {
        res = snew.Trim('|');
        break;
    }
    sprev = snew;
}
AndrewF
  • 33
  • 3