0

I have a csv that looks like:

Column1,Column2,Column3,Column4,Column5,Column6,Column7,Column7,Column8,Column9,
45.50334645,5640192,3915776,52633600,351924224,12354,90505216,78790656,247287808, 
39.23091283,5640192,3915776,52633600,349986816,4562,90505216,78790656,247287808, 
25.26042,5640192,3915776,52633600,349986816,   ,90505216,78790656,247287808,

I need to get the MIN, MAX, and Average from each column. I am using LINQ to do this since the CSV's can be quite large.

Here is the current code I have.

var lines = System.IO.File.ReadAllLines(csvPath);
var columns = lines[0].Split(',');
for (int i = 1; i < columns.Count(); i++)
{
    var columnQuery = from line in lines
                        let elements = line.Split(',')
                        select Convert.ToDouble(elements[i]);

    var results = columnQuery.ToList();
    var min = results.Min();
    var max = results.Max();
    var avg = results.Average();
}

This will error out on the highlighted value in the csv since it is considered a Datetime.

The error I get is "Input string was not in a correct format."

Thanks for any help.

MarcinJuraszek
  • 124,003
  • 15
  • 196
  • 263
  • 1
    I don't get it. There's a DateTime in there? Why not use `DateTime.TryParse` than? – Steven Aug 18 '14 at 20:00
  • And if the file is large, use `File.ReadLines` instead. That will return an enumerator and will not load the whole file at once (what `File.ReadAllLines` does do). – Steven Aug 18 '14 at 20:00
  • It doesn't have a datetime. It just thinks it is because it is a blank field. – user1241394 Aug 18 '14 at 20:04

2 Answers2

1

You can use ternary operator:

var columnQuery = from line in lines
                  let elements = line.Split(',')
                  select string.IsNullOrWhiteSpace(elements[i]) ? 0. : Convert.ToDouble(elements[i]);
MarcinJuraszek
  • 124,003
  • 15
  • 196
  • 263
  • This works. I was trying to mimic excel functions of MIN, MAX, and Average. And I just looked and saw they also use 0 if the data is missing. Thanks. – user1241394 Aug 18 '14 at 20:11
0

Have you tried using Double.TryParse instead of Convert.ToDouble to avoid exception or to explicitly filter out empty rows with where elements[i].Trim() != "" before select?

Krizz
  • 11,362
  • 1
  • 30
  • 43