
I have a file with an int value on each line (although it's possible that some lines are not ints, like comments). The structure of the file is:

1
2
3
4
5
6
7
#some comment
9
10
etc...

What's the fastest way to convert it to an IEnumerable<int>? I could read line by line, use a List<int>, and call its Add method, but I guess that's not the best in terms of performance.
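To illustrate, the manual approach I mean looks roughly like this (the file name is just a placeholder):

List<int> values = new List<int>();
foreach (string line in File.ReadLines("numbers.txt"))  // placeholder file name
{
    int value;
    if (int.TryParse(line, out value))  // skip comments and other non-int lines
        values.Add(value);
}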

Thanks

dragonfly

4 Answers


You could create your IEnumerable on-the-fly while reading the file:

IEnumerable<Int32> GetInts(string filename)
{
    int tmp = 0;
    // File.ReadLines streams the file lazily, one line at a time
    foreach (string line in File.ReadLines(filename))
        if (Int32.TryParse(line, out tmp))  // skips comments and other non-int lines
            yield return tmp;
}

This way, you can do whatever you want with your integers while reading the file, using a foreach loop.

foreach (int i in GetInts(@"yourfile"))
{
    // ... do something with i ...
}

If you just want to create a list, simply use the ToList extension:

List<Int32> myInts = GetInts(@"yourfile").ToList();

but there probably won't be any measurable performance difference if you "manually" create a list as you described in your question.

sloth
var lines = File.ReadLines(path).Where(l => !l.StartsWith("#"));   

You can also append .Select(x => int.Parse(x)) to get the values as ints.
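Putting the two together, a rough sketch of the whole pipeline (path is assumed to hold your file name; note that int.Parse will still throw on blank or otherwise non-numeric lines that don't start with #):

IEnumerable<int> ints = File.ReadLines(path)
    .Where(l => !l.StartsWith("#"))
    .Select(x => int.Parse(x));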

L.B
  • I'd suggest `File.ReadLines` instead of `File.ReadAllLines`, for lazy loading. – Kevin Gosse Aug 31 '12 at 09:33
  • are you sure this is faster than using a loop? – Massimiliano Peluso Aug 31 '12 at 09:35
  • @MassimilianoPeluso you are doing disk IO and sequential reads. How much improvement do you expect? – L.B Aug 31 '12 at 09:36
  • @MassimilianoPeluso but it is using a loop. It's likely to be slightly slower, but so slightly that it gets lost compared to the IO. While I went with the loop approach, I certainly wouldn't argue one over the other on performance, as long as it works sequentially, which it does. – Jon Hanna Aug 31 '12 at 09:42
  • @L.B I was not talking about the disk IO, where there's nothing to gain; I was talking about using LINQ instead of a while/for loop, and a plain loop is always faster than LINQ. In the past I had a similar problem, but the file contained millions of ints. I could not improve the IO access, but using a classic for loop was a lot faster than LINQ – Massimiliano Peluso Aug 31 '12 at 10:30
  • @MassimilianoPeluso LINQ only brings an overhead of a few more function calls, which is nothing compared to disk IO. I don't know what you did in the past, but I think you can easily test it. (You can even ask it here on SO) – L.B Aug 31 '12 at 10:39
public static IEnumerable<int> ReadInts(TextReader tr)
{
  // a using block here would let this method manage cleanup,
  // but doing that in the calling method is probably better
  for (string line = tr.ReadLine(); line != null; line = tr.ReadLine())
    if (line.Length != 0 && line[0] != '#')  // skip blank lines and comments
      yield return int.Parse(line);
}

I assume from your description that a line that doesn't match should throw an exception, but I also guessed that stray blank lines are fairly common, so I do catch that case. Adapt as appropriate otherwise.
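For example, a minimal sketch of calling it with the using block kept in the calling method, as suggested above (the file name is just a placeholder):

using (TextReader tr = new StreamReader("numbers.txt"))  // placeholder file name
{
    foreach (int i in ReadInts(tr))
    {
        // ... do something with i ...
    }
}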

Jon Hanna

If you want to add lines only if they are convertible to ints, you could use int.TryParse. I suggest using File.ReadLines instead of File.ReadAllLines (which creates an array in memory):

int value;
IEnumerable<string> lines = File.ReadLines(path)
    .Where(l => int.TryParse(l.Trim(), out value));

or (if you want to select those ints):

int value;
IEnumerable<int> ints = File.ReadLines(path)
    .Where(l => int.TryParse(l.Trim(), out value))
    .Select(l => value);
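Note that the shared value variable is safe here because LINQ to Objects streams element by element: for each line, the Where predicate runs TryParse (assigning value) right before the Select projection reads it. A minimal usage sketch:

foreach (int i in ints)
    Console.WriteLine(i);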
Tim Schmelter