5

Using LINQ, what is an efficient way to get each string from a tab-delimited .txt file (and then get each word, which is usually what string.Split(...) does)?

var v = from line in File.ReadAllLines(filename)
   select line;

I believe this is part of the solution. I don't mind if it uses yield return.

EDIT: I've also seen threads on here detailing exactly what I am trying to do, but can't find them.

GurdeepS

2 Answers

7

I'm not entirely sure what you're asking, but it sounds like you're trying to get every word from a tab-delimited file as an IEnumerable<string>. If so, try the following:

var query = File.ReadAllLines(somePathVariable)
                .SelectMany(x => x.Split(new char[] { '\t' }));
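Since the question mentions yield return, the same result can also be produced by a hand-written iterator. The following is only a minimal sketch, not part of the original answer: the class name TabFile and the method ReadWords are hypothetical, and it reads from a TextReader so the demo can run on an in-memory string instead of a real file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class TabFile
{
    // Hypothetical helper: lazily yields every tab-separated field,
    // reading one line at a time from the supplied reader.
    public static IEnumerable<string> ReadWords(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            foreach (string word in line.Split('\t'))
            {
                yield return word;
            }
        }
    }

    public static void Main()
    {
        // For a real file, pass a StreamReader opened on its path instead.
        using (var reader = new StringReader("a\tb\nc\td"))
        {
            foreach (string word in ReadWords(reader))
            {
                Console.WriteLine(word);
            }
        }
    }
}
```

Because the iterator is lazy, nothing is read until the caller starts enumerating, so the reader has to stay open for the duration of the foreach loop.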
Mehrdad Afshari
JaredPar
  • That's what I'm looking for, apologies for not wording my post well, wasn't sure how best to explain the issue. How could I combine that with yield return to return every word in a string line? – GurdeepS Feb 23 '10 at 21:30
  • Is there an easy way to do this for all except the last line? – Andy Nov 08 '10 at 14:26
0

Using File.ReadAllLines is easy, but not necessarily the most efficient, since it reads the entire file into memory.

A short version would probably be:

var wordsPerLine = from line in File.ReadAllLines(filename)
               select line.Split('\t');

foreach(var line in wordsPerLine)
{
    foreach(var word in line)
    {
        // process word...
    }
}

If you want a single enumerable of the words, you can use SelectMany to get that, too...
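As a rough sketch of that SelectMany variant in query syntax: two from clauses are translated by the compiler into a SelectMany call. The class name FlattenExample and the method AllWords are hypothetical, and an in-memory array stands in for the result of File.ReadAllLines(filename).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FlattenExample
{
    // Hypothetical helper: flattens the lines into one sequence of
    // tab-separated fields. The second from clause becomes SelectMany.
    public static IEnumerable<string> AllWords(IEnumerable<string> lines)
    {
        return from line in lines
               from word in line.Split('\t')
               select word;
    }

    public static void Main()
    {
        // Stand-in for File.ReadAllLines(filename).
        string[] lines = { "a\tb", "c\td\te" };

        foreach (string word in AllWords(lines))
        {
            Console.WriteLine(word);
        }
    }
}
```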

Reed Copsey
  • Definitely not the shortest version. Query operators are too verbose ;) `File.ReadAllLines("file.txt").Select(line => line.Split('\t'))` – Mehrdad Afshari Feb 23 '10 at 17:11
  • Yeah - but the OP liked using query operators, so I left it. – Reed Copsey Feb 23 '10 at 17:13
  • If I had a situation where I had to extract the data from a tab delimited file that is 50MB large, would this be the best approach? – sc_ray Mar 22 '11 at 17:46
  • @sc_ray: Depends - it'll load the entire thing into memory - if you want to prevent that, you might want to read line by line instead of using File.ReadAllLines. – Reed Copsey Mar 22 '11 at 17:54
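One way to read line by line while keeping the LINQ style is File.ReadLines (available since .NET Framework 4), which enumerates the file lazily instead of building an array of every line first. This is only a sketch under that assumption: the class name LargeFileExample and the method FieldsPerLine are hypothetical, and the demo writes a small temporary file so it can run on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class LargeFileExample
{
    // Hypothetical helper: splits each line into its tab-separated fields.
    // File.ReadLines is lazy, so lines are read as the caller enumerates.
    public static IEnumerable<string[]> FieldsPerLine(string path)
    {
        return File.ReadLines(path).Select(line => line.Split('\t'));
    }

    public static void Main()
    {
        // Small temporary file standing in for a large tab-delimited file.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "a\tb\nc\td\n");
        try
        {
            foreach (string[] fields in FieldsPerLine(path))
            {
                Console.WriteLine(string.Join(" | ", fields));
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Whether this is the best approach for a 50MB file depends on what is done with the fields; the sketch only avoids holding all lines in memory at once.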