0

What's the best way of reading only the specific lines (based on matching text) from a file? This is what I'm doing now:

import os

match_txt = "lhcb"
for inFile in os.listdir('.'):
    readFile = open(inFile, 'r')
    lines = readFile.readlines()  # loads the whole file into memory
    readFile.close()

    for line in lines:
        if line.find(match_txt) == 0:
            pass  # < do stuff here >

That is, I go through every file in the current directory one by one and process only the lines that contain "lhcb". Is this the best way to do it? Can it be done without loading the whole file into memory first?

Greg Hewgill
MacUsers
  • possible duplicate of [Reading specific lines only (Python)](http://stackoverflow.com/questions/2081836/reading-specific-lines-only-python) – S.Lott Feb 14 '11 at 03:39
  • The best way? Start with search. http://stackoverflow.com/questions/2081836/reading-specific-lines-only-python – S.Lott Feb 14 '11 at 03:39

2 Answers

4

To do it without loading the whole file into memory, just iterate over the file:

import os

match_txt = "lhcb"
for file_name in os.listdir('.'):
    with open(file_name) as f:
        for line in f:  # reads one line at a time
            if line.startswith(match_txt):
                pass  # < do stuff here >

If you want to check for match_txt anywhere in the line, you can use

if match_txt in line:

Your example code is equivalent to checking if the line starts with match_txt though.
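To make the difference between the three checks concrete, here is a small sketch. The sample lines and the classify() helper are made up for illustration; only str.find(), str.startswith() and the in operator come from the discussion above.

```python
# Compare the three checks on a few made-up sample lines.
match_txt = "lhcb"
samples = ["lhcb at the start", "text with lhcb inside", "no match here"]

def classify(line, text):
    """Return (find() == 0, startswith(), in) for one line."""
    return (line.find(text) == 0, line.startswith(text), text in line)

# One tuple of three booleans per sample line.
results = [classify(line, match_txt) for line in samples]
```

Only the first sample passes all three checks. The second one contains the text but does not start with it, so find() returns a positive index and only the in check succeeds.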

If you're using a very old version of Python that doesn't have the with statement, you'll have to close the file manually:

import os

match_txt = "lhcb"
for file_name in os.listdir('.'):
    f = open(file_name)
    for line in f:
        if line.startswith(match_txt):
            pass  # < do stuff here >
    f.close()  # close once the whole file has been read
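One thing to watch with the manual close() is that the file stays open if the loop body raises an exception. A possible sketch that closes the file in a finally clause and also skips subdirectories (which os.listdir() returns as well) could look like the following. The function name process_matching_lines and the handle callback are hypothetical, and I have only checked this against Python 3, assuming every file in the directory is a readable text file.

```python
import os

def process_matching_lines(directory, prefix, handle):
    """Call handle(file_name, line) for each line starting with prefix.

    Hypothetical helper: files are read line by line, never loaded whole.
    """
    for file_name in os.listdir(directory):
        path = os.path.join(directory, file_name)
        if not os.path.isfile(path):
            continue  # os.listdir() also lists subdirectories
        f = open(path)
        try:
            for line in f:
                if line.startswith(prefix):
                    handle(file_name, line)
        finally:
            f.close()  # runs even if handle() raises
```

You would call it with the directory, the text to match, and whatever function does the real work on each line.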
Greg Hewgill
Sven Marnach
  • Where should I close() the file? – MacUsers Feb 13 '11 at 23:58
  • I don't seem to have much luck with the "with" statement. It appears to have been introduced in v2.5, but I'm using v2.3 and unfortunately I have to stick with v2.3 on this particular box. Is there any other way at all? Cheers!! – MacUsers Feb 14 '11 at 00:18
  • @MacUsers: I've taken the liberty of editing Sven's answer to include an example that doesn't use `with`. – Greg Hewgill Feb 14 '11 at 00:37
3

You should look into the str.startswith() method.

if line.startswith("text"):

http://docs.python.org/library/stdtypes.html#str.startswith
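As a side note, str.startswith() also accepts a tuple of prefixes (from Python 2.5 on, so not on a 2.3 box). A minimal sketch, where the second prefix "atlas" and the sample lines are invented for illustration:

```python
# Match lines that start with any of several prefixes.
prefixes = ("lhcb", "atlas")  # "atlas" is a made-up second prefix
lines = ["lhcb run 1", "atlas run 2", "cms run 3"]

# startswith() returns True if the line starts with any tuple entry.
matching = [line for line in lines if line.startswith(prefixes)]
```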

Guillaume
  • I actually did not notice at first that the check is only for `match_txt` at the beginning of the line, so +1 for you! – Sven Marnach Feb 13 '11 at 23:29
  • @Sven, @Enders: does str.find() only work if the line starts with that text? Actually, wherever I tried find(), the match was at the start of the line, and it worked, although that's not what I was actually trying to do. Thanks for the info. – MacUsers Feb 13 '11 at 23:50
  • @MacUsers: `str.find()` returns the index at which the string was found, or -1 if it was not found at all. The `== 0` part in your condition will only catch the case where the line starts with the match. You could use `line.find(match_txt) >= 0`, but this would be equivalent to the more intuitive `match_txt in line`. – Sven Marnach Feb 13 '11 at 23:54