
Let's say I have a text file in this format:

***a
foo bar
lorem ipsum
dolor
---a

I want to print the lines between ***a and ---a. I'm trying to do it with this:

def printlines():
    pattern = open('text.txt').read().splitlines()
    for line in pattern:
        if line == "***a":
            pass
            while line != "---a":
                print line

        else:
            pass

But it prints ***a in an infinite loop. How can I solve this?

JayGatsby

3 Answers

Use a state machine. That means, once you see your opening pattern, set a state so you know that the following lines are now relevant to you. Then keep looking out for the ending pattern to turn it off:

def printlines():
    # this is our state
    isWithin = False

    with open('text.txt') as f:
        for line in f:
            # Since the line contains the line breaking character,
            # we have to remove that first
            line = line.rstrip()

            # check for the patterns to change the state
            if line == "***a":
                isWithin = True
            elif line == "---a":
                isWithin = False

            # check whether we’re within our state
            elif isWithin:
                print(line)

Since we only print while we’re in the isWithin state, everything outside of a ***a/---a pair is skipped. So processing the following file would correctly print Hello and World and nothing else:

Foo
***a
Hello
---a
Bar
***a
World
---a
Baz

Also, you should use the with statement to open your file, and iterate over the file object directly instead of reading it and calling splitlines(). That way you make sure that the file is properly closed, and you only ever read one line after another, making this more memory efficient.
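
To try the state machine without creating a file, here is a minimal sketch that wraps the same logic in a generator and feeds it the sample above from memory. The names lines_between, start, end and sample are made up for this illustration and do not appear in the code above.

```python
import io


def lines_between(lines, start="***a", end="---a"):
    """Yield the lines found between start and end markers (markers excluded)."""
    is_within = False  # the state: are we currently inside a block?
    for line in lines:
        line = line.rstrip()  # drop the trailing newline before comparing
        if line == start:
            is_within = True
        elif line == end:
            is_within = False
        elif is_within:
            yield line


# io.StringIO stands in for the open file; iterating it yields one line at a time
sample = io.StringIO("Foo\n***a\nHello\n---a\nBar\n***a\nWorld\n---a\nBaz\n")
result = list(lines_between(sample))
print(result)  # ['Hello', 'World']
```

Because the function accepts any iterable of lines, passing an open file object (for example inside a with open('text.txt') as f: block) should work the same way.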

poke

Use break and continue:

def printlines():
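    # the with statement closes the file once we are done
    with open('text.txt') as f:
        for line in f.read().splitlines():
            if line == "***a":
                # skip the opening marker itself
                continue
            if line == "---a":
                # stop at the first closing marker
                break
            print(line)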

Break

The break statement, like in C, breaks out of the smallest enclosing for or while loop.

Continue

The continue statement, also borrowed from C, continues with the next iteration of the loop.
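
As a rough illustration of how the two statements interact here, this sketch runs the same loop over in-memory lists instead of a file. The function name first_block and the sample lists are invented for the example. It also shows two properties of this approach that are worth knowing: the loop ends at the first ---a, and any line that comes before the first ***a is collected as well.

```python
def first_block(lines):
    """Collect lines until the first closing marker, skipping opening markers."""
    collected = []
    for line in lines:
        if line == "***a":
            continue  # jump straight to the next line
        if line == "---a":
            break     # leave the for loop entirely
        collected.append(line)
    return collected


two_blocks = ["***a", "foo bar", "lorem ipsum", "dolor", "---a",
              "***a", "bar bar", "---a"]
first = first_block(two_blocks)
print(first)  # ['foo bar', 'lorem ipsum', 'dolor']

# A line before the opening marker is not filtered out by this loop
with_prefix = first_block(["Foo", "***a", "Hello", "---a", "Bar"])
print(with_prefix)  # ['Foo', 'Hello']
```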

Juan Diego Godoy Robles

If you have multiple occurrences you can start an inner loop when you hit the start line, which is equivalent to what your while is trying to do:

with open("test.txt") as f:
    for line in f:
        if line.rstrip() == "***a":
            print("")
            for line in f:
                if line.rstrip() == "---a":
                    break
                print(line.rstrip())

For this input:

***a
foo bar
lorem ipsum
dolor
---a
***a
bar bar
foobar
foob
---a

The output would be:

foo bar
lorem ipsum
dolor

bar bar
foobar
foob
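
The nested loop works because both for statements pull from the same iterator, so the inner loop carries on where the outer one stopped, and the outer loop resumes after the inner break. A small sketch of that behaviour, using an in-memory iterator instead of a file (the names lines and blocks are only for illustration):

```python
# iter() gives a one-pass iterator, which is how an open file behaves too
lines = iter(["***a", "foo bar", "dolor", "---a", "***a", "foob", "---a"])

blocks = []
for line in lines:
    if line == "***a":
        block = []
        for line in lines:  # continues from the current position
            if line == "---a":
                break       # hand control back to the outer loop
            block.append(line)
        blocks.append(block)

print(blocks)  # [['foo bar', 'dolor'], ['foob']]
```

With a plain list instead of an iterator, the inner loop would start again from the first element, so the shared iterator is what makes this pattern work.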

If you want the lines without their trailing newlines, you can strip them lazily with map and still iterate line by line:

with open("test.txt") as f:
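    # Python 3's map is lazy; on Python 2 use itertools.imap instead,
    # since the nested loop relies on f being a single shared iterator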
    f = map(str.rstrip, f)
    for line in f:
        if line == "***a":
            print("")
            for line in f:
                if line == "---a":
                    break
                print(line)

Padraic Cunningham