1

How to print n lines after a matched string from a file using Python?

Linux Command grep

 abc@xyz:~/Desktop$ grep -A 10 'foo' bar.txt
      foo
      <shippingcost>
        <amount>3.19</amount>
        <currency>EUR</currency>
      </shippingcost>
      <shippingtype>Normal</shippingtype>

      <quality>GOOD</quality> 
      <unlimitedquantity>false</unlimitedquantity>
      <isrsl>N</isrsl> 
      <stock>1</stock>

This command prints the line matching 'foo' and the 10 lines that follow it in the file bar.txt.

How can I do the same thing in Python?

What I tried:

import re
with open("bar.txt") as origin_file:
    for line in origin_file:
        line = re.findall(r'foo', line)
        if line:
            print line

The above Python code gives the following output:

abc@xyz:~/Desktop$ python grep.py
['foo']
Dipankar Nalui
  • At first you redefined line, resulting in the list you got printed. Also I don't see any counting of lines, so you have nothing in your code that would even go in the direction of solving the problem. – Klaus D. Apr 30 '18 at 08:13
  • Note that there are **much** better ways to parse XML files in Python. – Charles Duffy Jun 03 '19 at 14:49

2 Answers

3

File objects such as origin_file are iterators. Not only can you loop through their contents using

for line in origin_file:

but also you can obtain the next item from the iterator using next(origin_file). In fact, you can call next on the iterator from within the for-loop:

import re

# Python 2
with open("bar.txt") as origin_file:
    for line in origin_file:
        if re.search(r'foo', line):
            print line,
            for i in range(10):
                print next(origin_file),

# In Python 3, `print` is a function, not a statement,
# so the code would have to be changed to something like
# with open("bar.txt") as origin_file:
#     for line in origin_file:
#         if re.search(r'foo', line):
#             print(line, end='')
#             for i in range(10):
#                 print(next(origin_file), end='')

The code above will raise a StopIteration exception if there are fewer than 10 lines after the last foo is found. To handle this possibility, you could use itertools.islice to slice off at most 10 items from the iterator:

import re
import itertools as IT

with open("bar.txt") as origin_file:
    for line in origin_file:
        if re.search(r'foo', line):
            print line, 
            for line in IT.islice(origin_file, 10):
                print line,

Now the code will end gracefully (without raising a StopIteration exception) even if there are not 10 lines after foo.
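For Python 3, the same islice idea could be wrapped in a small generator. This is only a sketch: the name lines_after_match and its parameters are made up for illustration, and it takes any iterable of lines so that it can be tried without a file.

```python
import itertools
import re


def lines_after_match(lines, pattern, n=10):
    """Yield each line matching `pattern`, then up to `n` lines after it.

    Hypothetical helper name; not part of any library.
    """
    it = iter(lines)  # one shared iterator, so islice consumes from the same stream
    for line in it:
        if re.search(pattern, line):
            yield line
            # islice simply stops early if fewer than n lines remain
            yield from itertools.islice(it, n)


# Small in-memory sample standing in for the file contents
sample = ["a\n", "foo\n", "b\n", "c\n", "d\n"]
print("".join(lines_after_match(sample, r"foo", n=2)), end="")
# prints foo, b and c, one per line
```

With a real file you would pass the open file object instead of the list, for example lines_after_match(origin_file, r'foo'). Note that, as in the code above, a second foo that falls inside the 10 trailing lines is consumed as context and does not start a new window, which differs from grep -A.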

unutbu
-1

That is because you assign the result of re.findall back to line, so the list of matches is printed instead of the line itself. Use a separate variable for the matches:

import re
with open("bar.txt") as origin_file:
    for line in origin_file.readlines():
        found = re.findall(r'foo', line)
        if found:
            print line
ddor254
  • This doesn't have the `-A` effect of printing all lines for a given period *after* a match, whether those additional lines match the target or not. – Charles Duffy Jun 03 '19 at 15:18
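
One way to get closer to the -A effect described in this comment is to keep a countdown of context lines still to be printed and reset it on every match. The sketch below assumes Python 3; grep_after and its parameters are hypothetical names chosen for illustration, and it collects the lines into a list rather than printing them.

```python
import re


def grep_after(lines, pattern, after=10):
    """Return matching lines plus up to `after` lines of trailing context.

    Hypothetical helper; a match inside the context window restarts the count.
    """
    out = []
    remaining = 0  # context lines still owed after the most recent match
    for line in lines:
        if re.search(pattern, line):
            out.append(line)
            remaining = after  # (re)start the context window
        elif remaining > 0:
            out.append(line)
            remaining -= 1
    return out


# Second "foo" falls inside the first window, so the window is extended
sample = ["foo", "a", "foo", "b", "c", "d"]
print(grep_after(sample, r"foo", after=2))
# prints ['foo', 'a', 'foo', 'b', 'c']
```

This does not reproduce the "--" separator that GNU grep prints between non-adjacent groups of context lines.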