0

I am trying to copy lines four lines before a line that contains a specific keyword.

if line.find("keyword") == 0:
    f.write(line -3)

I don't need the line where I found the keyword, but the one 4 lines before it. Since the write method doesn't work with line numbers, I got stuck.

Bhargav Rao
user2803490
  • You are aware that writing in place (assuming that `f` is being iterated over line-by-line) will start overwriting already existing data after the found line, right? If so, just keep a `list` with lines while iterating over them and once a `keyword` is found just write the collected list of lines to the file. – zwer Apr 24 '18 at 22:31
  • How big is your file? – merlin2011 Apr 24 '18 at 22:33
  • @merlin2011 3847 KB; @zwer I am using 2 files, one for reading and another for writing – user2803490 Apr 24 '18 at 22:36
  • If you don't want to load the whole file for any reason, you can keep a 4-element list and use the `append()` and `pop(0)` methods like a buffer – anishtain4 Apr 24 '18 at 22:38
  • When you say "copy a line" and use `f.write()`, is `f` pointing to the input file or a different file? (You need to show the definition of `f`.) Are you trying to output a new (summary) file, overwrite the original file, or just get an in-memory list of the lines 3 lines before matches? – smci Apr 24 '18 at 22:45
  • This is called a ***sliding-window*** or ***rolling buffer*** – smci Apr 24 '18 at 22:50
  • Related solution, using deque [How can I print second and last three lines...](https://stackoverflow.com/a/11065216/202229). Sadly that question asks for solutions in both AWK and Python, so it's pretty confused. – smci Apr 24 '18 at 22:52
  • Just to point out that if you [**process the file backwards**](https://stackoverflow.com/questions/2301789/read-a-file-in-reverse-order-using-python), you can do all this one shot (assuming you're not trying to overwrite the file). – smci Apr 24 '18 at 23:04
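
The rolling-buffer idea from the comments above can be sketched with `collections.deque`, which discards its oldest entry automatically once `maxlen` is reached. This is only a sketch under assumptions: the function name `lines_before_matches` and its `offset` parameter are made up for illustration, it matches the keyword anywhere in the line, and unlike the answers below it does not reset the buffer after a match.

```python
from collections import deque


def lines_before_matches(lines, keyword, offset=3):
    """Yield the line `offset` lines before each line containing `keyword`.

    Hypothetical helper, not from the question: matches with fewer than
    `offset` preceding lines are skipped silently.
    """
    window = deque(maxlen=offset)  # holds at most the last `offset` lines
    for line in lines:
        if keyword in line and len(window) == offset:
            yield window[0]  # oldest buffered line = `offset` lines back
        window.append(line)  # a full deque discards its oldest entry


sample = ["a\n", "b\n", "c\n", "keyword here\n", "d\n", "keyword again\n"]
result = list(lines_before_matches(sample, "keyword"))
```

Since a file object is itself an iterable of lines, something like `f_out.writelines(lines_before_matches(f_in, "keyword"))` should work with the two-file setup described in the comments.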

2 Answers

0

You can just use a list: append each line to it and truncate it to the last 3. When you reach a target line, write out the oldest buffered line, reset the list, and carry on.

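last_3 = []  # rolling buffer of the 3 most recent non-matching lines
with open("the_dst_file", "w") as fw:  # the destination must be opened for writing
    with open("the_source_file") as fr:
        for line in fr:
            # True only when the line starts with "keyword";
            # use `"keyword" in line` to match it anywhere in the line
            if line.find("keyword") == 0:
                fw.write(last_3[0])  # the line already ends with a newline
                last_3 = []
                continue
            last_3.append(line)
            last_3 = last_3[-3:]  # keep only the last 3 lines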

If the file format guarantees that "keyword" always has at least 3 lines preceding it, and at least 3 lines between instances, then the above is good. If not, you would need to guard the write by checking that `len(last_3) == 3` before pulling off the first element.
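
For files without that guarantee, the length check could look roughly like the sketch below. The function name `copy_third_previous` and the temporary file names are placeholders of mine, and I'm assuming the keyword may appear anywhere in the line rather than only at the start.

```python
import os
import tempfile


def copy_third_previous(src_path, dst_path, keyword):
    """Write the line 3 lines before each keyword line, skipping short runs."""
    last_3 = []
    with open(src_path) as fr, open(dst_path, "w") as fw:
        for line in fr:
            if keyword in line:  # assumption: match anywhere in the line
                if len(last_3) == 3:  # guard: only write with 3 buffered lines
                    fw.write(last_3[0])
                last_3 = []
                continue
            last_3 = (last_3 + [line])[-3:]  # keep only the last 3 lines


# Small self-check using throwaway temporary files
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "src.txt")
dst = os.path.join(tmp_dir, "dst.txt")
with open(src, "w") as f:
    f.write("x\nkeyword early\na\nb\nc\nkeyword late\n")
copy_third_previous(src, dst, "keyword")
with open(dst) as f:
    copied = f.read()
```

Here the first match has only one line before it, so nothing is written for it; the second match has three, so only `a` ends up in the output.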

sberry
  • `last_4` is a rolling buffer (list) of four lines. Should also mention that the syntax `[-4:]` is safe even if you get a hit within the first three lines: instead of raising, it silently returns a shorter (possibly empty) list. – smci Apr 24 '18 at 22:48
  • Thanks, but I need to run this through the whole document. I can't have just one find. Also, I don't need the lines in between the keyword and line -3 – user2803490 Apr 24 '18 at 22:51
  • Related solution, using deque [How can I print second and last three lines...](https://stackoverflow.com/a/11065216/202229). Sadly that question asks for solutions in both AWK and Python, so it's pretty confused. Should we close as duplicate? – smci Apr 24 '18 at 22:53
  • user2803490: then replace the `break` statement with `f.write`, `print`, `append` or whatever action you want to take. And you only want the third-previous-line `last_4[0]` – smci Apr 24 '18 at 22:54
  • @user2803490 edited. I didn't understand the requirements (also was doing 4 not 3 for some unknown reason) :) – sberry Apr 24 '18 at 23:02
0

If you're already using two files, it's as simple as keeping a buffer and writing out the last 3 entries in it when you encounter a match:

buf = []  # your buffer
with open("in_file", "r") as f_in, open("out_file", "w") as f_out:  # open the in/out files
    for line in f_in:  # iterate the input file line by line
        if "keyword" in line:  # the current line contains a keyword
            f_out.writelines(buf[-3:])  # write the last 3 lines (or less if not available)
            f_out.write(line)  # write the current line, omit if not needed
            buf = []  # reset the buffer
        else:
            buf.append(line)  # add the current line to the buffer
zwer
  • Related solution, using deque [How can I print second and last three lines...](https://stackoverflow.com/a/11065216/202229). Sadly that question asks for solutions in both AWK and Python, so it's pretty confused. Should we close as duplicate? – smci Apr 24 '18 at 22:53
  • This worked, thanks. But I only need one line, the lines in between are not useful – user2803490 Apr 24 '18 at 22:58
  • @user2803490 - You said you want to copy 4 lines, if you just need the fourth line in the _past_ from the current line use `f_out.writelines(buf[-3:-2])` (to ensure no errors if the buffer doesn't reach that far) and omit writing the current line. – zwer Apr 24 '18 at 23:01
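
To illustrate the point in the last comment about slicing being safe, here is a tiny sketch (the sample lists are made up): a slice that reaches past the start of a short buffer just comes back empty, while plain indexing raises `IndexError`.

```python
buf = ["only\n", "two\n"]  # buffer shorter than 3 lines

short_slice = buf[-3:-2]  # slicing out of range yields an empty list
try:
    buf[-3]  # plain indexing out of range raises
    index_failed = False
except IndexError:
    index_failed = True

full = ["a\n", "b\n", "c\n", "d\n"]
third_back = full[-3:-2]  # exactly one element: the third line from the end
```

Because `writelines()` accepts an empty list and writes nothing, `f_out.writelines(buf[-3:-2])` needs no explicit length check.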