Let's say I have a text file containing:

Section 1
What: random1 
When: random2
Why:  random3 
Where: random4
How: random5
Section 2
What: dog1
When: dog2
Why: dog3
Where: dog4
How: dog5
Section 3
What: me1
When: me2
Why: me3
Where: me4
How: me5

I want to write a function that takes the text file, looks for two words, and copies everything between them. It should keep collecting matches through the whole file and write them to a new text file.

For example, with def my_function(document, start, end): I would type my_function("testing.txt", "when", "why") in the interactive window, and it should create a new text file containing the data:

when: random2
when: dog2
when: me2

So the function takes all the data between those two words. Because the two words occur more than once, it has to keep looking through the rest of the file.

A user in a different thread posted a solution that may help me, but I am not sure how to put it in a function, and I don't understand the code used.

This is from a different thread, solution by: falsetru

import itertools

with open('data.txt', 'r') as f, open('result.txt', 'w') as fout:
   while True:
      it = itertools.dropwhile(lambda line: line.strip() != 'Start', f)
      if next(it, None) is None: break
      fout.writelines(itertools.takewhile(lambda line: line.strip() != 'End', it))
  • Is the function returning the "headings" (Who, What, When, Where, Why) or calculating them? I'm asking because in your example, the case is different in the output than the input. – mojo Jan 12 '14 at 03:46
  • The function returns the start word ("when" in the example) and everything after it until it reaches the end word ("why"). – user3184242 Jan 12 '14 at 04:03

2 Answers

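def fn(fname, start, end):
    # Lowercase the markers once so the comparisons are case-insensitive.
    start, end = start.lower(), end.lower()
    do_print = False
    # The with block closes the file once the loop finishes.
    with open(fname) as f:
        for line in f:
            lowered = line.lower()
            if lowered.startswith(start):
                do_print = True
            elif lowered.startswith(end):
                do_print = False
            if do_print:
                print(line.strip())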

This produces the output:

>>> fn('testing.txt', 'when', 'why')
When: random2
When: dog2
When: me2

It works by just going through the file line by line and setting a flag True whenever the line starts with start and False whenever the line starts with end. When the flag is True, the line is printed.

Since the examples in the post had mixed case, I used the method lower to make the tests case-insensitive.
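
If you would rather reuse the itertools snippet quoted in the question, it can probably be wrapped in a function along the same lines. This is only a sketch: the name copy_between and the dest_path parameter are my own assumptions, and I swapped the exact 'Start'/'End' comparison for a case-insensitive startswith test so that it fits the sample file.

```python
import itertools


def copy_between(source_path, dest_path, start, end):
    # Hypothetical helper name and dest_path parameter (not from the question).
    start, end = start.lower(), end.lower()
    with open(source_path) as src, open(dest_path, 'w') as dest:
        while True:
            # Skip lines until one starts with `start`.
            it = itertools.dropwhile(
                lambda line: not line.lower().startswith(start), src)
            first = next(it, None)
            if first is None:
                # The file ran out before another start line was found.
                break
            # Keep the start line itself, as in the expected output.
            dest.write(first)
            # Copy the following lines until one starts with `end`.
            dest.writelines(itertools.takewhile(
                lambda line: not line.lower().startswith(end), it))
```

With the sample file from the question, copy_between('testing.txt', 'result.txt', 'when', 'why') should leave the three When: lines in result.txt. Note that takewhile consumes the line that starts with end, so that line is never written.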

John1024

This will do what you describe. I've added a dest_path input to specify the output file.

def my_function(source_path, dest_path, start_text, stop_text):
    # pre-format start and end to save time in loop (for case insensitive match)
    lower_start = start_text.lower()
    lower_stop = stop_text.lower()
    # safely open input and output files
    with open(source_path, 'r') as source, open(dest_path, 'w') as dest:
        # this variable controls if we're writing to the destination file or not
        writing = False
        # go through each line of the source file
        for line in source:
            # lowercase the line once for both tests
            lower_line = line.lower()
            # test if it's a starting or ending line
            if lower_line.startswith(lower_start):
                writing = True
            elif lower_line.startswith(lower_stop):
                writing = False
            # write line to destination file if needed
            if writing:
                dest.write(line)

Note that the files are closed automatically when the with block finishes.
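
A quick way to convince yourself of that is to check the file object's closed attribute inside and after the block. This is just an illustrative sketch; the temporary file name demo.txt is made up.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, "w") as dest:
    dest.write("When: random2\n")
    closed_inside = dest.closed   # still open while inside the block

closed_after = dest.closed        # closed once the block has finished

print(closed_inside, closed_after)
```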

Pi Marillion