I would like to do the following in Python: extract the 10 lines before and after the word "apple" from a directory (with subdirectories) full of HTML files, and write those lines to a CSV file. Ideally, the CSV file will contain two columns: 1) the HTML filename and 2) the 10 lines before and after the word "apple".

UPDATE: I was able to extract the lines with the following code:

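import collections
import glob
import itertools
import sys

# Walk the current directory and all subdirectories for HTML files.
for filepath in glob.glob('**/*.html', recursive=True):
    with open(filepath) as f:
        # Rolling buffer that holds the 10 most recent lines.
        before = collections.deque(maxlen=10)
        for line in f:
            if 'apple' in line:
                sys.stdout.writelines(before)  # up to 10 lines before the match
                sys.stdout.write(line)         # the matching line itself
                sys.stdout.writelines(itertools.islice(f, 10))  # up to 10 lines after
                break  # only the first match in each file is reported
            # deque.append() returns None, so there is nothing to assign or print.
            before.append(line)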

I will look into the CSV step, but any help would be appreciated.
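For the CSV step, one possible approach is sketched below using the standard `csv` module. It is a rough sketch under a few assumptions, not a definitive implementation: the function names `extract_context` and `write_context_csv` and the column names `filename` and `context` are made up for illustration, the files are assumed to be readable as UTF-8 (undecodable bytes are replaced), and only the first match in each file is kept, as in the loop above.

```python
import collections
import csv
import glob
import itertools
import os


def extract_context(lines, keyword, n=10):
    """Return the first line containing keyword plus up to n lines on
    each side, joined into one string, or None if there is no match."""
    lines = iter(lines)
    before = collections.deque(maxlen=n)  # rolling buffer of recent lines
    for line in lines:
        if keyword in line:
            after = list(itertools.islice(lines, n))
            return ''.join([*before, line, *after])
        before.append(line)
    return None


def write_context_csv(root, out_path, keyword='apple', n=10):
    """Write one CSV row (filename, context) per HTML file under root
    that contains keyword. Hypothetical helper for illustration."""
    pattern = os.path.join(root, '**', '*.html')
    # newline='' lets the csv module handle line endings itself, which
    # matters here because the context field contains embedded newlines.
    with open(out_path, 'w', newline='', encoding='utf-8') as out:
        writer = csv.writer(out)
        writer.writerow(['filename', 'context'])
        for filepath in glob.glob(pattern, recursive=True):
            with open(filepath, encoding='utf-8', errors='replace') as f:
                context = extract_context(f, keyword, n)
            if context is not None:
                writer.writerow([filepath, context])
```

Calling `write_context_csv('.', 'apple_context.csv')` would scan the current directory tree; the output filename is only an example. The whole context block goes into a single quoted CSV field, so a spreadsheet or `csv.reader` should read it back as one cell with line breaks inside.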

hy9fesh
  • What is the question? What part of your solution are you trying to fix? This isn't a discussion forum, please take the time to read [ask] and the other links found on that page. – wwii Aug 05 '19 at 02:56
  • `open('names.csv', 'wb') as f` - you opened a file for writing then you tried to read from it - `for row in f:`. That's why it throws an error. – wwii Aug 05 '19 at 02:59
  • I have edited the question according to the How to Ask page and I hope I am much clearer now. Thank you. – hy9fesh Aug 05 '19 at 05:42
  • Possible duplicate: [How can I iterate over files in a given directory?](https://stackoverflow.com/questions/10377998/how-can-i-iterate-over-files-in-a-given-directory) – wwii Aug 05 '19 at 14:14
  • https://docs.python.org/3/library/csv.html – wwii Aug 05 '19 at 15:06
  • I have edited the post. I will post the CSV portion once I figure it out. – hy9fesh Aug 05 '19 at 18:07
