
The language I am using is Python.

The problem is: Write a function that takes one argument (a filename). The file contains various text lines, and occasionally a phone number in them (i.e. not all lines contain a phone number). Read the given file line by line and search each line for a phone number (using a regular expression). If a phone number exists in the line, write the line to phone-number-containing-lines.txt; otherwise write the line to plain-lines.txt. As a result, some lines will be in one file and the others will be in the second file.

this is the code I've come up with:

import re

f1 = open('phonenumber.txt', 'r')
regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')

for line in f1:
    phone_numbers = regex.findall(line)
    for num in phone_numbers:
        f = open('phone-number-containing-lines.txt', 'w')
        f.writelines(num)
        f.close()

f2 = open('phonenumber.txt','r')    
searchquery = re.compile(r'^[^\d]*$')

for line in f2:
    plain_text = regex.findall(line)
    for txt in plain_text:
        d = open('plain-lines.txt', 'w')
        d.writelines(txt)
        d.close()

I don't get any error, but phone-number-containing-lines.txt ends up with only one of the phone numbers and none of the text from that line, and plain-lines.txt is completely empty.

MrFlick
Rachel

1 Answer


Your issue:

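Every time you open a file in 'w' mode, it is truncated, so each write replaces whatever was there before. That is why you end up with only the last phone number that matched. You are also writing num (the matched number) rather than line, which is why the rest of the line's text is missing.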
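A minimal sketch of the truncation behavior, assuming a throwaway file in a temporary directory (the helper names and the file name demo.txt are made up for illustration, they are not from the question):

```python
import os
import tempfile

def write_each_time(path, items):
    # Re-opening in 'w' mode inside the loop truncates the file on every pass,
    # so only the final item survives.
    for item in items:
        with open(path, 'w') as f:
            f.write(item + '\n')

def write_once(path, items):
    # Opening the file once keeps everything written during the loop.
    with open(path, 'w') as f:
        for item in items:
            f.write(item + '\n')

def read_back(path):
    # Return the file's lines without trailing newlines.
    with open(path) as f:
        return f.read().splitlines()

if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'demo.txt')  # hypothetical file name
        numbers = ['111-222-3333', '444-555-6666']
        write_each_time(path, numbers)
        print(read_back(path))  # only the last number is left
        write_once(path, numbers)
        print(read_back(path))  # both numbers are present
```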

Solution:

You could open it in 'a' (append) mode instead, but reopening the file for every match is inefficient. You should open the file once. Lastly, consider using re.search() instead of re.findall(), because all you care about is whether the line contains a phone number. Looping over the findall() results and writing inside that loop would write to the output multiple times if the line contains multiple phone numbers:

import re

PATTERN = re.compile(r'[0-9]{3}-[0-9]{3}-[0-9]{4}')

with open('phonenumber.txt') as f1, open('phone-number-containing-lines.txt', 'w') as f2:
    for line in f1:
        if PATTERN.search(line):
            f2.write(line)
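
The snippet above only produces the phone-number file. The exercise also asks for a function and for plain-lines.txt, so here is one possible sketch that covers both. The function name split_phone_lines and the two optional output-path parameters are my own choices for illustration; the pattern is the same one used above and assumes numbers are formatted like 123-456-7890.

```python
import re

# Same pattern as above: matches numbers shaped like 123-456-7890.
PATTERN = re.compile(r'[0-9]{3}-[0-9]{3}-[0-9]{4}')

def split_phone_lines(filename,
                      phone_path='phone-number-containing-lines.txt',
                      plain_path='plain-lines.txt'):
    """Copy each line of `filename` into one of two output files."""
    # Open all three files once; each output is truncated a single time.
    with open(filename) as src, \
         open(phone_path, 'w') as phone_out, \
         open(plain_path, 'w') as plain_out:
        for line in src:
            # search() only needs one match to classify the line.
            if PATTERN.search(line):
                phone_out.write(line)
            else:
                plain_out.write(line)
```

Calling split_phone_lines('phonenumber.txt') would then fill both files in a single pass. The else branch also removes the need for a second pattern such as ^[^\d]*$, which would skip any line that contains digits but no phone number, leaving that line in neither file.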

Related: Difference between modes a, a+, w, w+, and r+ in built-in open function?