-2
def line_number(word, fname):
    with open(fname) as f:
        number_list = ""
        for i, line in enumerate(f, 1):
            if word in line:
                number_list += (str(i)+", ")      
        return number_list[:-2]

The function above is supposed to find the line numbers in a text file on which a matching string occurs. However, if we search for the string "yes", and "yes" appears on line 20 and "eyes" appears on line 51, the function returns both line 20 and line 51, because line 51 contains the substring "yes" in "eyes". How can I fix this bug?

Okay, I have solved the problem by changing if word in line: to if word in re.split('(\W+)', line):

By doing so I split the line into words and punctuation to find the exact match.
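
As a rough sketch of why this works: splitting on runs of non-word characters turns the line into a list of tokens, so the membership test compares whole tokens rather than substrings. The helper name contains_word is hypothetical and only serves the illustration.

```python
import re

def contains_word(word, line):
    # Split on runs of non-word characters. The capturing group keeps the
    # separators in the result list, which is harmless for this test.
    tokens = re.split(r'(\W+)', line)
    # List membership compares whole tokens, so "yes" does not match "eyes".
    return word in tokens
```

With this helper, a line such as "my eyes hurt" no longer counts as containing "yes", while "yes, I do" still does.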

But I noticed another problem. For example, line 159 contains the sentence "you you you." The word "you" appears 3 times, but the function counts it only once for line 159, and the program prints out:

you 159

but I want the function to count it 3 times and print out:

you 159, 159, 159

Is there any way to do it?

Simon MᶜKenzie
user3697665

2 Answers

1

To include duplicate matches on a line, you can use re.findall:

re.findall(pattern, string, flags=0)

Return all non-overlapping matches of pattern in string, as a list of strings

Just replace this:

if word in line:

with this:

for match in re.findall(r'\b' + word + r'\b', line):
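
Put together, the whole function might look like the sketch below. It rests on two assumptions that are not in the answer itself: word is wrapped in re.escape so that regex metacharacters in the search term are treated literally, and word is assumed to start and end with a word character so that \b behaves as intended. Collecting the numbers in a list and joining them replaces the original string concatenation.

```python
import re

def line_number(word, fname):
    # \b anchors the match at word boundaries, so "yes" does not match "eyes".
    # re.escape is an addition (not in the answer above) that treats any
    # regex metacharacters in word literally.
    pattern = re.compile(r'\b' + re.escape(word) + r'\b')
    numbers = []
    with open(fname) as f:
        for i, line in enumerate(f, 1):
            # findall returns one item per non-overlapping match, so a word
            # repeated on a line contributes its line number once per match.
            for _ in pattern.findall(line):
                numbers.append(str(i))
    return ", ".join(numbers)
```

For a file whose line 159 reads "you you you.", this version would report 159 three times for the word "you".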
Simon MᶜKenzie
    Thanks a lot!!!! This one totally works: it solves the problem of finding the extra substring, and it counts every match on the line!! – user3697665 Jun 02 '14 at 01:32
-1

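Replace if word in line with if word == line.strip(). Note that this matches only when the whole line, apart from surrounding whitespace, is the word itself.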

RodrigoOlmo