Output the line number for all string matches within a file

Question

def line_number(word, fname):
    with open(fname) as f:
        number_list = ""
        for i, line in enumerate(f, 1):
            if word in line:
                number_list += (str(i)+", ")      
        return number_list[:-2]

The function above is supposed to find the line numbers in a text file on which a matching string occurs. However, if we search for the string "yes", and we have "yes" on the 20th line and "eyes" on the 51st line, the function returns lines 20 and 51, because line 51 contains the substring "yes" in "eyes". How can I fix this bug?

Okay, I have solved the problem by changing if word in line: to if word in re.split('(\W+)', line):

By doing so I split the line into words and punctuation to find the exact match.
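A minimal sketch of what that membership test does, for illustration only. The helper name contains_word is hypothetical and not part of the original function. Note that the test answers yes or no once per line, no matter how many times the word occurs.

```python
import re

def contains_word(word, line):
    # The capturing group in the pattern keeps the separators in the result,
    # so the list alternates between words and runs of non-word characters.
    tokens = re.split(r'(\W+)', line)
    # Membership compares against whole tokens, so "yes" no longer matches "eyes".
    return word in tokens

print(contains_word("yes", "my eyes\n"))       # False: "eyes" is a single token
print(contains_word("you", "you you you.\n"))  # True, but reported only once
```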

But I noticed another problem. For example, line 159 contains the sentence "you you you." The word "you" appears 3 times, but the function counts it only once on line 159, and the program prints out:

you 159

but I want the function to count it 3 times and print out:

you 159, 159, 159

Is there any way to do it?

thanks, i have solved the problem, but i noticed there's another problem, can you take a look? — user3697665, Jun 01 '14 at 23:35

score 1 · Answer 1 · edited Jun 20 '20 at 09:12


To include duplicate matches on the same line, you can use re.findall:

re.findall(pattern, string, flags=0)

Return all non-overlapping matches of pattern in string, as a list of strings

Just replace this:

if word in line:

with this:

for match in re.findall(r'\b' + re.escape(word) + r'\b', line):
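Putting that change into the asker's function might look like the sketch below. It makes a few assumptions beyond the original: the helper matching_line_numbers is a hypothetical name introduced so the logic can be tried on a plain list of lines, re.escape is included so that regex metacharacters in word are treated literally, and ", ".join replaces the string concatenation and the [:-2] slice.

```python
import re

def matching_line_numbers(word, lines):
    """Return one line number per whole-word occurrence of `word`."""
    # \b anchors the match at word boundaries, so "yes" does not match "eyes".
    # re.escape keeps characters such as "." or "+" in `word` literal.
    pattern = re.compile(r'\b' + re.escape(word) + r'\b')
    numbers = []
    for i, line in enumerate(lines, 1):
        # findall returns one entry per non-overlapping match on the line.
        numbers.extend([i] * len(pattern.findall(line)))
    return numbers

def line_number(word, fname):
    """Same output format as the original: "159, 159, 159"."""
    with open(fname) as f:
        return ", ".join(str(n) for n in matching_line_numbers(word, f))
```

One caveat: \b only matches next to a word character, so a search term that begins or ends with punctuation may not match the way you expect.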

edited Jun 20 '20 at 09:12

Community


answered Jun 02 '14 at 00:55

Simon MᶜKenzie


thanks a lot!!!! this one totally works, it solves the problem of finding the extra substring, and it counts every match on the line!! – user3697665 Jun 02 '14 at 01:32

RodrigoOlmo · Answer 2 · 2014-06-01T22:11:55.567

-1

Substitute if word in line for if word == line.strip().
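A small sketch of how that comparison behaves. The helper name matches_whole_line is hypothetical. The check is true only when the whole line, with surrounding whitespace removed, equals the word, so a word inside a longer sentence is never reported.

```python
def matches_whole_line(word, line):
    # strip() removes the trailing newline and any surrounding whitespace,
    # then the whole remaining line is compared with the word.
    return word == line.strip()

print(matches_whole_line("yes", "yes\n"))        # True: the line is just the word
print(matches_whole_line("yes", "yes, I do\n"))  # False: the line has other text
```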

edited Jun 01 '14 at 22:11

answered Jun 01 '14 at 21:55

RodrigoOlmo


It does not work; a word can't exactly match a whole line, am I correct? – user3697665 Jun 01 '14 at 22:05
You are right, each line has a line break (\n) at the end. I have edited my answer. – RodrigoOlmo Jun 01 '14 at 22:11
It doesn't work either; the print fails to display the line number. Inspired by your strip() method, I tried to use the split() method instead, to break the line into words and find the matches, but it raises an error saying "'str' object has no attribute 'slipt'". I'm sure "line" is a string and can be split, could you tell me what happened here? – user3697665 Jun 01 '14 at 22:34
Well, first of all you are misspelling split :P – RodrigoOlmo Jun 01 '14 at 22:35
Why are you doing this? `return number_list[:-2]` – RodrigoOlmo Jun 01 '14 at 22:37
I added a comma and a space between each line number, just for looking clean, but the last line number will have a comma and space at the end too, so I do return number_list[:-2] to get rid of it =P – user3697665 Jun 01 '14 at 22:41
Haha, I see... What exactly is the error you get with my solution? I am trying it and it works fine. – RodrigoOlmo Jun 01 '14 at 22:44
It seems that if I replace "in" with "==", the function fails to return any line number; same with my split() method. I can only use "in". – user3697665 Jun 01 '14 at 22:51
yes, and using strip() – user3697665 Jun 01 '14 at 22:54
That is weird, it works fine for me... – RodrigoOlmo Jun 01 '14 at 22:55
I'm using the 3.3 version, what version is yours? – user3697665 Jun 01 '14 at 23:00
