
I'm trying to count how many times a word appears in a text file, including when the word is attached to other letters.

Word to detect: Hello

Text: Hellooo, how are you?

Expected output: 1

Here is the code I have now:

total = 0

with open('text.txt') as f:
    for line in f:
        finded = line.find('Hello')
        if finded != -1 and finded != 0:
            total += 1

print(total)

Do you know how I can fix this problem?

Markel

2 Answers


You can go through each line word by word: splitting the line turns it into a list of words. Then iterate over the words and check whether the string occurs in each one:

total = 0

with open('text.txt') as f:
    # Iterate through lines
    for line in f:
        # Iterate through words by splitting on whitespace
        for word in line.split():
            # Match string in word
            if 'Hello' in word:
                total += 1

print(total)
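To try the same idea without a file, here is a minimal sketch that wraps the loop in a function and runs it on an in-memory string. The name `count_words_containing` and the sample text are made up for illustration, and the sketch uses the `print(...)` call form.

```python
def count_words_containing(text, target):
    """Count whitespace-separated words that contain target."""
    total = 0
    for line in text.splitlines():
        # split() with no argument splits on any run of whitespace
        for word in line.split():
            if target in word:
                total += 1
    return total


sample = "Hellooo, how are you?\nSay Hello to HelloHello"
print(count_words_containing(sample, "Hello"))  # prints 3
```

Note that each word is counted at most once, so `HelloHello` adds 1 to the total, not 2.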
willk

As suggested in the comment by @SruthiV, you can use re.findall from the re module:

import re

pattern = re.compile(r"Hello")

total = 0
with open('text.txt', 'r') as fin:
    for line in fin:
        total += len(pattern.findall(line))

print(total)

re.compile creates a compiled pattern object for the regex, here "Hello". Compiling once with re.compile can improve the program's performance and is recommended (by some) when the same pattern is used repeatedly. More here.

The rest of the program opens the file, reads it line by line, and looks for occurrences of the pattern in every line using re.findall. Since re.findall returns a list of matches, total is increased by the length of that list, i.e. the number of matches in a given line.
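A small sketch of what findall returns for a single line may help; the sample line is invented for illustration and the snippet uses the `print(...)` call form.

```python
import re

pattern = re.compile(r"Hello")

# Made-up sample line: "Hellooo" and "Hello" each match once, "HelloHello" twice
line = "Hellooo, how are you? Hello again, HelloHello"

matches = pattern.findall(line)
print(matches)       # ['Hello', 'Hello', 'Hello', 'Hello']
print(len(matches))  # 4
```

Matches do not overlap, and every match in the line ends up in the list, so one word can contribute more than once.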

Note: this program counts every occurrence of Hello, whether it is a separate word or part of another word. It is also case sensitive, so hello will not be counted.
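If that behaviour is not what you want, the pattern can be adjusted. The following is only a sketch with an invented sample string: it uses the re.IGNORECASE flag to ignore case and the `\b` word-boundary anchor to restrict where a match may start or end. Which variant fits depends on what should count as a match.

```python
import re

# Made-up sample text for illustration
text = "Hellooo, how are you? hello there. Othello said Hello."

# Any occurrence, ignoring case (also matches inside "Othello")
any_case = re.findall(r"hello", text, flags=re.IGNORECASE)

# Words that start with "hello", ignoring case
starts_with = re.findall(r"\bhello\w*", text, flags=re.IGNORECASE)

# The exact word "hello" only, ignoring case
exact = re.findall(r"\bhello\b", text, flags=re.IGNORECASE)

print(len(any_case))  # 4
print(starts_with)    # ['Hellooo', 'hello', 'Hello']
print(exact)          # ['hello', 'Hello']
```

For the example in the question ("Hellooo, how are you?"), the second variant counts 1, while the exact-word variant counts 0.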

atru