How can I find if a word (string) occurs more than once in an input/list in python

Question

For example, if the input is: ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY

My program must return: The word ‘COUNTRY’ occurs in the 5th and 17th positions.

I only need help with the part that finds whether the string occurs more than once.

This is my attempt so far. I am new to Python, so sorry if my question seems too easily answered.

# wordsList=[]
words=input("Enter a sentence without punctuation:\n")
# wordsList.append(words)
# print(wordsList)
for i in words:
    if i in words>1:
        print(words)
# words.split("  ")
# print(words[0])

Please include the code you wrote that isn't producing the right output — dfundako, Apr 01 '20 at 18:08
Sorry for the inconvenience, I edited my post. Unfortunately, my program even raises an error. — David Dallakyan, Apr 01 '20 at 18:14
Take a look at this old question: https://stackoverflow.com/questions/4664850/how-to-find-all-occurrences-of-a-substring — Arne, Apr 01 '20 at 18:18
What happens if the word is in the string 10,000 times? Should it say "The word XYZ occurs in the 1st and 2nd and 3rd and 4th and 5th and 6th....."? — dfundako, Apr 01 '20 at 18:19

Keivan · Answer 1 · 2020-04-01T18:35:30.103

To find the number of occurrences

There are probably several ways of doing it. One simple way would be to split your sentence to a list and find the number of occurrences.

sentence = "ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY" 
words_in_a_list = sentence.split(" ")
words_in_a_list.count("COUNTRY")

You could also use regular expressions, which is also very easy to do.

import re

m = re.findall("COUNTRY", sentence)
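One caveat worth sketching: a plain pattern such as "YOU" also matches inside "YOUR". The helper below (count_whole_word is a hypothetical name, not part of the answer above) assumes you want whole words only and uses \b word boundaries for that.

```python
import re

sentence = "ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY"

def count_whole_word(word, text):
    # re.escape guards against regex metacharacters in the word;
    # \b anchors the match at word boundaries.
    return len(re.findall(r"\b" + re.escape(word) + r"\b", text))

plain = len(re.findall("YOU", sentence))   # also counts the YOU inside YOUR
whole = count_whole_word("YOU", sentence)  # counts only the standalone word
print(plain, whole)  # 4 2
```

A word occurs more than once when count_whole_word(word, sentence) > 1.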

To find the location of each occurrence

You probably want to read this post. You can use search, which returns the span of the match as well, and write a loop to find them all: once you know the location of one match, start the next search one character past its start.

import re

def count_num_occurences(word, sentence):
    start = 0
    pattern = re.compile(word)
    start_locations = []
    while True:
        # search from `start` onwards; returns None when nothing is left
        match_object = pattern.search(sentence, start)

        if match_object is not None:
            start_locations.append(match_object.start())
            # resume one character past the start of this match
            start = 1 + match_object.start()
        else:
            break
    return start_locations
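The loop above yields character offsets, while the question asks for word positions (5th and 17th). A minimal sketch of the conversion follows; offsets_to_word_positions is a hypothetical helper, and it assumes each offset falls at the start of a word in a whitespace-separated sentence.

```python
import re

sentence = "ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY"

def offsets_to_word_positions(text, offsets):
    # The 1-based word position of a match is the number of
    # whitespace-separated words before its offset, plus one.
    return [len(text[:offset].split()) + 1 for offset in offsets]

# Character offsets of each match, as the search loop would collect them.
offsets = [m.start() for m in re.finditer("COUNTRY", sentence)]
word_positions = offsets_to_word_positions(sentence, offsets)
print(offsets)         # [18, 70]
print(word_positions)  # [5, 17]
```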

How does this produce the desired output of which positions the words are in? — dfundako, Apr 01 '20 at 18:19

Mace · Answer 2 · 2020-04-01T18:46:59.610

str = 'ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY'

# split your sentence and make it a set to get the unique parts
# then make it a list so you can iterate
parts = list(set(str.split(' ')))

# count the number of occurrences of each part in the str
# note: str.count counts substrings, so 'YOU' is also counted inside 'YOUR'
for part in parts:
    print(f'{part} {str.count(part)}x')

result

COUNTRY 2x
YOU 4x
ASK 2x
YOUR 2x
CAN 2x
NOT 1x
DO 2x
WHAT 2x
FOR 2x
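If whole-word counts are wanted instead (so that YOU is not counted inside YOUR), one possible sketch is to count the items of the split list with collections.Counter. The names counts and repeated are illustrative only.

```python
from collections import Counter

sentence = "ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY"

# Counter tallies each whole word produced by split().
counts = Counter(sentence.split())

# Keep only the words that occur more than once.
repeated = [word for word, n in counts.items() if n > 1]

print(counts["YOU"])  # 2
print(repeated)
```

Here NOT is the only word left out of repeated, since it occurs once.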

or with positions

import re

str = 'ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR DO YOUR COUNTRY'

# split your sentence and make it a set to get the unique parts
# then make it a list so you can iterate
parts = list(set(str.split(' ')))

# count the number of occurrences of each part in the str
# note: both str.count and re.finditer match substrings,
# so 'YOU' is also found inside 'YOUR'
for part in parts:
    print(f'{part} {str.count(part)}x')
    for m in re.finditer(part, str):
        print('     found at', m.start())

result

DO 3x
     found at 30
     found at 58
     found at 65
ASK 2x
     found at 0
     found at 41
COUNTRY 2x
     found at 18
     found at 73
YOUR 2x
     found at 13
     found at 68
WHAT 2x
     found at 8
     found at 45
YOU 4x
     found at 13
     found at 37
     found at 50
     found at 68
NOT 1x
     found at 4
FOR 2x
     found at 33
     found at 61
CAN 2x
     found at 26
     found at 54

Thank you for your help. May I ask what f' does in the last line? Does it call a function? — David Dallakyan, Apr 01 '20 at 18:45
Typo, sorry I will remove it. f'{ }' is used for formatting text with variables. See f'{part} {str.count(part)}x'. It makes a string with the values of the 2 expressions inside {}. — Mace, Apr 01 '20 at 18:46

mattyx17 · Answer 3 · 2020-04-01T19:03:55.610

If you want only the words that occur more than once:

words=input("Enter a sentence without punctuation:\n").strip().split()
word_counts = {}

for word in words:
    if word in word_counts:
        word_counts[word] += 1
    else:
        word_counts[word] = 1

for word in word_counts.keys():
    if word_counts[word] > 1:
        print(word)

This just stores all the counts in a dictionary and then loops through the dictionary to print the words that occur more than once.

It is also efficient, as it only goes through the input once and then once more through the dictionary.

If you want the actual positions of the words:

words=input("Enter a sentence without punctuation:\n").strip().split()
word_counts = {}

# enumerate yields each word together with its 0-based index
for i, word in enumerate(words):
    if word in word_counts:
        word_counts[word].append(i)  # keep a list of indices
    else:
        word_counts[word] = [i]

for word in word_counts.keys():
    if len(word_counts[word]) > 1:
        print("{0} found in positions: {1}".format(word, word_counts[word]))
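To get the exact wording the question asks for, the positions need to be 1-based and carry an ordinal suffix. A rough sketch, where ordinal and describe_repeats are hypothetical helper names and the output sentence format is copied from the question:

```python
from collections import defaultdict

def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

def describe_repeats(sentence):
    # Map each word to the list of its 1-based positions.
    positions = defaultdict(list)
    for i, word in enumerate(sentence.split(), start=1):
        positions[word].append(i)
    # Build one line per word that occurs more than once.
    lines = []
    for word, where in positions.items():
        if len(where) > 1:
            joined = " and ".join(ordinal(p) for p in where)
            lines.append(f"The word '{word}' occurs in the {joined} positions.")
    return lines

sentence = "ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY"
for line in describe_repeats(sentence):
    print(line)
```

For a word that repeats many times, the " and " join gets long; a comma-separated list may read better in that case.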

May I ask why it is important for word_counts to be a dictionary instead of a list, and how that helps? I am a beginner in Python. Thank you in advance. — David Dallakyan, Apr 01 '20 at 18:44
By having word_counts as a dictionary, you store each word along with its count. The first loop would produce something like `{"ASK": 2, "NOT": 1, "WHAT": 2, ...}`. Then the second loop prints out the words where the corresponding count is greater than 1. If you used a list, you would have to loop through the list for each word to get its count and then print it, which is O(n^2). With a dictionary you only loop through the words twice. — mattyx17, Apr 01 '20 at 18:51
