I'm looking for a way to find the words that contain exactly a given number of occurrences of a given character.

For example, if the input is ['teststring1','strringrr','wow','strarirngr'] and we are looking for 4 r characters, it should return only ['strringrr','strarirngr'], because those are the words that contain exactly 4 r letters.

I decided to use regex, but after reading the documentation I can't find a function that does what I need. I tried [r{4}], but it apparently matches any word that contains the letter r. Please help.

Ismael Padilla
Does this answer your question? [Count the number occurrences of a character in a string](https://stackoverflow.com/questions/1155617/count-the-number-occurrences-of-a-character-in-a-string) – AMC Jan 20 '20 at 03:02

4 Answers

Something like this: count every character in each word, then keep the words where r occurs exactly 4 times.

import collections

def map_characters(string):
    # map each character to the number of times it occurs in the string
    characters = collections.defaultdict(int)
    for char in string:
        characters[char] += 1
    return characters


items = ['teststring1','strringrr','wow','strarirngr']

for item in items:
    characters_map = map_characters(item)
    # print the word only if it contains exactly 4 r characters
    if characters_map['r'] == 4:
        print(item)

# this prints strringrr and strarirngr,
# because each of these words has exactly 4 r letters
alex2007v

You can use str.count() to count the occurrences of a character, combined with a list comprehension to build a new list:

myArray =  ['teststring1','strringrr','wow','strarirngr']

letter = "r"
amount = 4

filtered = [item for item in myArray if item.count(letter) == amount]
print(filtered) # ['strringrr', 'strarirngr']

If you wanted to make this reusable (to look for different letters or different amounts), you could pack it into a function:

def filterList(stringList, pattern, occurrences):
    return [item for item in stringList if item.count(pattern)==occurrences]


myArray =  ['teststring1','strringrr','wow','strarirngr']
letter = "r"
amount = 4

print(filterList(myArray, letter, amount)) # ['strringrr', 'strarirngr']
Ismael Padilla

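Square brackets define a character class, which matches one character from the set, e.g. [abc] matches a single a, b or c. Inside the brackets the braces lose their meaning as a quantifier, so [r{4}] matches any single r, {, 4 or }, which is why every word containing an r matches. Without the brackets, r{4} is a quantifier, but it only matches four consecutive r characters (rrrr), so on its own it will not find words like strarirngr where the r's are spread out.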
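
To see the difference between the character class and the quantifier, here is a minimal sketch using the word list from the question. The variable names char_class_hits and consecutive_hits are only illustrative.

```python
import re

words = ['teststring1', 'strringrr', 'wow', 'strarirngr']

# [r{4}] is a character class: it matches ONE character
# that is 'r', '{', '4' or '}'.
char_class_hits = [w for w in words if re.search(r'[r{4}]', w)]

# r{4} is a quantifier: it matches four CONSECUTIVE r characters ("rrrr").
consecutive_hits = [w for w in words if re.search(r'r{4}', w)]

print(char_class_hits)   # ['teststring1', 'strringrr', 'strarirngr']
print(consecutive_hits)  # []
```

Neither pattern counts r's that are spread through the word, which is what the question asks for.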

Ella Blackledge

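Since you asked about using regex, you could use the following. The pattern is anchored with re.fullmatch so that words with more than four r's are rejected:

import re
l = ['teststring1', 'strringrr', 'wow', 'strarirngr']
# [^r]* skips non-r characters; (?:r[^r]*){4} consumes exactly four r's
[ word for word in l if re.fullmatch(r'[^r]*(?:r[^r]*){4}', word) ]

output: ['strringrr', 'strarirngr']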
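
One thing worth checking with any regex approach is whether it enforces exactly four r's or only at least four. A pattern like (.*r.*){4} used with re.match accepts any word with four or more. The sketch below compares the two on a list that adds a five-r word; has_exactly is a hypothetical helper, and it assumes char is a single character.

```python
import re

def has_exactly(word, char, n):
    # Hypothetical helper: True if `word` contains `char` exactly n times.
    # Assumes `char` is a single character.
    c = re.escape(char)
    # [^c]* skips other characters; (?:c[^c]*){n} consumes exactly n
    # occurrences; fullmatch requires the whole word to fit the pattern.
    pattern = rf'[^{c}]*(?:{c}[^{c}]*){{{n}}}'
    return re.fullmatch(pattern, word) is not None

words = ['teststring1', 'strringrr', 'wow', 'strarirngr', 'rrrrr']

at_least = [w for w in words if re.match(r'(.*r.*){4}', w)]
exactly = [w for w in words if has_exactly(w, 'r', 4)]

print(at_least)  # ['strringrr', 'strarirngr', 'rrrrr']
print(exactly)   # ['strringrr', 'strarirngr']
```

For plain character counting, word.count('r') == 4 gives the same result with less machinery; the regex form is mainly useful if the condition has to live inside a larger pattern.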

gregory