Creating a dictionary from a string where the values are the vowel counts from each word?

Question

I have the following string:

S = "to be or not to be, that is the question?"

I want to be able to create a dictionary that has the output of

{'question': 4, 'is': 1, 'be,': 1, 'or': 1, 'the': 1, 'that': 1, 'be': 1, 'to': 1, 'not': 1}

where I get the number of vowels in each word next to the word, not the count of each word itself. So far I have:

{x:y for x in S.split() for y in [sum(1 for char in word if char.lower() in set('aeiou')) for word in S.split()]}

with an output of:

{'or': 4, 'the': 4, 'question?': 4, 'be,': 4, 'that': 4, 'to': 4, 'be': 4, 'is': 4, 'not': 4}

How do I get a dictionary from a string where the values are the vowel counts from each word?

`{'tell':1, 'me':1, 'what':1, 'I':1, 'tell':1, 'you,':2, 'to':1, 'you':2}` is not a valid dictionary, because some keys appear in it more than once. – Marcus Müller, Sep 14 '15 at 16:41
Nikki, welcome to StackOverflow. I didn't think this was a -6 question, so I upvoted it. In the future, try to set your question clearly apart and phrase it as a question so that you don't get this reception again. If you accept an answer, it will add two to your rep. Cheers. I'll try to help you restate the question here. – Russia Must Remove Putin, Sep 14 '15 at 17:01

score 1 · Answer 1 · answered Sep 14 '15 at 16:49

number of vowels in each word next to the word, not the count of each word itself?

>>> s = "to be or not to be, that is the question"

First remove punctuation (this two-argument form of translate is specific to Python 2's str):

>>> new_s = s.translate(None, ',?!.')
>>> new_s
'to be or not to be that is the question'
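For Python 3, where `str.translate` accepts only a mapping table, a minimal sketch of the same punctuation-removal step might look like the following. It assumes the same four punctuation characters as above and is not part of the original answer.

```python
# Python 3 sketch of the punctuation-removal step (an assumption:
# the original answer targets Python 2).
s = "to be or not to be, that is the question?"

# The third argument to str.maketrans lists the characters to delete.
table = str.maketrans('', '', ',?!.')
new_s = s.translate(table)
print(new_s)
```

The remaining steps (splitting and the dict comprehension) should work unchanged on either version.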

Then split on whitespace:

>>> split = new_s.split()
>>> split
['to', 'be', 'or', 'not', 'to', 'be', 'that', 'is', 'the', 'question']

Now build the dictionary with a comprehension that counts the vowels in each word. Repeated words collapse into a single key, so there are no redundant entries:

>>> vowel_count = {i: sum(c.lower() in 'aeiou' for c in i) for i in split}
>>> vowel_count
{'be': 1, 'that': 1, 'is': 1, 'question': 4, 'to': 1, 'not': 1, 'the': 1, 'or': 1}

score -1 · Answer 2 · Nir Alfasi · 2015-09-14T17:01:54.380

You can use re (the regex module) to find all the valid words (\w+ matches runs of word characters, so it excludes spaces and commas), and use Counter to count how often each word occurs:

import re

from collections import Counter
s = "tell me what I tell you, to you"
print Counter(re.findall(r'\w+', s))

OUTPUT

Counter({'you': 2, 'tell': 2, 'me': 1, 'what': 1, 'I': 1, 'to': 1})
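Note that `Counter` here counts how often each word occurs, not how many vowels it contains. If vowel counts are what is wanted, the same `re.findall` word extraction could be combined with a dict comprehension, roughly as sketched below. The function name `vowel_counts` is hypothetical, and the sketch assumes Python 3 and treats only a, e, i, o, u as vowels.

```python
import re

def vowel_counts(text):
    """Map each distinct word in text to its number of vowels (a, e, i, o, u)."""
    # \w+ skips spaces and punctuation, so "be," and "be" become the same key.
    words = re.findall(r'\w+', text)
    # Each True from the membership test counts as 1 in the sum.
    return {w: sum(ch in 'aeiou' for ch in w.lower()) for w in words}

counts = vowel_counts("to be or not to be, that is the question?")
print(counts)
```

Because the regex drops punctuation before the words become keys, no separate punctuation-stripping step is needed with this approach.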

edited Sep 14 '15 at 17:01

answered Sep 14 '15 at 16:46


Weird downvote... it would have been better if some feedback had been posted in a comment. – Nir Alfasi Sep 15 '15 at 02:51
