4
import re

text = "Life is beautiful"
pattern = r"[aeiou]{3,}"
result = re.findall(pattern, text)
print(result)

Desired result: ['beautiful']

The output I get: ['eau']

I have tried googling and found multiple answers, but none of them worked. I am new to regex, so I may be missing something, but I am not sure how to get the desired output.

I have also tried using r"\b[abcde]{3,}\b" and still got nothing, so please help!

Prab

5 Answers

4

Your regex only captures the 3 consecutive vowels, so you need to expand it to capture the rest of the word. This can be done by looking for a sequence of letters between two word breaks and using a positive lookahead for 3 consecutive vowels within the sequence. For example:

import re

text = "Life is beautiful"
pattern = r"\b(?=[a-z]*[aeiou]{3})[a-z]+\b"
result = re.findall(pattern, text, re.I)
print(result)

Output:

['beautiful']
Nick
  • Thank you very much for your answer, I was trying to figure this out! I wrote it this way, `\b[a-z]+[aeiou]{3,}[a-z]+\b`, and it outputs the same result. Would you be able to explain what `?=` is doing there? – Prab May 29 '20 at 06:27
  • Also, this solution is not able to handle `"Obviously, the queen is courageous and gracious."`: it skips `Obviously` but prints the rest. – Prab May 29 '20 at 06:28
  • @PrabinTamang `(?=` is a forward lookahead that asserts that at that point in parsing (just after the word break) there is some number of letters followed by 3 vowels. What you have done is essentially the same and the lookahead isn't really necessary. The regex does work for `Obviously` if you use the `re.I` flag, see https://ideone.com/P3yI1V – Nick May 29 '20 at 06:50
  • Thank you very much for your help! And yes, that worked using the ignore-case flag. – Prab May 29 '20 at 08:28
  • I tried this method with `text = ('Life', 'is', 'beautiful')`, `pattern = r"\b(?=[a-z]*[aeiou]{3})[a-z]+\b"`, `result = re.findall, [pattern, text, re.I]`, `print(result)`, but it doesn't work. I'm new to this. What am I doing wrong? I tried changing some things but got the same result. I was going to ask a question, but if I can get an answer here it would be great. @Nick – Riel Feb 28 '21 at 02:28
  • Hi Riel, if you have a question, you should always ask it unless it really can be answered by this one. You can refer to this one for context if necessary. But basically your issue is that your `text` is a tuple of strings, not a single string; you need to search each of the strings in `text` separately, something like `for t in text: print(re.findall(pattern, t, re.I))` – Nick Feb 28 '21 at 10:46
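
The points raised in the comments above (the effect of `re.I`, and searching a collection of words one string at a time) can be checked with a short sketch. It reuses the pattern from the answer; the names `sentence`, `words`, `case_sensitive`, `case_insensitive` and `matches` are only illustrative and do not come from the original posts.

```python
import re

# Pattern from the answer above: a whole word containing 3 consecutive vowels.
pattern = r"\b(?=[a-z]*[aeiou]{3})[a-z]+\b"

sentence = "Obviously, the queen is courageous and gracious."

# Without re.I, [a-z] cannot match the capital "O", so "Obviously" is skipped.
case_sensitive = re.findall(pattern, sentence)
# With re.I, both [a-z] and [aeiou] match letters of either case.
case_insensitive = re.findall(pattern, sentence, re.I)

print(case_sensitive)    # ['queen', 'courageous', 'gracious']
print(case_insensitive)  # ['Obviously', 'queen', 'courageous', 'gracious']

# re.findall expects a single string, so a tuple of words is searched item by item.
words = ("Life", "is", "beautiful")
matches = [m for w in words for m in re.findall(pattern, w, re.I)]
print(matches)           # ['beautiful']
```

Collecting the matches into one list is just one possible design; printing the result for each word separately, as suggested in the comment, works equally well.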
1

The first part of the regex, [\w]+, matches letters in upper or lower case (as well as digits and underscores, which is not necessary, but for the purpose of this question it works). We just need to match the characters leading up to a run of (at least) three vowels. Then we finish it off by matching the lowercase letters at the tail end of the word (the + means at least one must remain).

pattern = r"[\w]+[aeiou]{3,}[a-z]+"
0

A small improvement on the previous solution would be to use \w instead of a-z as the character class. This matches both lowercase and uppercase letters (as well as digits and underscores).

\b[\w]+[aeiou]{3,}[\w]+\b

Cheers!

gzoanetti
0
pattern=r"\b\w*[aeiou]{3,}\w*\b"

\w* matches any alphanumeric characters (or underscores) that "could" exist before and after the vowels.
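
As a rough comparison of the `+` and `*` variants suggested in these answers, the sketch below runs both on a few sample words. The sample text and the names `plus_pattern`, `star_pattern`, `plus_matches`, `star_matches` and `odd_match` are made up for illustration.

```python
import re

# "+" requires at least one word character on each side of the vowel run.
plus_pattern = r"\b\w+[aeiou]{3,}\w+\b"
# "*" makes the surrounding characters optional.
star_pattern = r"\b\w*[aeiou]{3,}\w*\b"

# "Hawaii" ends with its vowel run and "aioli" starts with one.
text = "Hawaii aioli beautiful"

plus_matches = re.findall(plus_pattern, text)
star_matches = re.findall(star_pattern, text)

print(plus_matches)  # ['beautiful']
print(star_matches)  # ['Hawaii', 'aioli', 'beautiful']

# \w also accepts digits and underscores, so non-words can match too.
odd_match = re.findall(star_pattern, "x_eau9")
print(odd_match)     # ['x_eau9']
```

If only letters should count as part of a word, a class such as `[a-zA-Z]` in place of `\w` avoids the last case.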

0

I know it's a late reply but just wanted to share this for whoever searches this up!

Answer: pattern = r"\b[a-zA-Z]*[aeiou]{3,}[a-z]*\b", or experiment with [\w]* instead of [a-zA-Z]*.

Change the first character class from [a-z] to [a-zA-Z] so that words starting with a capital letter are matched as well.

Zac