Python Regex: Symbol + in every letter in the same word

Question

I am using Python. I want to write a regex that allows the following examples:

Day
Dday
Daay
Dayy
Ddaay
Ddayy
...

So each letter of the word may appear one or more times. Is there an expression that makes this easy to write? I have a lot of words. Thanks

So you want a regex that matches "one or more letters"? Or something else? Can you give an example of something that your regex _shouldn't_ match? — Kevin, Oct 05 '17 at 15:15
Why don't you just split your word into an array of chars and then put it back together with `+` after each character? You'd end up with `d+a+y+`: https://stackoverflow.com/questions/15418561/convert-a-word-to-a-list-of-chars — ctwheels, Oct 05 '17 at 15:16

score 1 · Answer 1 · answered Oct 05 '17 at 15:21

We can try using the following regex pattern:

^([A-Za-z])\1*([A-Za-z])\2*([A-Za-z])\3*$

This matches and captures a single letter, followed by any number of repetitions of that letter. The \1 in the pattern is a backreference to the first captured letter (and likewise \2 and \3 for the second and third). Note that the pattern is written for three-letter words and accepts any three letters, not only D, a and y.

Code:

import re

word = "DdddddAaaaYyyyy"
# re.I makes the backreferences case-insensitive, so "D" is also repeated by "ddddd"
matchObj = re.match(r'^([A-Za-z])\1*([A-Za-z])\2*([A-Za-z])\3*$', word, re.M | re.I)

if matchObj:
    print("matchObj.group() : ", matchObj.group())
    print("matchObj.group(1) : ", matchObj.group(1))
    print("matchObj.group(2) : ", matchObj.group(2))
    print("matchObj.group(3) : ", matchObj.group(3))
else:
    print("No match!!")
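
To see what the backreference pattern accepts and rejects, here is a small sketch that runs it over a few inputs. The sample strings and the names `PATTERN`, `samples` and `results` are illustrative and not part of the original answer.

```python
import re

# Same pattern as in the answer: three captured letters,
# each followed by zero or more repeats of itself.
PATTERN = re.compile(r'^([A-Za-z])\1*([A-Za-z])\2*([A-Za-z])\3*$', re.I)

# Illustrative inputs (assumed, not taken from the question).
samples = ["Ddaayy", "Ccaatt", "Da", "Day1"]
results = {s: bool(PATTERN.match(s)) for s in samples}
print(results)
# {'Ddaayy': True, 'Ccaatt': True, 'Da': False, 'Day1': False}
```

"Ccaatt" matches because the pattern does not name specific letters, so this approach suits cases where any three-letter word with stretched letters is acceptable.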

Eugene Yarmash · Answer 2 · 2017-10-05T15:48:08.030

To match a character one or more times you can use the + quantifier. To build the full pattern dynamically, split the word into characters and add a + after each of them:

pattern = "".join(char + "+" for char in word)

Then just match the pattern case insensitively.

Demo:

>>> import re
>>> word = "Day"
>>> pattern = "".join(char + "+" for char in word)
>>> pattern
'D+a+y+'
>>> words = ["Dday", "Daay", "Dayy", "Ddaay", "Ddayy"]
>>> all(re.match(pattern, word, re.I) for word in words)
True
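
One thing worth keeping in mind: re.match only anchors at the start of the string, so an input with trailing characters such as "Dayz" would still pass. A possible variant, assuming the whole string should match and that words may contain regex metacharacters, uses re.escape and fullmatch. The helper name `stretched_pattern` is hypothetical.

```python
import re

def stretched_pattern(word):
    """Hypothetical helper: compile a pattern where every character of
    `word` may repeat one or more times, ignoring case."""
    # re.escape keeps characters such as "." or "?" literal.
    return re.compile("".join(re.escape(ch) + "+" for ch in word), re.IGNORECASE)

day = stretched_pattern("Day")

# fullmatch requires the entire string to fit the pattern.
checks = {s: bool(day.fullmatch(s)) for s in ["Ddaay", "DAYY", "Dy", "Dayz"]}
print(checks)  # {'Ddaay': True, 'DAYY': True, 'Dy': False, 'Dayz': False}

# match only checks the beginning, so the trailing "z" goes unnoticed.
prefix_only = bool(day.match("Dayz"))
print(prefix_only)  # True
```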

score 0 · Answer 3 · answered Oct 05 '17 at 15:20

Try /d+a+y+/gi (regex-literal notation; in Python, use the pattern d+a+y+ with the re.IGNORECASE flag):

d+ Matches d one or more times.
a+ Matches a one or more times.
y+ Matches y one or more times.
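
In Python there is no /.../gi literal. A rough equivalent, sketched here with an assumed sample string, is re.findall (which plays the role of the g flag) together with re.IGNORECASE (the i flag).

```python
import re

# Sample text is illustrative, not from the question.
text = "Day dday DAAY dy"

# findall returns every non-overlapping match, like the "g" flag;
# re.IGNORECASE corresponds to the "i" flag.
found = re.findall(r"d+a+y+", text, flags=re.IGNORECASE)
print(found)  # ['Day', 'dday', 'DAAY']
```

Without anchors or word boundaries this also finds the pattern inside longer words (for example the "day" in "Monday"), so wrapping it in \b on both sides may be appropriate depending on the input.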

Ethan

Hi David, I think the OP only gave this data as an example to show the kind of allowed matches. – Tim Biegeleisen Oct 05 '17 at 15:22

score 0 · Answer 4 · answered Oct 05 '17 at 15:52

As per my original comment, the code below does exactly what I described there.

Since you want to be able to use this on many words, I think this is what you're looking for.

import re

word = "day"

regex = r"^"+("+".join(list(word)))+"+$"

test_str = ("Day\n"
    "Dday\n"
    "Daay\n"
    "Dayy\n"
    "Ddaay\n"
    "Ddayy")

matches = re.finditer(regex, test_str, re.IGNORECASE | re.MULTILINE)

for matchNum, match in enumerate(matches):
    matchNum = matchNum + 1

    print ("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum = matchNum, start = match.start(), end = match.end(), match = match.group()))

This works by converting the string into a list of characters, joining that list back into a string with + between the characters, and appending a final + after the last one. The resulting regex is ^d+a+y+$. Since the input you presented is separated by newline characters, I've added re.MULTILINE so that ^ and $ match at the start and end of each line.
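
Since the question mentions having a lot of words, the same construction could be extended to several words at once by joining the per-word patterns with |. This is only a sketch: the function name `stretched_word_regex`, the word list and the test string are assumptions made for illustration.

```python
import re

def stretched_word_regex(words):
    """Hypothetical helper: one regex matching a stretched form of any
    word in `words`, one candidate per line."""
    alternatives = (
        "".join(re.escape(ch) + "+" for ch in word)
        for word in words
    )
    # (?:...) groups the alternatives so ^ and $ apply to each of them.
    return re.compile(
        r"^(?:" + "|".join(alternatives) + r")$",
        re.IGNORECASE | re.MULTILINE,
    )

regex = stretched_word_regex(["day", "night"])
test_str = "Day\nDdaay\nNiiight\nDy\nDays"

matches = [m.group() for m in regex.finditer(test_str)]
print(matches)  # ['Day', 'Ddaay', 'Niiight']
```

"Dy" is rejected because the a is missing, and "Days" because the trailing s is not part of any listed word.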
