Check in Python if a self-designed pattern matches

Question

I have a pattern which looks like:

abc*_def(##)

and I want to check whether it matches certain strings. For example, it matches:

abc1_def23
abc10_def99

but does not match:

abc9_def9

So the * stands for a number with one or more digits, and each # stands for a single digit. I want the value in the parentheses as the result.

What would be the easiest and simplest solution to this problem? Should I replace the * and # with regular-expression syntax and then check for a match? Like this:

    pattern = pattern.replace('*', '[0-9]*')
    pattern = pattern.replace('#', '[0-9]')
    pattern = '^' + pattern + '$'

Or should I program the matching myself?

I'm a bit confused by your question. Why is that your regular expression? Do you really want to match 0 or more copies of the letter c? Are you looking for something like `r'abc\d+_def\(\d\d\)'`? — abarnert, Mar 17 '18 at 06:09
I have many patterns that look like "abc*_def(##)", and I was wondering whether there is another solution besides replacing the * and # with real regex syntax. — Sir2B, Mar 17 '18 at 13:02

score 0 · Answer 1 · answered Mar 17 '18 at 07:39

Based on your requirements, I would go for a regex, for the simple reason that it is already available and tested, which makes it the easiest option, as you asked.

The only "complicated" thing in your requirements is avoiding, after def, the same digits you have after abc. This can be done with a negative lookahead that contains a backreference. The regex you can use is:

\babc(\d+)_def((?!\1)\d{1,2})\b

\b matches a word boundary; if you enclose your regex between two \b, you restrict the search to whole words, i.e. text delimited by spaces, punctuation, etc.
abc matches the string abc
\d+ matches one or more digits; if there is an upper limit on the number of digits, use \d{1,MAX}, where MAX is the maximum number of digits; \d stands for a digit and + means one or more repetitions
(\d+) is a group: the parentheses mark \d+ as something you want to "remember" inside your regex, somewhat like defining a variable; here (\d+) is the first group, since no other group is defined before it (i.e. to its left)
_def matches the string _def
(?!\1) is the part where you say "I don't want the first group repeated right after _def". \1 refers to the first group, while (?!whatever) is a check that succeeds only if what follows the current position is NOT whatever (the negation is given by the !).
\d{1,2} matches the one or two digits after _def; the parentheses around it and the lookahead make it the second group
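To see how the pieces described above behave together, here is a minimal Python 3 sketch that runs this regex over a few sample strings. The helper name `def_digits` and the extra sample `abc1_def12` are illustrative assumptions, not part of the question.

```python
import re

# Regex from this answer: the lookahead (?!\1) rejects a def-number
# that starts with the same digits captured after "abc".
PATTERN = re.compile(r'\babc(\d+)_def((?!\1)\d{1,2})\b')

def def_digits(text):
    """Return the digits after _def (group 2), or None if there is no match."""
    match = PATTERN.search(text)
    return match.group(2) if match else None

for sample in ['abc1_def23', 'abc10_def99', 'abc9_def9', 'abc1_def12']:
    print(sample, '->', def_digits(sample))
```

Note the last sample: because the lookahead compares against the digits after abc, this regex also rejects abc1_def12, even though it has two digits after def.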


Michael Swartz · Answer 2 · 2018-03-17T08:09:56.213

0

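I had the hardest time getting this to work. The trick was the $, which anchors the match to the end of the string so that any extra digits after the two are rejected.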

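#!python2

import re

# Sample strings; only the first two end in exactly two digits after "def".
yourlist = ['abc1_def23', 'abc10_def99', 'abc9_def9', 'abc955_def9', 'abc_def9', 'abc9_def9288', 'abc49_def9234']

for item in yourlist:
    # The trailing $ requires the two digits to be the end of the string.
    if re.search(r'abc[0-9]+_def[0-9][0-9]$', item):
        # A single-argument print() behaves the same on Python 2 and 3.
        print('%s is a match' % item)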
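The effect of the $ can be checked by running the same search with and without it. This is a small Python 3 sketch; the helper `matching` and the shortened sample list are assumptions made for illustration.

```python
import re

samples = ['abc1_def23', 'abc10_def99', 'abc9_def9', 'abc9_def9288']

def matching(regex, items):
    """Return the items in which re.search finds the regex."""
    return [item for item in items if re.search(regex, item)]

# Without $, two digits followed by more digits still count as a match.
print(matching(r'abc[0-9]+_def[0-9][0-9]', samples))

# With $, the two digits must be the last characters of the string.
print(matching(r'abc[0-9]+_def[0-9][0-9]$', samples))
```

Without the anchor, abc9_def9288 is accepted because the search stops after the first two digits; with it, only abc1_def23 and abc10_def99 remain.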

edited Mar 17 '18 at 08:09

answered Mar 17 '18 at 08:00


The fourth bird · Answer 3 · 2018-03-18T09:16:38.123

You could match your pattern like:

abc\d+_def(\d{2})

abc Match literally
\d+ Match 1 or more digits
_ Match underscore
def Match literally
( Capturing group (your 2 digits will be in this group)
\d{2} Match 2 digits
) Close capturing group

Then you could, for example, use re.search to check for a match and .group(1) to get the digits between the parentheses.
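A minimal Python 3 sketch of that search-then-group step, assuming a hypothetical helper named `two_digits`:

```python
import re

def two_digits(text):
    """Return the two digits after _def as a string, or None if nothing matches."""
    match = re.search(r'abc\d+_def(\d{2})', text)
    # re.search returns None on failure, so check before calling .group(1).
    return match.group(1) if match else None

print(two_digits('abc1_def23'))
print(two_digits('abc9_def9'))
```

The captured value is a string; wrap it in int() if a number is needed.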


You could also add word boundaries:

\babc\d+_def(\d{2})\b
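When the whole string has to match, rather than a word inside longer text, re.fullmatch can take the place of the \b anchors. The sketch below also builds the regex from the question's own notation, which may help when there are many such patterns. It is only one possible approach: the names `pattern_to_regex` and `extract` are hypothetical, it assumes Python 3.4+ (for re.fullmatch), and it uses [0-9]+ for * because the question's [0-9]* would also accept zero digits.

```python
import re

def pattern_to_regex(pattern):
    """Translate the question's notation into a compiled regex.

    '*' becomes one or more digits, '#' becomes exactly one digit,
    parentheses are kept as a capturing group, and every other
    character is escaped so it is matched literally.
    """
    parts = []
    for ch in pattern:
        if ch == '*':
            parts.append('[0-9]+')
        elif ch == '#':
            parts.append('[0-9]')
        elif ch in '()':
            parts.append(ch)
        else:
            parts.append(re.escape(ch))
    return re.compile(''.join(parts))

def extract(pattern, text):
    """Return the parenthesised part if the whole text matches, else None.

    Assumes the pattern contains one pair of parentheses.
    """
    match = pattern_to_regex(pattern).fullmatch(text)
    return match.group(1) if match else None

for sample in ['abc1_def23', 'abc10_def99', 'abc9_def9', 'abc9_def9288']:
    print(sample, '->', extract('abc*_def(##)', sample))
```

Escaping the literal characters keeps a pattern that happens to contain a dot or a plus sign from being read as regex syntax.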
