regex doesn't do what I think I tell it to do

Question

I have this string: FW:MKEQP2B4.BIN

I need to separate the part after FW: from the rest of the string. Since this is only part of a bigger string, and FW: can be formatted differently in other strings, I decided to use a regex.

I have the regex FW:.*(\S+).* and call .search() with it. It matches my string, but .group(1) is just the single character 'N'.

Why does the regex behave so oddly for such a seemingly easy task?

Your assumption is incorrect; regex *does* do what you tell it to. What you call "odd behavior" is, in fact, [documented](http://www.regular-expressions.info/repeat.html) and [discussed on SO](http://stackoverflow.com/questions/2301285/what-do-lazy-and-greedy-mean-in-the-context-of-regular-expressions). — Jongware, Jul 07 '14 at 10:50
If you'll always have an 8.3 filename at the end of the string, just use `my_str[-12:]` — jonrsharpe, Jul 07 '14 at 11:13

score 3 · Accepted Answer · answered Jul 07 '14 at 10:47

Your regex does exactly what you tell it to do; the pattern just doesn't say what you mean. The .* after FW: is greedy, so it consumes as much as it can and leaves only the last character for (\S+).

I guess you are looking for the lazy version:

FW:.*?(\S+).*?

In fact, you don't even need the .*? at all.

With FW:(\S+), group 1 will be MKEQP2B4.BIN
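To see the difference concretely, here is a minimal Python sketch that runs the three patterns from this thread against the question's sample string. The variable names (s, greedy, lazy, direct) are only illustrative.

```python
import re

s = "FW:MKEQP2B4.BIN"

# Greedy: .* first grabs everything after "FW:", then backtracks one
# character at a time until (\S+) can match, so the group gets one character.
greedy = re.search(r"FW:.*(\S+).*", s)

# Lazy: .*? matches as little as possible (here, nothing), so (\S+)
# starts right after "FW:" and takes the whole non-whitespace run.
lazy = re.search(r"FW:.*?(\S+).*?", s)

# Direct: no filler at all, same capture as the lazy version.
direct = re.search(r"FW:(\S+)", s)

print(greedy.group(1))  # N
print(lazy.group(1))    # MKEQP2B4.BIN
print(direct.group(1))  # MKEQP2B4.BIN
```

The greedy pattern still matches the whole string; only the split between .* and the capture group changes.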

aelor

thx, the regex engine I used up to now didn't need the `?`. I also do need the space check since this is human input and a space can be there – user3021085 Jul 07 '14 at 11:13
Mmm not sure this is very ethical aelor... I normally support your work, but I saw how your solution gradually morphed into mine. – zx81 Jul 07 '14 at 11:17

score 1 · Answer 2 · answered Jul 07 '14 at 10:48

This would work:

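import re

subject = "FW:MKEQP2B4.BIN"
# Capture the run of non-whitespace characters right after "FW:"
match = re.search(r"FW:(\S+)", subject)
if match:
    result = match.group(1)
else:
    result = ""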

In the demo, see Group 1 in the right pane.

zx81

score 0 · Answer 3 · answered Jul 07 '14 at 10:48

You could try the regex below to get the string after FW:

>>> import re
>>> s = 'FW:MKEQP2B4.BIN'  # avoid naming a variable str; it shadows the built-in
>>> m = re.search(r'(?<=FW:).*', s)
>>> m.group()
'MKEQP2B4.BIN'

If you don't want the .BIN part, the lazy .*? with a lookahead stops at the first dot:

>>> m = re.search(r'(?<=FW:).*?(?=\.)', s)
>>> m.group()
'MKEQP2B4'
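The asker notes in a comment that the input is typed by hand and a space may follow FW:. One way to allow for that, as a rough sketch: put \s* between the prefix and the capture group. The helper name extract_after_fw and the longer sample string are made up for illustration and are not from the thread.

```python
import re

# Hypothetical helper: returns the token after "FW:", or "" if there is none.
# \s* skips any whitespace a human may have typed after the colon.
FW_PATTERN = re.compile(r"FW:\s*(\S+)")

def extract_after_fw(text):
    match = FW_PATTERN.search(text)
    return match.group(1) if match else ""

# Made-up sample embedding the value in a larger string
print(extract_after_fw("ID:42 FW: MKEQP2B4.BIN OK"))
```

Note that \s also matches newlines, so if FW: can end a line with nothing after it, a narrower class such as [ \t]* may be the safer choice.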
