
I'm converting some PHP code to Python. The PHP code uses the following to match a surname from a string:

 preg_match("/([^ ]*) *$/",$c['name'],$r);

which appears to just match whatever the last word in the string is.

Looking to convert it, I thought the following would do it:

 r = re.match('([\S]+)$', c['name'])

To my mind, that should match any non-space characters before the end of the string. The logic is that $ matches the end of the string, and ([\S]+) matches one or more non-space characters immediately before it. However, that doesn't work for me (despite a site like pythex.org suggesting it should).

I've managed to get it to work using the following, but I'd like to know why the regex above doesn't work. I suspect it has to do with back-capture, but I'm not really familiar with how that works. For testing I'm just using c['name'] = 'John Doe'.

 r = re.match('(?:.*\s)*([\S]+)', c['name'])

(In the above regex, (?:.*\s) is a non-capturing group that matches any run of characters followed by a whitespace character, and the group is repeated zero or more times. The pattern then captures one or more non-whitespace characters.)
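
The workaround can be checked with a minimal sketch like the one below. Only 'John Doe' comes from the question; the other names are hypothetical inputs added for illustration.

```python
import re

# Workaround pattern from the question, written as a raw string.
pattern = re.compile(r'(?:.*\s)*([\S]+)')

# 'John Doe' is the question's test value; the other two are assumed inputs.
names = ["John Doe", "Mary Ann Smith", "Cher"]

# re.match succeeds here because the pattern itself can consume
# everything from position 0 up to the last word.
surnames = [pattern.match(name).group(1) for name in names]

for name, surname in zip(names, surnames):
    print(f"{name!r} -> {surname!r}")
```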

Thanks.

Rob Murray
Doldge
  • `re.match` only searches for a match *at the beginning of a string*. You need to use `re.search` to search anywhere in the string. – Wiktor Stribiżew Nov 24 '15 at 11:08
  • @stribizhev thanks for that. I'm a little surprised to learn match only matches to the start of a string. That does explain some things though. – Doldge Nov 24 '15 at 12:10
  • Every language has its own set of methods to work with regex. You need to study documentation first, before using a regex, to avoid any confusion. `match` is an overly-used term, so be careful and do not trust what you may think it means. Java `String#matches()` and C++ `regex_match` expect the full string to match the pattern, while .NET `Regex.Match` acts as `re.search` in Python. – Wiktor Stribiżew Nov 24 '15 at 12:20
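
The difference the comments describe can be seen in a minimal sketch using the question's test value. The last pattern is the PHP regex carried over unchanged, and the input with trailing spaces is an assumed example, not something from the original code.

```python
import re

name = "John Doe"  # test value from the question

# re.match only tries the pattern at position 0. "John" is followed by a
# space rather than the end of the string, so the attempt fails.
anchored = re.match(r'([\S]+)$', name)

# re.search tries every starting position and succeeds at "Doe".
scanned = re.search(r'([\S]+)$', name)

# The PHP pattern used with re.search; it also tolerates trailing
# spaces (the padded input here is a hypothetical example).
ported = re.search(r'([^ ]*) *$', "John Doe  ")

print(anchored)          # None
print(scanned.group(1))  # Doe
print(ported.group(1))   # Doe
```

PHP's `preg_match` scans the whole subject string, so `re.search` is the closer counterpart when porting.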
