11

So I need to match strings that are surrounded by |. So, the pattern should simply be r"\|([^\|]*)\|", right? And yet:

>>> pattern = r"\|([^\|]*)\|"
>>> re.match(pattern, "|test|")
<_sre.SRE_Match object at 0x10341dd50>
>>> re.match(pattern, "  |test|")
>>> re.match(pattern, "asdf|test|")
>>> re.match(pattern, "asdf|test|1234")
>>> re.match(pattern, "|test|1234")
<_sre.SRE_Match object at 0x10341df30>

It's only matching on strings that begin with |? It works just fine on regex101 and this is python 2.7 if it matters. I'm probably just doing something dumb here so any help would be appreciated. Thanks!

mdong
  • 345
  • 3
  • 10

3 Answers3

9

re.match will want to match the string starting at the beginning. In your case, you just need the matching element, correct? In that case you can use something like re.search or re.findall, which will find that match anywhere in the string:

>>> re.search(pattern, "  |test|").group(0)
'|test|'

>>> re.findall(pattern, "  |test|")
['test']
David542
  • 104,438
  • 178
  • 489
  • 842
7

In order to reproduce code that runs on https://regex101.com/, you have to click on Code Generator on the left handside. This will show you what their website is using. From there you can play around with flags, or with the function you need from re.

Note:

import re

regex = r"where"

test_str = "select * from table where t=3;"

matches = re.finditer(regex, test_str, re.MULTILINE)

for matchNum, match in enumerate(matches, start=1):

    print ("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum = matchNum, start = match.start(), end = match.end(), match = match.group()))

    for groupNum in range(0, len(match.groups())):
        groupNum = groupNum + 1

        print ("Group {groupNum} found at {start}-{end}: {group}".format(groupNum = groupNum, start = match.start(groupNum), end = match.end(groupNum), group = match.group(groupNum)))
louis_guitton
  • 5,105
  • 1
  • 31
  • 33
3

Python offers two different primitive operations based on regular expressions: re.match() checks for a match only at the beginning of the string, while re.search() checks for a match anywhere in the string (this is what Perl does by default).

Document

宏杰李
  • 11,820
  • 2
  • 28
  • 35