
I have two configuration strings:

  1. jdbc:mysql://127.0.0.1:3306/DataBaseName
  2. jdbc:mysql://127.0.0.1:3306/DataBaseName?encoding=utf8

I want to extract the DataBaseName field from them, so I used the code below to match that field:

import re

match = re.search(r'(?<=/)(\w+)(?=[\?$])', config_str)
if match is not None:
    # The pattern has a single capturing group, so it is group 1.
    print(match.group(1))

But it did not work.

I tried and confirmed that:

  • (?<=/)(\w+)(?=$) matches jdbc:mysql://127.0.0.1:3306/DataBaseName
  • (?<=/)(\w+)(?=\?) matches jdbc:mysql://127.0.0.1:3306/DataBaseName?encoding=utf8
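The two checks above, together with the failing combined pattern, can be reproduced with a short script. This is only a sketch: the variable names (`plain`, `with_query`, and so on) are made up for illustration and are not part of my real code.

```python
import re

# Hypothetical names for the two configuration strings from the question.
plain = "jdbc:mysql://127.0.0.1:3306/DataBaseName"
with_query = "jdbc:mysql://127.0.0.1:3306/DataBaseName?encoding=utf8"

# Look-ahead for the end of the string only.
end_only = re.search(r"(?<=/)(\w+)(?=$)", plain)

# Look-ahead for a literal question mark only.
question_only = re.search(r"(?<=/)(\w+)(?=\?)", with_query)

# Both alternatives packed into a character class, as in my original code.
combined = re.search(r"(?<=/)(\w+)(?=[\?$])", plain)

print(end_only.group(1))
print(question_only.group(1))
print(combined)  # no match object is returned for the string without '?'
```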

So I think the reason is that special characters do not work inside square brackets.

Is this true? And how can I make my code work?

WKPlus
  • Indeed, `$` *is not special* in a character class. See [Python regex - why does end of string ($ and \Z) not work with group expressions?](http://stackoverflow.com/q/12763548) – Martijn Pieters Jun 16 '14 at 08:37
  • `(?=(?:\?|$))` would match either a question mark or the end in the look-ahead. – Martijn Pieters Jun 16 '14 at 08:39
  • Yes, `(?=(?:\?|$))` works, and `(?:\?|$)` works too. I found an explanation here: https://docs.python.org/2/library/re.html. It says that character classes such as `\w` or `\S` are accepted inside a set, but it does not mention characters like `$` or `\Z`. So I think most special characters (except `\w` and `\S`) are not supported in a character set. – WKPlus Jun 16 '14 at 08:58
  • That's because `\S` and `\w` are *predefined character sets*. You can get the same with `[^ \t\r\n]` or `[a-zA-Z0-9_]`, respectively. – Martijn Pieters Jun 16 '14 at 09:04
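Putting the comments together, a minimal sketch of the suggested fix might look like the following. It assumes the database name is always the last path segment; the names `CONFIG_STRINGS`, `DB_NAME`, and `extract_db_name` are hypothetical and only used for illustration. The last lines also show that `$` inside square brackets is treated as a literal dollar sign rather than an end-of-string anchor.

```python
import re

# Hypothetical sample inputs mirroring the two strings from the question.
CONFIG_STRINGS = [
    "jdbc:mysql://127.0.0.1:3306/DataBaseName",
    "jdbc:mysql://127.0.0.1:3306/DataBaseName?encoding=utf8",
]

# Alternation inside the look-ahead: a literal '?' or the end of the string.
DB_NAME = re.compile(r"(?<=/)(\w+)(?=\?|$)")


def extract_db_name(config_str):
    """Return the captured name, or None when nothing matches."""
    match = DB_NAME.search(config_str)
    return match.group(1) if match else None


names = [extract_db_name(s) for s in CONFIG_STRINGS]
print(names)

# Inside a character class, '$' loses its anchor meaning and matches itself.
literal = re.search(r"[\?$]", "cost: $5").group()
print(literal)
```

With alternation, the anchor stays outside any character class, so it keeps its zero-width meaning.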
