
I am experimenting with regular expressions in Python. I know that ^ matches at the beginning of the search string. I have written a pattern that contains more than one ^, but I am not sure how re will try to match it against the search string.

re.match("^def/^def", "def/def")

I expected re to raise an error about an invalid regular expression, but it raises no error and simply returns no match.

So my question is: is "^def/^def" or "$def/$def" a valid regular expression?

Mangu Singh Rajpurohit
  • Doesn't your experiment demonstrate that the answer is "yes"? Note that those characters can also refer to the start and end of *lines*, in multiline mode. – jonrsharpe Mar 12 '18 at 09:04
  • Actually you can create lots of nonsense regular expressions. Putting "start of input" (`^`) in the middle is just one of many possibilities. Another would be something like `()*` (the empty string repeated any number of times). Some nonsense patterns are detected and rejected (e.g. `a{6,3}`, `a**`); others are silently accepted and will match either anything or nothing, depending on their nature. – Alfe Mar 12 '18 at 09:11

1 Answer


You do not have an invalid regular expression; ^ has legal uses in the middle of a pattern. For example, the documentation for the re.M flag says:

When specified, the pattern character '^' matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character '$' matches at the end of the string and at the end of each line (immediately preceding each newline).
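As a small illustration of the quoted behaviour, the sketch below runs the same pattern with and without re.M. The sample text and variable names are made up for the example.

```python
import re

# Illustrative two-line input (not from the question).
text = "abc\ndef"

# Without re.M, ^ matches only at the very start of the string.
plain = re.findall(r"^\w+", text)

# With re.M, ^ also matches immediately after each newline.
multi = re.findall(r"^\w+", text, re.M)

print(plain)  # ['abc']
print(multi)  # ['abc', 'def']
```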

It is also possible to create patterns with optional groups, where a later ^ would still match if all of the preceding pattern matched the empty string. Using the ^ in places it can't match is not something the parser checks for and no error will be raised.
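A minimal sketch of that situation, using a pattern whose leading part may match the empty string. The test strings are assumptions chosen for the example.

```python
import re

# [abc]* may match the empty string, so the ^ after it can still be
# satisfied at position 0 even though it is not the first pattern element.
pattern = re.compile(r"[abc]*^start")

# [abc]* matches nothing, ^ succeeds at position 0, then "start" matches.
hit = pattern.match("start of text")

# Here [abc]* has to backtrack to the empty string for ^ to succeed,
# and then the literal "start" does not match "abcst...", so there is no match.
miss = pattern.match("abcstart")

print(hit.group())  # start
print(miss)         # None
```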

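Your specific pattern will never match anything, because the ^ in the middle is unconditional: it can only succeed at the start of the string or, with the multiline flag, immediately after a newline. The character preceding it is always the / required by the pattern, which can never be that newline, so the match fails even if the multiline flag is enabled.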
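This can be checked directly. The sketch below compiles the pattern from the question with and without re.M and tries a few inputs. The last pattern, with an explicit \n before the second ^, is a hypothetical variant added only to show what a matchable version could look like. It is not something the question asked for.

```python
import re

# The pattern compiles fine: the parser does not check whether ^ can ever match.
pattern = re.compile(r"^def/^def")
multiline = re.compile(r"^def/^def", re.M)

result_plain = pattern.match("def/def")
result_multi = multiline.search("def/def")
# Even with a newline in the input, the character before the second ^ is "/".
result_newline = multiline.search("def/\ndef")

print(result_plain, result_multi, result_newline)  # None None None

# Hypothetical variant: consume the newline explicitly so ^ can match after it.
variant = re.compile(r"^def/\n^def", re.M)
result_variant = variant.search("def/\ndef")
print(result_variant is not None)  # True
```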

Martijn Pieters
  • Thank you for such detailed information. Could you please describe the cases you mention in your answer, where optional groups make it possible for ^ to match in the middle of a pattern? – Mangu Singh Rajpurohit Mar 12 '18 at 11:57
  • @ManguSinghRajpurohit: I meant that a pattern can have the `^` in the middle and still match. `r'[abc]*^start'` can match, for example, because the pattern before the `^` can match the empty string. – Martijn Pieters Mar 12 '18 at 20:16