
I'm fairly new to regex (regular expressions) and need a little bit of help formulating a string. I understand it for the most part but got stumped when the text I needed to match had variables followed by an optional phrase.

Say the text is formatted something like "turn $1 [the] lights" where "$1" is the variable I want while "the" can be included or left out. I've tried the following blurb, "turn (.+) (?:the)?\s*lights", which works for "turn on lights":

>>> import re
>>> re.match(r"turn (.+) (?:the)?\s*lights", "turn on lights").groups()
('on',)

But when I include the "the" and try to match "turn on the lights", I get "on the" as my variable.

>>> re.match(r"turn (.+) (?:the)?\s*lights", "turn on the lights").groups()
('on the',)

Is this something that can be accomplished with the regex library? I apologize if the question is unclear, thank you in advance!

spatel4140

2 Answers


You just need to use a lazy quantifier for this:

turn (.+?) (?:the)?\s*lights
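
As a rough check of the lazy pattern in Python, a minimal sketch follows. The sample phrases and the `results` name are only illustrative assumptions, and the pattern is written as a raw string so that `\s` reaches the regex engine unchanged.

```python
import re

# Lazy pattern from the answer above; the raw string keeps \s intact.
pattern = re.compile(r"turn (.+?) (?:the)?\s*lights")

# Hypothetical sample phrases, chosen only for illustration.
phrases = ["turn on lights", "turn on the lights", "turn off the lights"]

# Map each phrase to the text captured by the first (and only) group.
results = {p: pattern.match(p).group(1) for p in phrases}

for phrase, captured in results.items():
    print(f"{phrase!r} -> {captured!r}")
```

Because `.+?` expands one character at a time, it stops as soon as the rest of the pattern can match, so the optional "the" is consumed by `(?:the)?` rather than by the capture group.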

RegEx Demo

anubhava

If I understand the question correctly, you're trying to write a regex that will match phrases that include, but are not limited to, the following:

  • turn on the lights
  • turn off the lights
  • turn on lights
  • turn off lights

A regex that does this can be written as such:

turn (.+?) (the )?lights

Broken down by part:

  • turn: straightforward.
  • (.+?): captures one or more of any character; the ? makes the quantifier lazy (not greedy), so it matches as few characters as possible.
  • (the )? is surrounded by parentheses so that the ? applies to the whole word, making all of it optional. This also creates a second capture group, which you can ignore, or avoid by writing (?:the )? instead.
  • lights: straightforward.
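
To see how the extra capture group shows up in practice, here is a small sketch in Python. The variable names and sample strings are assumptions made for illustration; the second pattern swaps in the non-capturing form `(?:the )?` for comparison.

```python
import re

text = "turn on the lights"

# Pattern from this answer: "(the )?" is a second capturing group.
capturing = re.match(r"turn (.+?) (the )?lights", text).groups()

# Same pattern with "(?:the )?", which groups without capturing.
non_capturing = re.match(r"turn (.+?) (?:the )?lights", text).groups()

# When "the" is absent, the optional capturing group yields None.
absent = re.match(r"turn (.+?) (the )?lights", "turn off lights").groups()

print(capturing)      # ('on', 'the ')
print(non_capturing)  # ('on',)
print(absent)         # ('off', None)
```

If only the variable is needed, `group(1)` returns it with either pattern, so the choice mainly matters when unpacking `groups()`.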
Austin Dean