
What would be the correct syntax to use with the inline (?aiLmsux) flags in the re module? For example:

string = 'Hello There'
re.search(r'^(?i[a-z]+)(\s[a-z]+)?', string)

The above is an invalid Python expression, but basically I would like only the first [a-z]+ to match case-insensitively, so the matched string will be "Hello".

The closest I have been able to get is:

>>> re.search(r'(?i)^([a-z]+)(\s[a-z]+)?', string).group()
'Hello There'

# also, not what I want
>>> re.search(r'^([a-z]+)(\s[a-z]+)?', string, re.I).group()
'Hello There'

But this flag applies to the entire pattern, not just the first [a-z]+ part. How can I limit the scope of the (?i) flag?

Update: Note, the linked duplicate shows how to use the re.I flag as well as the (?i) flag on an entire pattern, but I'm asking how (or whether) it's possible to apply that flag only to a grouped sub-expression.

The equivalent regex should be:

# Only the first part -- [a-zA-Z] is made case-insensitive
>>> re.search(r'^[a-zA-Z]+(\s[a-z]+)?', string).group()
'Hello'
samuelbrody1249
  • @b_c the answer there is using a flag, also I'd only like it to work on part of a string. – samuelbrody1249 Nov 11 '19 at 19:49
  • Gotcha, missed the "partial" part :) – b_c Nov 11 '19 at 19:51
  • The marked duplicate here is not a complete answer as of 11/2019. It does not cover the full inline modifier usage in Python. I recommend @WiktorStribiżew not point (as _duplicates_) all the Python inline modifier questions to that single answer. –  Nov 12 '19 at 03:03

1 Answer


This should isolate the flag to the local group, using the scoped form (?flags:...), which is available since Python 3.6:

^(?i:[a-z]+)(\s[a-z]+)?

It seems to work in Python 3.7.3:

Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 21:26:53) [MSC v.1916 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.match(r'^(?i:[a-z]+)(\s[a-z]+)?','Hello There').group()
'Hello'
>>>
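
As a rough side-by-side check, here is a minimal sketch comparing the scoped form with a global (?i) on the question's string. The variable names scoped and global_flag are only illustrative, and the global flag is written at the very start of the pattern, since newer Python versions reject it anywhere else.

```python
import re

string = 'Hello There'

# (?i:...) makes only the enclosed [a-z]+ case-insensitive (Python 3.6+).
# It is a non-capturing group, so group 1 is the optional (\s[a-z]+) part.
scoped = re.search(r'^(?i:[a-z]+)(\s[a-z]+)?', string)

# A global (?i) applies to the whole pattern, so the second group matches too.
global_flag = re.search(r'(?i)^([a-z]+)(\s[a-z]+)?', string)

print(scoped.group())       # Hello
print(scoped.group(1))      # None (" There" is rejected: 'T' is uppercase)
print(global_flag.group())  # Hello There
```

Because the scoped group does not capture, wrap it in an extra pair of parentheses if the case-insensitive part is needed as its own group.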

But it has a problem with the Python flavor on regex101.com (whatever version that is):

https://regex101.com/r/JZ79ul/1

Note also this scenario, where nested scoped modifiers turn case-insensitivity
on and off within a single pattern, which can overcome Python's shortcomings:

 >>> import re
 >>> re.search(r"(?-i:ab(?i:cd)ef(?i:gh))", "abCDeFgh abCDefGH")
 <re.Match object; span=(9, 17), match='abCDefGH'>
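
The reverse direction can be sketched the same way: a global (?i) at the start of the pattern, with (?-i:...) switching it back off for one group. The pattern and sample strings below are made up for illustration, and this assumes Python 3.6 or later.

```python
import re

# The whole pattern is case-insensitive, except inside the (?-i:...) group.
pattern = re.compile(r"(?i)hello (?-i:There)")

print(bool(pattern.search("HELLO There")))  # True
print(bool(pattern.search("HELLO there")))  # False: 'there' must match exactly
```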

There is more to the story, of course, which apparently isn't addressed
anywhere on SO.
It seems a waste of time to tell it here, in a post that is now buried.