How can I modify my existing regex so that it removes all leading characters that are either a digit or an underscore?

re.sub('^(\d+|_).*', '', n, flags=re.IGNORECASE)

# test strings
0001_Smoke_B_B
0002_Smoke_B_B
0012_Smoke_B_B
MA103
MA104
00_00MA105

The expected output is:

Smoke_B_B
Smoke_B_B
Smoke_B_B
MA103
MA104
MA105
JokerMartini
  • 1
    maybe you should use `*` without `.` ? – furas Dec 18 '20 at 00:36
  • You could format the data as code that we could simply copy and run, e.g. a list of examples and a `for` loop that tests every string against the regex. – furas Dec 18 '20 at 00:38
  • 1
    Use a character class instead of an alternation, it's more efficient: `re.sub('^[\d_]+', '', n, flags=re.IGNORECASE)` – Nick Dec 18 '20 at 00:38
  • 1
    There's no need for a regex here. `str.lstrip` works fine. – TigerhawkT3 Dec 18 '20 at 01:29
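
The `str.lstrip` suggestion in the last comment can be sketched as below. This is a minimal illustration, not code from the thread: the names `STRIP_CHARS` and `strip_prefix` are made up for the example. One difference to keep in mind is that `string.digits` covers only ASCII `0-9`, while `\d` in a Python 3 `str` pattern also matches other Unicode decimal digits.

```python
import string

# Test strings taken from the question
examples = [
    '0001_Smoke_B_B',
    '0002_Smoke_B_B',
    '0012_Smoke_B_B',
    'MA103',
    'MA104',
    '00_00MA105',
]

# Hypothetical helper names; lstrip treats its argument as a set of characters
STRIP_CHARS = string.digits + '_'

def strip_prefix(name):
    """Remove every leading ASCII digit or underscore from name."""
    return name.lstrip(STRIP_CHARS)

results = [strip_prefix(n) for n in examples]
for before, after in zip(examples, results):
    print(before, '->', after)
```

Because `lstrip` stops at the first character outside the set, digits and underscores later in the string (as in `Smoke_B_B` or `MA103`) are left alone.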

2 Answers

Regex to use for the replacement:

^[\d_]+

^ anchors the match at the beginning of the string

[\d_] is a character class that matches a digit or an underscore

+ repeats the class one or more times
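
As a rough sketch of how this pattern could be applied with `re.sub` (the names `LEADING` and `clean` are hypothetical, chosen only for this example):

```python
import re

# Precompiled pattern: one or more digits/underscores at the start of the string
LEADING = re.compile(r'^[\d_]+')

def clean(name):
    """Return name with its leading digits and underscores removed."""
    return LEADING.sub('', name)

for text in ['0001_Smoke_B_B', 'MA103', '00_00MA105']:
    print(text, '->', clean(text))
```

A raw string (`r'...'`) is used so that `\d` reaches the regex engine unchanged and Python does not warn about an invalid escape sequence.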

abc123

If you remove the `.` from your pattern, the `*` applies to the group instead, so the regex matches the whole leading run of digits and `_` and nothing after it

'^(\d+|_)*'
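
To see why the `.` matters, here is a small comparison, written as an illustration rather than taken from the thread (the variable names `original` and `fixed` are assumptions). In the question's pattern, `.*` consumes the rest of the string once the prefix has matched, so the whole string is replaced.

```python
import re

original = r'^(\d+|_).*'   # pattern from the question
fixed = r'^(\d+|_)*'       # same pattern without the '.'

for text in ['0001_Smoke_B_B', 'MA103']:
    # repr() makes an empty result visible
    print(text, '->', repr(re.sub(original, '', text)), 'vs', repr(re.sub(fixed, '', text)))
```

Strings that do not start with a digit or underscore, such as `MA103`, are untouched by both patterns.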

Testing code.

I also added '^[\d_]+' from the other answer and the comments

import re

# test strings
examples = [
    '0001_Smoke_B_B',
    '0002_Smoke_B_B',
    '0012_Smoke_B_B',
    'MA103',
    'MA104',
    '00_00MA105'
]

# raw strings keep \d intact for the regex engine
for text in examples:
    result = re.sub(r'^(\d+|_)*', '', text, flags=re.IGNORECASE)
    print(text, '->', result)

# example from other answers and comments
# (no + inside the class: within [...] it would be a literal plus sign)
for text in examples:
    result = re.sub(r'^[\d_]+', '', text, flags=re.IGNORECASE)
    print(text, '->', result)
furas