
I have been trying to figure this out but my limited regex knowledge is getting in the way. I am wondering if we can use regex/python to remove alphanumeric strings from a given URL. These alphanumeric strings will only contain a through f and 0 to 9. For example:

/cab/user/core1/bdc49fd8/bd77de6ce

I want to use regex to get:

/cab/user/core1

I have this working for removing the last alphanumeric string, but it fails when there is more than one in the same URL:

import re
print(re.sub(r'\/[a-f0-9]*$', ' ', "/cab/user/core1/bdc49fd8"))

results in:

/cab/user/core1 

but:

import re
print(re.sub(r'\/[a-f0-9]*$', ' ', "/cab/user/core1/bdc49fd8/bd77de6ce"))

results in:

/cab/user/core1/bdc49fd8 

Is there a way to remove all occurrences of the specific alphanumeric pattern from the URL?

sameer

1 Answer


You may use

import re
print(re.sub(r'(?:/[a-f0-9]*)+$', '', "/cab/user/core1/bdc49fd8/bd77de6ce"))
# => /cab/user/core1

See the Python demo and the regex demo.

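The (?:/[a-f0-9]*)+$ pattern matches one or more repetitions of the non-capturing group, that is, a / followed by zero or more characters from a to f or 0 to 9, and requires the end of the string to follow immediately. Because the whole run of trailing segments is matched at once, a single re.sub call removes all of them.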
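
To make the behaviour easier to check, here is a minimal sketch that wraps the same pattern in a helper and runs it on a few paths. The names TRAILING_HEX and strip_trailing_hex are made up for this illustration and are not part of the re module.

```python
import re

# Same pattern as above, compiled once: a run of one or more "/segment"
# pieces made only of a-f and 0-9, anchored at the end of the string.
# TRAILING_HEX and strip_trailing_hex are hypothetical names for this sketch.
TRAILING_HEX = re.compile(r'(?:/[a-f0-9]*)+$')


def strip_trailing_hex(path):
    """Return path with every trailing hex-like segment removed."""
    return TRAILING_HEX.sub('', path)


for p in ("/cab/user/core1/bdc49fd8/bd77de6ce",
          "/cab/user/core1/bdc49fd8",
          "/cab/user/core1"):
    print(strip_trailing_hex(p))
# Each of the three lines printed is /cab/user/core1
```

Note that cab in the middle of the path is left alone even though it consists only of a to f, because the match has to reach the end of the string. The flip side is that an ordinary final segment made only of those characters (for example /user/cab) would be removed too; if that is a concern, requiring a minimum length such as [a-f0-9]{8,} is one possible tightening, assuming the IDs are always at least that long.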

Wiktor Stribiżew