I am trying to remove a tag from a string. The tag could be any of the examples listed below: the tag name can be any mix of uppercase and lowercase letters, it ends with either a number or a letter, and there may or may not be a space between the name and that suffix.
- (Disk 1)
- (Disk 5)
- (Disc2)
- (Disk 10)
- (Part A)
- (Pt B)
- (disk a)
- (CD 7)
- (cD X)
I already have a method to get the beginning of the tag, "(type":
multi_disk_search = [ '(disk', '(disc', '(part', '(pt', '(prt' ]
if any(mds in fileName.lower() for mds in multi_disk_search): #https://stackoverflow.com/a/3389611
    for mds in multi_disk_search:
        if mds in fileName.lower():
            print(mds)
            break
That prints (disc, for example.
I cannot just split on the parentheses because there could be other tags in other parentheses. There is also no specific order to the tags. The one I am searching for is typically last, but often it is not.
I think the solution will require regex, but I'm really lost when it comes to that.
I tried this, but it returns something that doesn't make any sense to me.
regex = re.compile(r"\s*\%s\s*" % (mds), flags=re.I) #https://stackoverflow.com/a/20782251/11214013
regex.split(fileName)
newName = regex
print(newName)
Which returns re.compile('\\s*\\(disc\\s*', re.IGNORECASE)
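For reference, here is a minimal sketch of what that snippet seems to be doing, assuming a made-up file name (`file_name` and `parts` are names chosen for illustration). `regex.split()` returns a new list and leaves `regex` untouched, so printing `regex` shows the compiled pattern object rather than the split result:

```python
import re

file_name = "Some Album (Live) (Disc2)"  # hypothetical example name
mds = "(disc"

# Same pattern as above: the backslash before %s escapes the "(" in mds.
regex = re.compile(r"\s*\%s\s*" % mds, flags=re.I)

# split() returns a list of the pieces around each match.
parts = regex.split(file_name)

print(regex)  # the compiled pattern object, not the result
print(parts)  # ['Some Album (Live)', '2)']
```

Note that the pattern only covers the opening "(disc", so the "2)" suffix is left behind.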
What are some ways to solve this?
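One possible direction, sketched under some assumptions: the suffix is taken to be either a run of digits or a single letter, "cd" is added to the tag names to cover the (CD 7) examples, and the names `TAG_NAMES`, `MULTI_DISK_RE`, and `strip_disk_tag` are made up for illustration. With those assumptions, `re.sub` can remove the whole tag, closing parenthesis included, wherever it appears in the string:

```python
import re

# Tag names to strip; "cd" is an assumed addition to cover "(CD 7)".
TAG_NAMES = ["disk", "disc", "part", "pt", "prt", "cd"]

# Optional leading whitespace, "(", a tag name, optional whitespace,
# then either a run of digits or a single letter, then ")".
MULTI_DISK_RE = re.compile(
    r"\s*\((?:%s)\s*(?:\d+|[a-z])\)" % "|".join(TAG_NAMES),
    flags=re.IGNORECASE,
)

def strip_disk_tag(file_name):
    """Return file_name with any multi-disk tag removed."""
    return MULTI_DISK_RE.sub("", file_name)

for name in ["Album (Live) (Disc2)", "Album (cD X) (Remastered)", "Album (Part A)"]:
    print(strip_disk_tag(name))
```

Other parenthesized tags such as (Live) are left alone because they do not start with one of the tag names. One caveat of this sketch: since a single letter may follow the tag name with no space, a tag like (Disco) would also match, so the suffix part may need tightening for real data.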