I've been struggling with this for hours now and just can't seem to get my head around regex for some reason.
I'm looking through the strings below line by line using this pattern:
pattern = re.compile(r"^[^&,]*")
The strings are kept in a list of dictionaries, so I'm looping over them like this:
for dct in lst:
    print(re.search(pattern, dct['artist']).group(0))
"""
Drake
Post Malone Featuring Ty Dolla $ign
BlocBoy JB Featuring Drake
Offset & Metro Boomin
Jay Rock, Kendrick Lamar, Future & James Blake
"""
The above gives me this as expected:
"""
Drake
Post Malone Featuring Ty Dolla $ign
BlocBoy JB Featuring Drake
Offset
Jay Rock
"""
But I can't figure out how to make it also stop at the string "Featuring". I've tried about 100 different variations of \bFeaturing\b, with capital B, with different tokens in front and behind, and at different positions in the regex.
This is the closest I've gotten, but then it only matches the lines that have "Featuring":
pattern = re.compile(r"^[^&,]*(?=\bFeaturing\b)")
This gives me this output:
None
<_sre.SRE_Match object; span=(0, 12), match='Post Malone '>
<_sre.SRE_Match object; span=(0, 11), match='BlocBoy JB '>
None
<_sre.SRE_Match object; span=(0, 12), match='Post Malone '>
None
I'm fairly new to this so most of what I'm doing is trial and error, but I'm on the verge of giving up. Please help me get a result like this:
"""
Drake
Post Malone
BlocBoy JB
Offset
Jay Rock
"""