I'm a student working on a data science project, and I need to extract part of the string stored in one column (annotation) of my dataframe.
I want to extract the part HOTHOTVIDEO from a string like "HOTHOTVIDEOHOT0501005107FilmVidéoClub".
So I wrote this instruction using a regex:
facturation['annotation']=facturation['annotation'].str.findall('([A-Z0-9]{3}\d+)').apply(''.join)
It extracts everything correctly, except when I have strings like this one: "CTVCANALVODCTV0200052670CTV0200052670". For that string it returns CTV0200052670CTV0200052670, because findall collects every match and ''.join concatenates them, but I only want the first occurrence: CTV0200052670.
Can someone help me fix this issue, please? :)
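
For reference, here is a minimal sketch of the "first occurrence only" behaviour, using the same pattern and the two strings quoted above. It relies only on Python's re module; the helper name first_code is hypothetical and just for illustration. re.search stops at the first match, whereas findall collects all of them.

```python
import re

# Same pattern as in the question, written as a raw string so that \d
# is passed to the regex engine unchanged.
PATTERN = re.compile(r"[A-Z0-9]{3}\d+")


def first_code(text):
    """Return the first substring matching PATTERN, or None if there is none.

    Hypothetical helper for illustration: re.search stops at the first
    match, unlike findall, which returns every match.
    """
    match = PATTERN.search(text)
    return match.group(0) if match else None


# The two example strings quoted in the question.
samples = [
    "HOTHOTVIDEOHOT0501005107FilmVidéoClub",
    "CTVCANALVODCTV0200052670CTV0200052670",
]

for s in samples:
    print(s, "->", first_code(s))
```

On the dataframe itself, facturation['annotation'].str.extract(r'([A-Z0-9]{3}\d+)', expand=False) should behave the same way, since str.extract keeps only the first match of the pattern in each row (and gives NaN where nothing matches).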