I have a DataFrame with one row like this:
en | ko |
---|---|
Neonatal(disseminated) listeriosis(P37.2) | 신생아(파종성) 리스테리아증(P37.2) |
I want this result:
en | ko |
---|---|
Neonatal listeriosis | 신생아 리스테리아증 |
or
en | ko |
---|---|
Neonatal(disseminated) listeriosis | 신생아(파종성) 리스테리아증 |
I used the following regex:
regex = r"\(.*\)|\s-\s.*"
df['en'] = df['en'].str.replace(regex, '', regex=True)
df['ko'] = df['ko'].str.replace(regex, '', regex=True)
but the result is:
en | ko |
---|---|
Neonatal | 신생아 |
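My guess is that the pattern matches from the first opening parenthesis (the one before "disseminated") all the way to the last closing parenthesis at the very end, and deletes everything in between. I want to delete only the code (P37.2) at the end. How can I do that?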
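
For reference, here is a minimal sketch of one possible fix, shown on plain strings with the standard `re` module. It assumes the parentheses are never nested. The names `TRAILING_CODE`, `ANY_PARENS`, `strip_trailing_code`, and `strip_all_parens` are made up for illustration, and the `\s-\s.*` branch of the original pattern is left out for brevity. The idea is to replace the greedy `.*` with `[^()]*`, which cannot run across another parenthesis, and to anchor the match with `$` when only the trailing code should go.

```python
import re

# One parenthesised group at the very end of the string.
# [^()]* cannot cross another parenthesis, so "(disseminated)" is left alone.
TRAILING_CODE = re.compile(r"\s*\([^()]*\)\s*$")

# Every parenthesised group, matched one at a time (assumes no nesting).
ANY_PARENS = re.compile(r"\s*\([^()]*\)")


def strip_trailing_code(text):
    """Remove only the last parenthesised group, e.g. "(P37.2)"."""
    return TRAILING_CODE.sub("", text)


def strip_all_parens(text):
    """Remove every parenthesised group."""
    return ANY_PARENS.sub("", text)


en = "Neonatal(disseminated) listeriosis(P37.2)"
ko = "신생아(파종성) 리스테리아증(P37.2)"

print(strip_trailing_code(en))  # Neonatal(disseminated) listeriosis
print(strip_trailing_code(ko))  # 신생아(파종성) 리스테리아증
print(strip_all_parens(en))     # Neonatal listeriosis
print(strip_all_parens(ko))     # 신생아 리스테리아증
```

On a DataFrame column the same patterns should work with `df['en'].str.replace(r"\s*\([^()]*\)\s*$", "", regex=True)`, and likewise for `df['ko']`.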