I have a DataFrame in which one column contains strings that look like:
Received value 126;AOC;H3498XX from 602
Received value 101;KYL;0IMMM0432 from 229
I want to drop (or replace with nothing) the part between the second semicolon and "from", so that it looks like:
Received value 126;AOC; from 602
The part I want to drop has a varying, unpredictable length (it is always a combination of A-Z and 0-9). The semicolons and the word "from" will always be there for reference.
I'm trying to use regex by studying this link: https://docs.python.org/3/library/re.html
import re
for row in df['column']:
    row = re.sub(';[A-Z0-9] from', '; from', row)
I think [A-Z0-9] matches only a single character, so it fails to handle the varying length I need.
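For reference, here is a minimal sketch of the behaviour I'm after, run on plain strings rather than on the DataFrame. It assumes that adding a `+` quantifier (one or more characters) to the character class is the right way to cover the varying length; the names `PATTERN`, `strip_code`, `rows` and `cleaned` are made up for illustration.

```python
import re

# Match the second semicolon, the code that follows it, and " from".
# The "+" quantifier means "one or more" of A-Z / 0-9, so the code
# can be any length (assumption: it is never empty).
PATTERN = re.compile(r";[A-Z0-9]+ from")

def strip_code(text):
    """Drop the code between the second semicolon and ' from'."""
    return PATTERN.sub("; from", text)

# The two sample rows from above
rows = [
    "Received value 126;AOC;H3498XX from 602",
    "Received value 101;KYL;0IMMM0432 from 229",
]

cleaned = [strip_code(r) for r in rows]
for line in cleaned:
    print(line)
```

On the DataFrame itself, something like `df['column'] = df['column'].str.replace(r';[A-Z0-9]+ from', '; from', regex=True)` should apply the same substitution to the whole column. As far as I can tell, assigning to `row` inside the `for` loop only rebinds the loop variable and leaves the DataFrame unchanged, so the result has to be assigned back to the column.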