I know capturing a date is usually a simple enough RegEx task, but I need this to be so specific that I'm struggling.
1 SUSTAINABLE HARVEST SECTOR | QUOTA LISTING JUN 11 2013
2 QUOTA
3 TRADE ID AVAILABLE STOCK AMOUNT PRICE
4 130196 COD GBW 10,000 $0.60
5 130158 HADDOCK GBE 300 $0.60
That is what the beginning of my Excel spreadsheet looks like, and hundreds more look the same, with the date and the data changing but the format staying the same.
My thoughts were to capture everything that follows LISTING up until the newline... then place the non numbers (JUN) in my Trade Month column, place the first captured number (11) in my Trade Day column, and place the last captured number (2013) in my Trade Year column... but I can't figure out how to. Here's what I have so far:
import re
import pandas as pd

pattern = re.compile(r'Listing(.+?)(?=\n)')
df = pd.read_excel(file_path)
print("df is:", df)
a = pattern.findall(str(df))
print("a:", a)
but that returns nothing. Any help solving this problem, which I know is probably super simple, is appreciated. Thanks.
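For reference, the capture described above (month, day, and year following LISTING) might be sketched like this. It is only an illustration under a few assumptions: the header text is available as a plain string, the date always has the form of a month name, a 1-2 digit day, and a 4-digit year, and the names DATE_AFTER_LISTING and parse_listing_date are hypothetical, not from any library.

```python
import re

# Hypothetical pattern: month name, day, and year that follow the word LISTING.
# re.IGNORECASE lets it match "LISTING", "Listing", or "listing".
DATE_AFTER_LISTING = re.compile(
    r"LISTING\s+([A-Za-z]+)\s+(\d{1,2})\s+(\d{4})",
    re.IGNORECASE,
)


def parse_listing_date(text):
    """Return (month, day, year) found after LISTING, or None if absent."""
    match = DATE_AFTER_LISTING.search(text)
    if match is None:
        return None
    month, day, year = match.groups()
    # Month stays a string; day and year are converted to integers.
    return month.upper(), int(day), int(year)


# Sample header copied from the spreadsheet excerpt in the question.
header = "SUSTAINABLE HARVEST SECTOR | QUOTA LISTING JUN 11 2013"
print(parse_listing_date(header))  # ('JUN', 11, 2013)
```

Two things are worth noting about the attempt in the question. Python's re module is case-sensitive by default, so the pattern Listing does not match the upper-case LISTING in the sheet unless re.IGNORECASE is passed or the pattern is written in upper case. Also, pd.read_excel treats the first row as column headers by default, so the title line may end up in df.columns rather than in the data; passing header=None keeps it as the first data row. The three returned values could then be assigned to the Trade Month, Trade Day, and Trade Year columns.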