import pandas as pd
test = pd.DataFrame({
    "report_number": [['21-10541', '21-6300'], ['20-55506']],
    "date": ['2021-06-03', '2021-06-04'],
})
I have the dataset above. As you can see, the report_number column sometimes contains multiple report numbers in a single row. Using the test
dataframe as input, I want to create an output dataframe like the one below, where each row holds a single report number and the date corresponding to that report number.
Desired output:
| report_number | date |
| --- | --- |
| 21-10541 | 2021-06-03 |
| 21-6300 | 2021-06-03 |
| 20-55506 | 2021-06-04 |
I tried the following:
def f(x):
    if len(x['report_number']) > 1:
        for i in x['report_number']:
            return (i, x['date'])
    else:
        return (x['report_number'], x['date'])
test.apply(f, axis=1)
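But it does not work: the return inside the loop exits on the first element, so for rows with several report numbers I only get the first one back, and the result is a Series of tuples rather than a dataframe.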
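For reference, one possible way to get the desired shape is `DataFrame.explode`, which gives each element of a list-valued column its own row and repeats the other columns. This is only a minimal sketch: it assumes pandas 0.25 or later (where `explode` was introduced), and the name `result` is just an illustrative choice.

```python
import pandas as pd

test = pd.DataFrame({
    "report_number": [["21-10541", "21-6300"], ["20-55506"]],
    "date": ["2021-06-03", "2021-06-04"],
})

# explode() puts each list element on its own row and repeats the
# matching "date" value; reset_index() replaces the duplicated
# original index with a fresh 0..n-1 index.
result = test.explode("report_number").reset_index(drop=True)
print(result)
```

Rows whose list holds a single report number pass through unchanged, so no special-casing by list length should be needed.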