I have a sample DataFrame as follows:
track_id | track_date | status | status_info |
---|---|---|---|
track_1 | 2021-01-01 | approved | None |
track_2 | 2021-01-02 | None | accredited |
track_3 | 2021-01-03 | approved | accredited |
track_4 | 2021-01-04 | approved | approved |
track_5 | 2021-01-05 | approved\|approved | accredited\|cancelled |
track_6 | 2021-01-06 | None | accredited\|cancelled |
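For reproducibility, the sample can be built roughly as follows. This is only a sketch: I am assuming the empty cells are Python `None` and that the dates are stored as plain strings, which may not match the real data.

```python
import pandas as pd

# Sample data from the table above.
# Assumptions: missing cells are None, dates are plain strings.
df = pd.DataFrame({
    "track_id": ["track_1", "track_2", "track_3", "track_4", "track_5", "track_6"],
    "track_date": ["2021-01-01", "2021-01-02", "2021-01-03",
                   "2021-01-04", "2021-01-05", "2021-01-06"],
    "status": ["approved", None, "approved", "approved",
               "approved|approved", None],
    "status_info": [None, "accredited", "accredited", "approved",
                    "accredited|cancelled", "accredited|cancelled"],
})

print(df)
```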
I need to split both `status` and `status_info` into rows, so that the output looks like this:
track_id | track_date | status | status_info |
---|---|---|---|
track_1 | 2021-01-01 | approved | None |
track_2 | 2021-01-02 | None | accredited |
track_3 | 2021-01-03 | approved | accredited |
track_4 | 2021-01-04 | approved | approved |
track_5 | 2021-01-05 | approved | accredited |
track_5 | 2021-01-05 | approved | cancelled |
track_6 | 2021-01-06 | None | accredited |
track_6 | 2021-01-06 | None | cancelled |
I have already tried the code below, using an answer to another question as a reference:
# splitting string values into lists
new_status = df['status'].str.split('|', expand=True).stack().reset_index(level=1, drop=True)
new_status_info = df['status_info'].str.split('|', expand=True).stack().reset_index(level=1, drop=True)
# generating a temporary DataFrame to join later (error here)
df_split = pd.concat([new_status, new_status_info], axis=1, keys=['status', 'status_info'])
# then, we join both DataFrames
df.drop(columns=['status','status_info'], axis=1).join(df_split).reset_index(drop=True)
But it gives me a ValueError:
ValueError: cannot reindex from a duplicate axis
When I change `.reset_index(level=1, drop=True)` to `.reset_index(drop=True)` in the split step, the join returns only one of the values for each track instead of the expected two:
track_id | track_date | status | status_info |
---|---|---|---|
track_1 | 2021-01-01 | approved | None |
track_2 | 2021-01-02 | None | accredited |
track_3 | 2021-01-03 | approved | accredited |
track_4 | 2021-01-04 | approved | approved |
track_5 | 2021-01-05 | approved | cancelled |
track_6 | 2021-01-06 | None | accredited |
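A possible direction, offered as a sketch rather than a confirmed fix: the `ValueError` most likely arises because dropping level 1 of the stacked index leaves duplicate labels (the `track_5` row appears twice), and `pd.concat(..., axis=1)` cannot align two Series whose duplicated indexes differ. Keeping both index levels until after the concat avoids that. The frame below re-creates the sample under the assumption that missing cells are `None`; the explicit `dropna()` and `sort_index()` calls are additions of mine so that the result does not depend on whether `stack()` drops missing values or on the order in which `concat` combines the indexes.

```python
import pandas as pd

# Re-create the sample (assumption: missing cells are None).
df = pd.DataFrame({
    "track_id": ["track_1", "track_2", "track_3", "track_4", "track_5", "track_6"],
    "track_date": ["2021-01-01", "2021-01-02", "2021-01-03",
                   "2021-01-04", "2021-01-05", "2021-01-06"],
    "status": ["approved", None, "approved", "approved",
               "approved|approved", None],
    "status_info": [None, "accredited", "accredited", "approved",
                    "accredited|cancelled", "accredited|cancelled"],
})

# Split into one value per (row, position) pair and KEEP both index levels,
# so every label is unique when the two Series are aligned.
new_status = df["status"].str.split("|", expand=True).stack().dropna()
new_status_info = df["status_info"].str.split("|", expand=True).stack().dropna()

# Align on the (row, position) MultiIndex; missing pairs become NaN.
df_split = pd.concat(
    [new_status, new_status_info], axis=1, keys=["status", "status_info"]
).sort_index()

# Only now drop the position level, leaving the original row label
# (possibly repeated) to join on.
df_split = df_split.reset_index(level=1, drop=True)

result = (
    df.drop(columns=["status", "status_info"])
      .join(df_split)
      .reset_index(drop=True)
)

print(result)
```

With the sample data this yields eight rows, two each for `track_5` and `track_6`, with `NaN` where the table above shows `None`. It pairs values by position, so a row where `status` and `status_info` hold different numbers of items would get `NaN` in the shorter column rather than a repeated value.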