I have the following dataframe. Code to recreate it:
import numpy as np
import pandas as pd

input_lst = [[27141, 0, 0, 2081.39, np.nan, np.nan, '31/05/2025', '31/03/2021'],
             [26142, 401.04, 1934.52, 0, np.nan, np.nan, '01/04/2021', '20/11/2009'],
             [27748, 0, 0, 266.09, np.nan, np.nan, '18/01/2011', '30/04/2005'],
             [26742, 0, 990.48, 0, np.nan, np.nan, '21/06/2011', '27/06/2008'],
             [27564, 0, 1173.24, 466.33, np.nan, np.nan, '10/06/2004', '31/12/2004']]
input_headers = ['Ref', 'ABC', 'DEF', 'GHI', 'JKL', 'MNO', 'Commence Date 1', 'Commence Date 2']
test_df = pd.DataFrame(input_lst, columns=input_headers)
I want to reshape the dataframe (or do something similar) so that the result looks like the dataframe built by this code:
res_lst = [[27141, 2081.39, 'GHI', '31/03/2021'],
           [26142, 401.04, 'ABC', '01/04/2021'],
           [26142, 1934.52, 'DEF', '01/04/2021'],
           [27748, 266.09, 'GHI', '30/04/2005'],
           [26742, 990.48, 'DEF', '21/06/2011'],
           [27564, 1173.24, 'DEF', '10/06/2004'],
           [27564, 466.33, 'GHI', '31/12/2004']]
res_headers = ['Ref', 'Amount', 'Type', 'Commence Date']
result_df = pd.DataFrame(res_lst, columns=res_headers)
Each 'Ref' row can have amounts in up to five columns of the original dataframe (ABC, DEF, GHI, JKL, MNO). Each non-zero amount should become its own row in the output, with the figure in an 'Amount' column and the name of the column it came from in a 'Type' column; zero and NaN entries are dropped, as in the expected result. Each output row also needs a 'Commence Date' chosen by type: if 'Type' is ABC or DEF, use the value from 'Commence Date 1'; if it is GHI, JKL or MNO, use 'Commence Date 2'.
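
To make the requirement concrete, here is a rough sketch of one way it might be done with `melt` and `np.where`. It is not necessarily the best approach. It assumes that zero and NaN amounts should be dropped and that rows should stay in the original 'Ref' order; the names `amount_cols`, `date1_types`, `long_df` and `out_df` are only illustrative.

```python
import numpy as np
import pandas as pd

input_lst = [[27141, 0, 0, 2081.39, np.nan, np.nan, '31/05/2025', '31/03/2021'],
             [26142, 401.04, 1934.52, 0, np.nan, np.nan, '01/04/2021', '20/11/2009'],
             [27748, 0, 0, 266.09, np.nan, np.nan, '18/01/2011', '30/04/2005'],
             [26742, 0, 990.48, 0, np.nan, np.nan, '21/06/2011', '27/06/2008'],
             [27564, 0, 1173.24, 466.33, np.nan, np.nan, '10/06/2004', '31/12/2004']]
input_headers = ['Ref', 'ABC', 'DEF', 'GHI', 'JKL', 'MNO', 'Commence Date 1', 'Commence Date 2']
test_df = pd.DataFrame(input_lst, columns=input_headers)

amount_cols = ['ABC', 'DEF', 'GHI', 'JKL', 'MNO']
date1_types = ['ABC', 'DEF']  # types that take 'Commence Date 1'

# Wide to long: one row per (Ref, Type) pair. ignore_index=False keeps the
# original row index so the original Ref order can be restored afterwards.
long_df = test_df.melt(
    id_vars=['Ref', 'Commence Date 1', 'Commence Date 2'],
    value_vars=amount_cols,
    var_name='Type',
    value_name='Amount',
    ignore_index=False,
)

# Assumption: zero and NaN amounts are not wanted in the output.
long_df = long_df[long_df['Amount'].notna() & (long_df['Amount'] != 0)].copy()

# Pick the commence date according to the type.
long_df['Commence Date'] = np.where(
    long_df['Type'].isin(date1_types),
    long_df['Commence Date 1'],
    long_df['Commence Date 2'],
)

# A stable sort on the original index groups rows by Ref again while keeping
# the ABC, DEF, GHI, ... order within each Ref.
out_df = (
    long_df.sort_index(kind='mergesort')
    [['Ref', 'Amount', 'Type', 'Commence Date']]
    .reset_index(drop=True)
)
print(out_df)
```

The dates are left as strings here; if they need to be real dates, `pd.to_datetime(..., dayfirst=True)` could be applied to 'Commence Date' afterwards.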