I have the following DataFrame:
id-input  id-output  Date        Price  Type
1         3          20/09/2020  100    ABC
2         1          20/09/2020  200    ABC
2         1          21/09/2020  300    ABC
1         3          21/09/2020  50     AD
1         2          21/09/2020  40     AD
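For reproducibility, the table above can be built like this. It is a minimal sketch: the dates are kept as plain strings exactly as shown, which is an assumption about the real column dtype.

```python
import pandas as pd

# Sample data copied from the table above; Date is kept as a string (assumption).
df = pd.DataFrame(
    {
        "id-input": [1, 2, 2, 1, 1],
        "id-output": [3, 1, 1, 3, 2],
        "Date": ["20/09/2020", "20/09/2020", "21/09/2020",
                 "21/09/2020", "21/09/2020"],
        "Price": [100, 200, 300, 50, 40],
        "Type": ["ABC", "ABC", "ABC", "AD", "AD"],
    }
)
print(df)
```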
I want to get this output:
id-inp-ABC  id-out-ABC  Date-ABC    Price-ABC  Type-ABC  id-inp-AD  id-out-AD  Date-AD     Price-AD  Type-AD
1           3           20/09/2020  10         ABC       2          1          20/09/2020  10        AD
1'          3           20/09/2020  90         ABC       NaN        NaN        NaN         NaN       NaN
2           1           20/09/2020  40         ABC       1          2          21/09/2020  40        AD
2'          1           20/09/2020  160        ABC       NaN        NaN        NaN         NaN       NaN
2           1           21/09/2020  300        ABC       NaN        NaN        NaN         NaN       NaN
My idea is to:
- split the DataFrame into two DataFrames by Type
- iterate through both DataFrames and check whether the id-input of an ABC row equals the id-output of an AD row
- check whether the prices are equal; if not, split the row and subtract the price
- rename the columns and merge the two DataFrames
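To make the price-splitting step concrete, here is a small sketch. `split_price` is a hypothetical helper, not existing code, and it assumes the matched part is the smaller of the two prices, which is how I read the expected rows above (100 against 10 gives 10 and 90; 200 against 40 gives 40 and 160). What should happen when the AD price is larger than the ABC price is not covered by my example, so the sketch simply leaves a remainder of 0 in that case.

```python
def split_price(abc_price, ad_price):
    """Hypothetical helper: split an ABC price against a matching AD price.

    Returns (matched, remainder). Assumes the matched part is the smaller
    of the two prices and the remainder stays on the ABC side.
    """
    matched = min(abc_price, ad_price)
    remainder = abc_price - matched
    return matched, remainder


print(split_price(100, 10))   # (10, 90)
print(split_price(200, 40))   # (40, 160)
```

A remainder of 0 would mean the prices are equal and no extra row is needed.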
My attempt so far:

import pandas as pd

# Split the DataFrame into one frame per Type
grp = df.groupby('Type')
transformed_df_list = []
for idx, frame in grp:
    frame.reset_index(drop=True, inplace=True)
    transformed_df_list.append(frame.copy())

ABC = pd.DataFrame(transformed_df_list[0])
AD = pd.DataFrame(transformed_df_list[1])

# Look for AD rows whose id-output equals the id-input of an ABC row
for i, row in ABC.iterrows():
    for j, row1 in AD.iterrows():
        if row['id-input'] == row1['id-output']:
            # Build a one-row frame from the matching AD row and rename its columns
            row_df = pd.DataFrame([row1])
            row_df = row_df.rename(columns={'id-input': 'id-inp-AD', 'id-output': 'id-out-AD', 'Date': 'Date-AD', 'Price': 'Price-AD', 'Type': 'Type-AD'})

# Left-merge the ABC rows with the matched AD row
output = pd.merge(ABC, row_df, how='left', left_on=['id-input'], right_on=['id-inp-AD'])
But the result is NaN in the id-inp-AD, id-out-AD, Date-AD, Price-AD and Type-AD columns, and row_df contains just the last row:
1 2 21/09/2020 40 A
I also want the iteration to respect the row order, and each row inserted into the output DataFrame to be sorted by date.
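For the sorting requirement, one possible approach is a stable sort on the parsed dates, so rows sharing a date keep their insertion order. This is only a sketch under assumptions: the frame `out` below is an illustrative stand-in for the real output, and the dates are assumed to be day-first strings in `%d/%m/%Y` format.

```python
import pandas as pd

# Illustrative stand-in for the output frame (rows deliberately out of order).
out = pd.DataFrame(
    {
        "Date-ABC": ["21/09/2020", "20/09/2020", "20/09/2020"],
        "Price-ABC": [300, 10, 90],
    }
)

# Sort on the parsed dates; kind="stable" keeps the original order of ties.
out_sorted = out.sort_values(
    "Date-ABC",
    key=lambda s: pd.to_datetime(s, format="%d/%m/%Y"),
    kind="stable",
    ignore_index=True,
)
print(out_sorted)
```

Sorting the raw strings would compare them character by character, which breaks as soon as the dates span more than one month, so the sketch parses them first.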