So, I have the following two dataframes and my ideal output is to get open_orders reduced by cancel_orders so I know how many open_orders I have.
Desired Output:
df_total_orders
order_id business_symbol open_orders
0 a1b2c3111111 AA 0.0
1 4kl3l2242244 AAA 0.0
2 21312a224144 BBB 0.0
3 124f32rv2323 CCC 0.0
4 2412dfe21edf DDD 0.0
5555
What I have now:
df_add_orders (this dataframe contains duplicates in the order id column)
order_id business_symbol open_orders
0 a1b2c3111111 AA 100.0
1 4kl3l2242244 AAA 1000.0
2 21312a224144 BBB 1000.0
3 124f32rv2323 CCC 1000.0
4 2412dfe21edf DDD 1000.0
10000
df_cancel_orders (this dataframe contains duplicates in the order id column)
order_id cancel_orders
0 a1b2c3111111 100.0
1 4kl3l2242244 1000.0
2 21312a224144 1000.0
3 124f32rv2323 1000.0
4 2412dfe21edf 1000.0
4000
I am using the following group by to get the total open orders, but it's removing my symbols with the output.
df_add_orders_group = df_add_orders.groupby(['order_id'], as_index=False)['open_orders'].sum()
df_add_orders_group
order_id open_orders
0 a1b2c3111111 110.0
1 4kl3l2242244 1200.0
2 21312a224144 1500.0
3 124f32rv2323 1500.0
4 2412dfe21edf 1500.0
5000
I am then subtracting closed orders
df_cancel_orders_group = df_cancel_orders.groupby(['order_id'], as_index=False)['cancel_orders'].sum()
but I want to keep the symbols so I can compare the total open orders and I need some way to merge the result of the subtraction with the main df_add_orders and also clean up order_id to sum duplicates.