Below is an existing df
data = np.array([['','Market','Product Code','Week','Sales','Units'],
['Total Customers',123,1,500,400],
['Total Customers',123,2,400,320],
['Major Customer 1',123,1,100,220],
['Major Customer 1',123,2,230,230],
['Major Customer 2',123,1,130,30],
['Major Customer 2',123,2,20,10],
['Total Customers',456,1,500,400],
['Total Customers',456,2,400,320],
['Major Customer 1',456,1,100,220],
['Major Customer 1',456,2,230,230],
['Major Customer 2',456,1,130,30],
['Major Customer 2',456,2,20,10]])
df =pd.DataFrame(data)
I wish to create new rows based on the value difference between the row value in the 'Market' column (Total Customers) and the row values in the 'Market' Column (Major Customer 1 + Major Customer 2). I wish to assign the new row value in 'Market' Column as 'Remaining Customers' and append within the same df.
Overall, I'm basically trying to work out the remaining Sales and Unit 'Gap' of the market
This is what I have tried so far using loc but I keep getting a key error. Can anyone help?
df.loc[df['Market'] == 'Remaining Customers'] =
df.loc[df['Market'] == 'Total Customers']-
(df.loc[df['Market'] == 'Major Customer 1']+df.loc[df['Market'] == 'Major Customer 2'])