Creating new row values within same pandas dataframe based on difference of other row values

Question

Below is an existing df

           data = np.array([['','Market','Product Code','Week','Sales','Units'],
            ['Total Customers',123,1,500,400],
            ['Total Customers',123,2,400,320],
            ['Major Customer 1',123,1,100,220],
            ['Major Customer 1',123,2,230,230],
            ['Major Customer 2',123,1,130,30],
            ['Major Customer 2',123,2,20,10],
            ['Total Customers',456,1,500,400],
            ['Total Customers',456,2,400,320],
            ['Major Customer 1',456,1,100,220],
            ['Major Customer 1',456,2,230,230],
            ['Major Customer 2',456,1,130,30],
            ['Major Customer 2',456,2,20,10]])

            df =pd.DataFrame(data)

I wish to create new rows based on the value difference between the row value in the 'Market' column (Total Customers) and the row values in the 'Market' Column (Major Customer 1 + Major Customer 2). I wish to assign the new row value in 'Market' Column as 'Remaining Customers' and append within the same df.

Overall, I'm basically trying to work out the remaining Sales and Unit 'Gap' of the market

This is what I have tried so far using loc but I keep getting a key error. Can anyone help?

 df.loc[df['Market'] == 'Remaining Customers'] =          
                        df.loc[df['Market'] == 'Total Customers']-
                        (df.loc[df['Market'] == 'Major Customer 1']+df.loc[df['Market'] == 'Major Customer 2'])

Please provide an output table as well. Also, please refer to this SO post and provide the df in a suitable format: https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples — YOLO, Feb 27 '18 at 02:10

score 1 · Accepted Answer · answered Feb 27 '18 at 02:28

1

Please see this notebook for more details. https://nbviewer.jupyter.org/github/emican86/48999037/blob/master/48999037.ipynb

.loc is primarily label based. Data had to be aligned and labels set.

answered Feb 27 '18 at 02:28

emican

256
1
6

Thanks for sending this through but the code provided didn't work as intended. In Step '4c' subtracting the differences didn't work for me. In Step 4c, you mention that it should work because the indexes align -how can I check if the indexes are aligned? – jwlon81 Mar 05 '18 at 23:26

Creating new row values within same pandas dataframe based on difference of other row values

1 Answers1