0

I'm looking for pandas/python solution to summarize/group items in an invoice, based on it's main code. please refer the first attached image

each item value (price X qty) which has a MainCode should be added to the the item # total. Look for the Similar color. (Please Note some items may be repeated. eg : Green & Blue ) Qty SHOULD NOT be added. only the Values.

enter image description here

The answer should be as as follows enter image description here

I'm New to Pandas & Since This is a bit Advanced problem, I couldn't come up with any code. hence, posted here. please direct me to solve this...

Things to note

  1. Item combination can be repeated. eg see Green & Blue
  2. Once Done Sub items should be removed (eg item # 371 is removed).
  3. totalamt should equal to the total of individual item's price X Qty
  4. In this example all sub item codes are 371. but it may have several codes. eg 371, 58, etc
  5. Sub Item 371 can be sold separately without a MainCode. In that case, it should not be allocated & it should remain.

UPDATED DATA enter image description here

Sunni
  • 365
  • 1
  • 11

1 Answers1

2

Hope you enjoy it. Next time please provide input data not as images but as text.

import pandas as pd

data_raw = [[260, 1500, 3, 0, 4500], [260, 1500, 1, 0, 1500], [258, 1500, 4, 0, 6000], [1054, 1200, 1, 0, 1200],
[371, 700, 3, 260, 2100], [371, 700, 1, 260, 700], [371, 700, 1, 1054, 700], [371, 700, 4, 258, 2800]]

 data = pd.DataFrame(data_raw, columns=['item', 'price', 'qty','Main code','Total'])

remove_index= []
for index, row in data.iterrows():
     try:
         # find item in Main code
         main_code_data = data.loc[data['Main code'] == row['item']]
         # merge values
         data.at[index, 'Total'] = row['Total'] + row['qty']* (main_code_data['Total'].values[0]/main_code_data['qty'].values[0])
         # get indexes to remove
         for item in main_code_data.index:
            remove_index.append(item)
     except:
         # if no match
         pass

 # remove used lines
 data = data.drop(remove_index)

Output:

   item  price  qty  Main code  Total
0   260   1500    3          0   6600
1   260   1500    1          0   2200
2   258   1500    4          0   8800
3  1054   1200    1          0   1900
Zaraki Kenpachi
  • 5,510
  • 2
  • 15
  • 38
  • Thanks lot @Zaraki. Your answer is awesome. Really appreciate this.. There's one point i want to highlight, I'm not getting the correct Answer when there are duplicate/repetitive situation. refer my example color coded in GREEN & BLUE. (point # 1 in Things to Note section). In this situation, I'm not getting the desired output. how can i solve this? – Sunni May 27 '19 at 18:31
  • @Rapto you need to change index selection for values to select one element and add loop to collect all indexes. Code updated. Enjoy =^..^= – Zaraki Kenpachi Jun 01 '19 at 07:57
  • Thank you for the update. the updated code actually adds the first entry only. in the above example it works perfectly well. coz. both in the GREEN & BLUE lines, Item 260 has the same qty 3. i made a blunder in my data. in the BLUE line, i wanted to mention that the item repeats but qty may differ. example in GREEN line qty 3 for item 260 & in BLUE qty 1. similarly corresponding Item # 371 has 3 & 1 qty respectively. this should be the correct data. – Sunni Jun 01 '19 at 08:48
  • so here, do we need to remove the used lines as and when we add the data? rather than waiting until the loop ends. `# remove used lines` `data = data.drop(remove_index)` – Sunni Jun 01 '19 at 08:48
  • @Rapto Don't know exactly what you need? Provide proper start data and desired output. – Zaraki Kenpachi Jun 01 '19 at 08:57
  • I updated the original question (see UPDATED DATA section). sorry I don't know how to add the text data in proper table format. hence, I attached an image. – Sunni Jun 01 '19 at 09:00
  • Thanks a million..This is what I wanted. Great job.! – Sunni Jun 01 '19 at 09:32
  • Need your assistance on my [new question](https://stackoverflow.com/questions/57629485/speeding-up-pandas-code-by-replacing-iterrows) also mate. – Sunni Aug 23 '19 at 16:19