0

df0 = df.filter(['ShipmentID', 'AccessorialCode', 'Cost'])

    ShipmentID   AccessorialCode     Cost
0      141687945             LNH    68.65
2      141687945             LNH     7.05
3      141687951             LNH   331.13
5      141687951             LNH    20.97
7      141687951             LNH     7.05
8      141687955             LNH    67.45
9      141687955             LNH    31.75
11     141687955             LNH    47.26
13     141687955             LNH     6.91
14     141687964             LNH   203.13
15     141687964             LNH   229.07
18     141687964             LNH     8.94
19     141687969             LNH  1912.80
22     141687969             LNH    28.29
23     141687972             LNH    65.27
24     141687972             LNH    41.28

My goal is to write a logic loop that adds the costs of duplicate ShipmentID's and then create a new dataframe removing the duplicated accessorials.

df00= df0.loc[df0['AccessorialCode'] == 'LNH']
duplicates = df00[df00.duplicated(['ShipmentID'])]
# I then need to preform an operation to eliminate the duplicates and consolidate the cost

 The desired output should look like this:


    ShipmentID    AccessorialCode  Cost
0   141687945      LNH              75.7
2   141687951      LNH              395.15  
3   141687955      LNH              153.37
  • 1
    please post the expected output and what you have tried so far – Vivek Kalyanarangan Sep 11 '20 at 19:57
  • Standby... I'm working on trying to find the right way to go about it. – Adam Zuckerman Sep 11 '20 at 20:05
  • Please, after taking the [tour](https://stackoverflow.com/tour) (you have some rewards for that), please have a look at [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). Then, edit the question adding the *desired output*. – Alexandre B. Sep 11 '20 at 20:06
  • Are you just trying to sum by ShipmentId? – PyNEwbie Sep 12 '20 at 01:19

2 Answers2

0

I think a group by would be what you need.

df0.groupby(['ShipmentID','AccessorialCode']).agg({'Cost':'sum'})
EfehanD
  • 26
  • 4
  • Ive tried that direction. There are 12 different values for accessorial codes. I separated LNH from the rest of the codes because it is the only one that contains different costs for the same shipping ID. My objective is to get one LNH cost for one ShippingID. – Adam Zuckerman Sep 11 '20 at 20:19
0
data = []
for shipment_id in set(df['ShipmentID'].values):
    row = {
        'ShipmentID':shipment_id,
        'AccessorialCode': 'LNH',
        'Cost': sum(df[df['ShipmentID'] == shipment_id]['Cost'].values),
    }
    data.append(row)

df0 = df(data)
    
Moss
  • 357
  • 1
  • 6
  • @AdamZuckerman If the answer solved your problem, instead of posting "Thank you" comments, Stack Overflow has its own format for [marking accepted answers](https://stackoverflow.com/help/accepted-answer). See also help page on [What should I do when someone answers my question?](https://stackoverflow.com/help/someone-answers). – Gino Mempin Sep 12 '20 at 03:50