0

I'm working in python, and have a dataframe(EAD), which includes the column 'CLIENT'. There is also another column called NPV, which is made up of numbers.

      CLIENT                      NPV          
0     1                            50                

1     1                            90                   

2     2                            130                     

3     2                            -11   

4     2                            57

5     3                            105
...

...

My goal is to group by CLIENT and then sum the NPV's of a client together.

My expected result would be like so:

    CLIENT                         NPV                       
0     1                            140                  

1     2                            176                   

2     3                            105                 
    ...

I did this by using a groupby function, like so:

 EAD_CLIENT = EAD.groupby('COUNTERPARTYNAME')['NPV'].sum().reset_index()

However, when I did so, I got strange output NPV's that weren't correct math.

Output:

    CLIENT                         NPV                       
0     1                            140                  

1     2                            27                   

2     3                            -150                 
    ...

Now this is an example, and a tiny portion of my data, but it should illustrate my point. It's clear that for some reason the NPV's are simply not summing correctly. I next tried this:

EAD['SUM(NPV)'] = EAD.groupby('COUNTERPARTYNAME')['NPV'].transform('sum')

But that gave me the same result. I thought perhaps 'NPV' was still a string, so I tried several ways of turning it into an integer.

I added a line above EAD['SUM(NPV)'] that said:

EAD['NPV']=int(EAD['NPV']

but it claimed that TypeError: cannot convert the series to

I also tried:

EAD['NPV']=EAD['NPV'].astype(int).

That didnt work either however, and I'm really struggling to understand what I'm doing wrong.

TLDR; I can't figure out how to get Python to do the math right.

Mad Physicist
  • 107,652
  • 25
  • 181
  • 264
throway172
  • 123
  • 2
  • 4
  • 12
  • 2
    `groupby('COUNTERPARTYNAME')`? Either you're asking something different, or you made a typo is what it looks like. – Mad Physicist Aug 14 '18 at 17:03
  • `EAD['NPV']=int(EAD['NPV']` is missing a closing parens. – Mad Physicist Aug 14 '18 at 17:04
  • I can't reproduce your issue because COUNTERPARTYNAME is not a column here. If you meant CLIENT, well that works. So, you might want to try `df['NPV'] = pd.to_numeric(df['NPV'], errors='coerce')` and try again. _That's_ how you convert to numeric. – cs95 Aug 14 '18 at 17:04

0 Answers0