1

Here's a starting DataFrame:

ipdb> df[["line_amount","modifiedAmount"]]
   line_amount modifiedAmount
0        30.00               
1         2.88           2.88
2       199.20          199.2
3      -105.00           -104
4       150.00            150
5        75.00               
6      -450.00           -450
7        16.13          16.13
8        20.00               
9       111.99         111.99

What I want is a new column of data (or really to replace the modifiedAmount column with one) that contains "" in cases where the original modifiedAmount was EITHER:

  • already "" OR
  • equal to line_amount

I'm having such trouble figuring out how to accomplish what I'd've expected to be very easy!

I can get this:

ipdb> equal_test = df.modifiedAmount == df.line_amount
ipdb> blank_test = df.modifiedAmount == ""

but I can't do this:

ipdb> blank_test and equal_test
*** ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I see this option when I want to apply a scalar result, but I couldn't figure out how to put df into lambda like this:

ipdb> df.modifiedAmount.apply(lambda x: "" if x == df.line_amount else x)
*** NameError: global name 'df' is not defined

Any ideas?

The desired result look like this:

ipdb> df[["line_amount","modifiedAmount"]]
   line_amount modifiedAmount
0        30.00               
1         2.88         
2       199.20         
3      -105.00         -104.00
4       150.00         
5        75.00               
6      -450.00         
7        16.13         
8        20.00               
9       111.99         

(yes, ideally I want to cast any remaining values to a float to two decimal places)

Community
  • 1
  • 1
HaPsantran
  • 5,581
  • 6
  • 24
  • 39

2 Answers2

0

you can use apply column-wise on the whole dataframe.

import pandas as pd
import numpy as np

Create some dummy data and put it in a dataframe. I used np.nan instead of "".

df =pd.DataFrame( { 'lineAmount':[30.00,2.88,199.20,-105.00,150.00,75.00,-450.00,16.13,20.00,111.99], \
                'modifiedAmount':[np.nan,2.88,199.20,-104.00,150.00,np.nan,-450.00,16.13,np.nan,111.99]})

Then you can use a lamda function of the whole dataframe, column-wise using the axis=1 parameter of apply():

df['modifiedAmount'] =df.apply(lambda x: np.nan if x.modifiedAmount == x.lineAmount else x.modifiedAmount, axis =1)

output:

    lineAmount  modifiedAmount
0   30.00       NaN
1   2.88        NaN
2   199.20      NaN
3   -105.00     -104
4   150.00      NaN
5   75.00       NaN
6   -450.00     NaN
7   16.13       NaN
8   20.00       NaN
9   111.99      NaN
JAB
  • 12,401
  • 6
  • 45
  • 50
0

Building the dataset. Note that i've entered all numbers as float (they seem to be strings in your code)

import pandas as pd
s_dict = {'line_amount': [30, 2.88, 199.2, -105, 150, 75, -450, 16.13, 20, 111.99], 'modifiedAmount': [None,2.88,199.2,-104, 150, None, -450, 16.13, None, 111.99]}
df = pd.DataFrame.from_dict(s_dict)
print df

output :

   line_amount  modifiedAmount
0        30.00             NaN
1         2.88            2.88
2       199.20          199.20
3      -105.00         -104.00
4       150.00          150.00
5        75.00             NaN
6      -450.00         -450.00
7        16.13           16.13
8        20.00             NaN
9       111.99          111.99

This line requires a bit of explanation. Here we are using boolean masks (df.modifiedAmount == df.line_amount and pd.isnull(df.modifiedAmount)), separated by a | (OR comparison), and the ~ at the beginning means NOT.

df['new_mod'] = df.loc[~((df.modifiedAmount == df.line_amount) | (pd.isnull(df.modifiedAmount))), 'modifiedAmount']
print df

output :

   line_amount  modifiedAmount  new_mod
0        30.00             NaN      NaN
1         2.88            2.88      NaN
2       199.20          199.20      NaN
3      -105.00         -104.00     -104
4       150.00          150.00      NaN
5        75.00             NaN      NaN
6      -450.00         -450.00      NaN
7        16.13           16.13      NaN
8        20.00             NaN      NaN
9       111.99          111.99      NaN
knightofni
  • 1,906
  • 3
  • 17
  • 22