
I have a dataframe with two columns, and I want to create a third column based on a comparison between them.

The logic is: if column 1 is 3 and column 2 is 4, the new column should be empty.

If column 1 is 3 and column 2 is 2, the new column should be 3.

It's very similar to a previously asked question, but the answer there, which uses np.where(), isn't working for me.

Here's what I tried:

FinalDF['c'] = np.where(FinalDF['a']>FinalDF['b'],[FinalDF['a'],""])

and after that failed I tried to see if maybe it doesn't like the [x,y] I gave it, so I tried:

FinalDF['c'] = np.where(FinalDF['a']>FinalDF['b'],[1,0])

The result is always:

ValueError: either both or neither of x and y should be given

Edit: I also removed the [x,y], to see what happens, since the documentation says it is optional. But I still get an error:

ValueError: Length of values does not match length of index

This is odd because both columns are in the same dataframe, although one column does have some NaN values.

I don't think I can use np.select because I have a condition here. I've linked to the previous questions so readers can reference them in future questions.

Thanks for any help.

ZakS
  • Your condition describes different logic compared to code? It should be like this `np.where(FinalDF['a']>FinalDF['b'],"",[FinalDF['b']])`. – shivsn May 04 '18 at 09:12

1 Answer


I think that this should work:

FinalDF['c'] = np.where(FinalDF['a']>FinalDF['b'], FinalDF['a'],"")

Example:

import numpy as np
import pandas as pd

FinalDF = pd.DataFrame({'a': [4, 2, 4, 5, 5, 4],
                        'b': [4, 3, 2, 2, 2, 4]})
print(FinalDF)
   a  b
0  4  4
1  2  3
2  4  2
3  5  2
4  5  2
5  4  4

Output:

   a  b  c
0  4  4   
1  2  3   
2  4  2  4
3  5  2  5
4  5  2  5
5  4  4   
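For context on why the attempts in the question failed, here is a minimal sketch of the three ways np.where was called, reusing the example frame above. The names cond, list_form_error, true_positions and per_row are only illustrative. The explanation of the "Length of values" error is my reading of the behaviour, not something confirmed in the question.

```python
import numpy as np
import pandas as pd

# Same example frame as above
FinalDF = pd.DataFrame({'a': [4, 2, 4, 5, 5, 4],
                        'b': [4, 3, 2, 2, 2, 4]})
cond = FinalDF['a'] > FinalDF['b']

# 1) Wrapping x and y in one list passes a single argument after the
#    condition, so NumPy sees x without y and raises ValueError.
try:
    np.where(cond, [FinalDF['a'], ""])
    list_form_error = None
except ValueError as exc:
    list_form_error = exc

# 2) With the condition alone, np.where returns a tuple of index arrays
#    (the positions where cond is True), not one value per row.
#    Here that is positions 2, 3 and 4, which cannot be assigned to a
#    six-row column.
true_positions = np.where(cond)[0]

# 3) With x and y as separate arguments, the result has one entry per row
#    and can be assigned as a new column.
per_row = np.where(cond, FinalDF['a'], "")
FinalDF['c'] = per_row
```

So the list brackets are the problem: x and y have to be the second and third arguments of np.where, not elements of one list.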

Or, if you want the value from column b wherever b is greater than a, use this:

FinalDF['c'] = np.where(FinalDF['a']<FinalDF['b'], FinalDF['b'],"")

Output:

   a  b  c
0  4  4   
1  2  3  3
2  4  2   
3  5  2   
4  5  2   
5  4  4   
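One caveat: because "" is a string, np.where will typically return strings for the whole column ('4' rather than 4). If the new column should stay numeric and NaN is acceptable as "nothing", pandas' Series.where is an alternative. This is a different technique from the np.where call above, and the data below is made up to show how NaN values behave (any comparison involving NaN is False, so those rows are left empty).

```python
import numpy as np
import pandas as pd

# Hypothetical data with a missing value in each column
df = pd.DataFrame({'a': [3.0, np.nan, 5.0, 2.0, 6.0],
                   'b': [2.0, 1.0, np.nan, 4.0, 1.0]})

# Comparisons involving NaN evaluate to False, so rows 1 and 2 do not match
cond = df['a'] > df['b']

# Keep the value of a where cond is True and NaN elsewhere;
# the resulting column keeps a float dtype instead of becoming strings
df['c'] = df['a'].where(cond)
```

Whether NaN or an empty string is the better placeholder depends on what you do with the column afterwards. NaN keeps arithmetic and aggregation working, while "" is mostly useful for display.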
Joe
  • the logic doesn't match the `np.where` condition. – shivsn May 04 '18 at 09:18
  • @shivsn example in the documentation https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.where.html : `np.where(x < 5, x, -1)` – Joe May 04 '18 at 09:20