
I'm trying to use np.where to test the column speed and, where a condition is met, assign a value in column 'C'. For some reason every row is assigned 'true'. The same approach seems to work fine in another SO question, so I'm a little stumped. Any help would be appreciated.

df["C"] = np.where(df.speed > 3, 'true','false')

   speed   C  
0  3.34    true  
1  0.02    true  
2  0.01    true  
3  8.41    true  
4  0.03    true  
hselbie
  • Works for me as expected. You're not showing the code behind the dataframe creation itself, but that's probably where the error is. Make a self-contained example that shows the problem. –  Jan 27 '16 at 19:15
  • Sorry, should have checked this, Speed is listed as an `object`, not an `int`. I knew it was something silly. – hselbie Jan 27 '16 at 19:18

2 Answers


The values in your speed column aren't numeric, so the comparison isn't doing what you expect. They are probably stored with dtype object (for example, as strings). If the dataframe is properly initialized with numeric values, this works as expected.

For example, I can reproduce the output you show here if I convert every value in the speed column to a string.
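
A minimal sketch of how to check for this, assuming the speed values were read in as strings (the sample data below is made up to mirror the question, and dtype=object is set explicitly so the result does not depend on the pandas version):

```python
import pandas as pd

# Hypothetical data mirroring the question: the numbers are stored as strings.
df = pd.DataFrame(
    {"speed": pd.Series(["3.34", "0.02", "0.01", "8.41", "0.03"], dtype=object)}
)

# The column prints like numbers, but its dtype is object, not float64.
print(df.dtypes)

# True only for int/float/etc. columns; False here.
speed_is_numeric = pd.api.types.is_numeric_dtype(df["speed"])

# Every element is a Python str rather than a float.
all_strings = all(isinstance(value, str) for value in df["speed"])

print(speed_is_numeric, all_strings)
```

On Python 3, comparing such a column with `> 3` would typically raise a TypeError rather than return all True; the all-true result in the question is consistent with Python 2's more permissive comparison rules.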

Alex Alifimoff

For anyone encountering this: a wise commenter above suggested I look at how the dataframe was created, so I checked the dtypes and found:

id                       int64
speed                   object
C                       object

This code fixes the problem:

df['speed'] = df['speed'].astype(float)
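
Put together, the fix might look like the sketch below. The sample values are taken from the question's table, but storing them as strings is an assumption about how the original dataframe was built. The `pd.to_numeric` call at the end is an optional alternative, not part of the original fix, for columns that may contain values that cannot be parsed.

```python
import numpy as np
import pandas as pd

# Assumed starting point: speed stored as strings with dtype object.
df = pd.DataFrame(
    {"speed": pd.Series(["3.34", "0.02", "0.01", "8.41", "0.03"], dtype=object)}
)

# Convert to float so the comparison is numeric.
df["speed"] = df["speed"].astype(float)

# Same np.where call as in the question, now on a float column.
df["C"] = np.where(df["speed"] > 3, "true", "false")
print(df)

# Alternative for messy input: unparseable entries become NaN instead of raising.
messy = pd.Series(["3.34", "n/a"], dtype=object)  # hypothetical example values
cleaned = pd.to_numeric(messy, errors="coerce")
print(cleaned)
```

With the column converted, only the rows with speed 3.34 and 8.41 are labelled 'true'.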
hselbie