
I want to filter a DataFrame using multiple conditions on multiple columns. I tried doing so like this:

arrival_delayed_weather = [[[flight_data_finalcopy["ArrDelay"] > 0]] & [[flight_data_finalcopy["WeatherDelay"]>0]]]
arrival_delayed_weather_filter = arrival_delayed_weather[["UniqueCarrier", "AirlineID"]]
print arrival_delayed_weather_filter

However I get this error message:

TypeError: unsupported operand type(s) for &: 'list' and 'list'

How do I solve this?

Thanks in advance

marc_s
Deepak M

1 Answer


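You need () instead of []. Square brackets build Python lists, which do not support &, so wrap each condition in parentheses: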

arrival_delayed_weather = ((flight_data_finalcopy["ArrDelay"] > 0) &
                           (flight_data_finalcopy["WeatherDelay"] > 0))
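
To see why the brackets matter, here is a minimal sketch. The two-row frame `df` is hypothetical and exists only for illustration; it is not the asker's data.

```python
import pandas as pd

# Hypothetical two-row frame, used only to illustrate the difference
df = pd.DataFrame({"ArrDelay": [0, 3], "WeatherDelay": [0, 6]})

# Square brackets wrap each Series in a plain Python list,
# and lists do not define the & operator
try:
    bad = [df["ArrDelay"] > 0] & [df["WeatherDelay"] > 0]
except TypeError as err:
    error_message = str(err)
    print(error_message)

# Parentheses only group each comparison, so & combines
# two boolean Series element by element
good = (df["ArrDelay"] > 0) & (df["WeatherDelay"] > 0)
print(good.tolist())
```

The parentheses around each comparison are required, not cosmetic: in Python, & binds more tightly than >, so without them the expression would be grouped differently.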

But it seems you also need to select the columns UniqueCarrier and AirlineID by the mask. Use loc for that, a slightly modified form of boolean indexing (older pandas versions also offered ix, which was deprecated in 0.20 and removed in 1.0):

mask = ((flight_data_finalcopy["ArrDelay"] > 0) &
        (flight_data_finalcopy["WeatherDelay"] > 0))
arrival_delayed_weather_filter = flight_data_finalcopy.loc[mask, ["UniqueCarrier", "AirlineID"]]

Sample:

import pandas as pd

flight_data_finalcopy = pd.DataFrame({'ArrDelay':[0,2,3],
                                      'WeatherDelay':[0,0,6],
                                      'UniqueCarrier':['s','a','w'],
                                      'AirlineID':[1515,3546,5456]})

print (flight_data_finalcopy)
   AirlineID  ArrDelay UniqueCarrier  WeatherDelay
0       1515         0             s             0
1       3546         2             a             0
2       5456         3             w             6

mask = (flight_data_finalcopy["ArrDelay"] > 0) & (flight_data_finalcopy["WeatherDelay"]>0)
print (mask)
0    False
1    False
2     True
dtype: bool

arrival_delayed_weather_filter = flight_data_finalcopy.loc[mask, ["UniqueCarrier", "AirlineID"]]
print (arrival_delayed_weather_filter)
  UniqueCarrier  AirlineID
2             w       5456
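
For reference, the whole sample can be run as one self-contained sketch. It reuses the sample data above; the `query` variant is an alternative that the answer itself does not use, added here only for comparison, and the names `by_loc` and `by_query` are made up for this sketch.

```python
import pandas as pd

# Same sample data as above
flight_data_finalcopy = pd.DataFrame({
    "ArrDelay": [0, 2, 3],
    "WeatherDelay": [0, 0, 6],
    "UniqueCarrier": ["s", "a", "w"],
    "AirlineID": [1515, 3546, 5456],
})

# Boolean mask: each condition sits in its own parentheses
mask = (flight_data_finalcopy["ArrDelay"] > 0) & (flight_data_finalcopy["WeatherDelay"] > 0)

# Filter the rows by mask and pick two columns in a single loc call
by_loc = flight_data_finalcopy.loc[mask, ["UniqueCarrier", "AirlineID"]]

# Alternative (not part of the answer): the same filter written with DataFrame.query
by_query = flight_data_finalcopy.query("ArrDelay > 0 and WeatherDelay > 0")[
    ["UniqueCarrier", "AirlineID"]
]

# Both approaches should select the same single row
print(by_loc.equals(by_query))
```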
jezrael
  • Thanks! But what if I want to perform math operations like adding two columns of data? Is it done the same way? @jezrael – Deepak M Nov 09 '16 at 15:50
  • I am not sure I understand, but maybe you need [`concat`](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.concat.html). I think it is best to create a new question. A small piece of advice: [How to make good reproducible pandas examples](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – jezrael Nov 09 '16 at 15:54
  • @DeepakM jezrael answered the question you posted. If you have a different question now, it would be best to ask another question. – piRSquared Nov 09 '16 at 16:12
  • OK, I tried that code exactly the way you have it with ix; however, it threw an error: "ValueError: too many values to unpack" @jezrael – Deepak M Nov 10 '16 at 05:01
  • @DeepakM - sorry, there was a typo. I have now added a sample to the solution, please check it. – jezrael Nov 10 '16 at 06:24
  • Thanks man ! worked like a charm, .loc i should have thought of that ! @jezrael – Deepak M Nov 10 '16 at 07:09
  • Yes, you can use `ix` or `loc`, there is no problem. But if you check [revision 2](http://stackoverflow.com/posts/40510840/revisions), you need to remove the `[]` from `mask`. – jezrael Nov 10 '16 at 07:12