How to generate a boolean "indicator Series" based on inclusion in a set?

Question

Here's the set-up for a toy example:

import collections as co
import pandas as pd

data = [['a',  1],
        ['b',  2],
        ['a',  3],
        ['b',  1],
        ['c',  2],
        ['c',  3],
        ['b',  1]]

colnames = tuple('XY')

df = pd.DataFrame(co.OrderedDict([(colnames[i],
                                   [row[i] for row in data])
                                  for i in range(len(colnames))]))

OK, to get a boolean indicator Series object (suitable for indexing) corresponding to whether the value in the X column is equal to 'a' or not, I can do this:

In [230]: df['X'] == 'a'
Out[230]:
0     True
1    False
2     True
3    False
4    False
5    False
6    False
Name: X, dtype: bool

Fine, but what I really want to do is to test whether the value is one of several possible values. I tried to use set inclusion for this, but it bombs:

In [231]: df['X'] in set(['a', 'b'])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-266-0819ab764ce2> in <module>()
----> 1 df['X'] in set(['a', 'b'])

/Users/yt/.virtualenvs/pd/lib/python2.7/site-packages/pandas/core/generic.pyc in __hash__(self)
    639     def __hash__(self):
    640         raise TypeError('{0!r} objects are mutable, thus they cannot be'
--> 641                         ' hashed'.format(self.__class__.__name__))
    642
    643     def __iter__(self):

TypeError: 'Series' objects are mutable, thus they cannot be hashed

How can I achieve this?

Note: for the situation I'm working with, the set of allowable values is large, and known only at run time, so an or expression is out of the question.

Are you just looking for `isin`, as described [here](http://stackoverflow.com/questions/19960077/in-and-not-in-for-pandas-dataframe)? — DSM, Sep 15 '14 at 17:38
@DSM: bingo, that was exactly it. I suppose this question is a duplicate... I'll delete it if so. If not, please post your comment as an answer so that I can give it its due. — kjo, Sep 15 '14 at 17:51
http://stackoverflow.com/questions/17396898/index-a-python-pandas-dataframe-with-multiple-conditions-sql-like-where-statemen — Inox, Sep 15 '14 at 17:57
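
For reference, here is a minimal sketch of the `isin` approach suggested in the comments, applied to the toy data from the question. The names `allowed`, `mask`, and `subset` are illustrative only, and the set stands in for whatever collection of values is known at run time.

```python
import pandas as pd

# Same toy data as in the question, built directly from a dict.
df = pd.DataFrame({"X": ["a", "b", "a", "b", "c", "c", "b"],
                   "Y": [1, 2, 3, 1, 2, 3, 1]})

# Hypothetical set of allowable values; in practice this would be
# assembled at run time and could be large.
allowed = {"a", "b"}

# Series.isin tests each element for membership and returns a
# boolean Series aligned with df's index.
mask = df["X"].isin(allowed)

# The mask can be used directly for boolean indexing.
subset = df[mask]

# Negating the mask selects the rows whose value is NOT in the set.
excluded = df[~mask]
```

Unlike `df['X'] in set(...)`, which asks Python to hash the whole Series, `isin` checks membership element by element, so it should work for any number of allowable values.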
