I want to set a column of a Pandas DataFrame to True/False depending on whether the index for the DataFrame is in a set.

I can do it as follows:

import io
import pandas as pd

table = """
A,1,2
B,1,3
C,4,5
D,9,1
E,10,4
F,8,3
G,9,0
"""

df = pd.read_csv(io.StringIO(table), header=None, index_col=0)

fM7_notes = set(['F', 'A', 'C', 'E'])

df['in_maj_7'] = False
df.loc[list(fM7_notes), 'in_maj_7'] = True  # list indexer; newer pandas versions no longer accept a set here

However, what I wanted to write, instead of the last two lines, was

df['in_maj_7'] = df.index in fM7_notes

This seems more expressive, concise, and pythonic, but it also doesn't work:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-81-851b1efe0c36> in <module>()
----> 1 df['in_maj_7'] = df.index in fM7_notes

~/anaconda/lib/python3.6/site-packages/pandas/core/indexes/base.py in __hash__(self)
   2060 
   2061     def __hash__(self):
-> 2062         raise TypeError("unhashable type: %r" % type(self).__name__)
   2063 
   2064     def __setitem__(self, key, value):

TypeError: unhashable type: 'Index'

Is there a cleaner way?

sfjac
  • I don't think the referenced articles are exactly on point - I knew about using `Series.isin` for various operations and about proper indexing with a set (or list). Just didn't occur to me to use `Index.isin` on the RHS of the expression. But fortunately it was open long enough to get exactly what I needed. – sfjac Jun 29 '19 at 21:54

1 Answer

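Use the pandas.Index.isin() method. It tests each index label for membership and returns a boolean array, whereas `df.index in fM7_notes` asks Python to hash the whole Index object, which raises the TypeError above: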

In [31]: df['in_maj_7'] = df.index.isin(fM7_notes)

In [32]: df
Out[32]:
    1  2  in_maj_7
0
A   1  2      True
B   1  3     False
C   4  5      True
D   9  1     False
E  10  4      True
F   8  3      True
G   9  0     False
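
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same idea. It rebuilds the question's table and also shows a plain-Python list comprehension that should give the same result; the column name `in_maj_7_alt` is made up for the comparison and is not part of the original question.

```python
import io

import pandas as pd

# Same data as in the question, with the note names as the index.
table = """A,1,2
B,1,3
C,4,5
D,9,1
E,10,4
F,8,3
G,9,0
"""
df = pd.read_csv(io.StringIO(table), header=None, index_col=0)

fM7_notes = {'F', 'A', 'C', 'E'}

# Element-wise membership test: one boolean per index label.
mask = df.index.isin(fM7_notes)
df['in_maj_7'] = mask

# Plain-Python equivalent (hypothetical column name, for comparison only).
df['in_maj_7_alt'] = [label in fM7_notes for label in df.index]

print(df)
```

The `isin` form is vectorized and reads close to the `df.index in fM7_notes` expression the question was aiming for, so it is probably the better default; the comprehension is only there to make the per-label semantics explicit.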
RomanPerekhrest