
Having a series like this:

from pandas import Series
ds = Series({'wikipedia':10,'wikimedia':22,'wikitravel':33,'google':40})

google        40
wikimedia     22
wikipedia     10
wikitravel    33
dtype: int64

I would like to select the rows where 'wiki' is a part of the index label (a partial string label).

So far I have tried:

ds[ds.index.map(lambda x: 'wiki' in x)]

wikimedia     22
wikipedia     10
wikitravel    33
Name: site, dtype: int64

and it does the job, but somehow the index cries for 'contains' just like what the columns have...

Any better way to do that?

cs95
ronszon

3 Answers


A somewhat cheeky way could be to use loc:

In [11]: ds.loc['wiki': 'wikj']
Out[11]:
wikimedia     22
wikipedia     10
wikitravel    33
dtype: int64

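This is essentially equivalent to ds[ds.index.map(lambda s: s.startswith('wiki'))], provided the index is sorted (label slicing with bounds that are not in the index requires a monotonic index).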

To do a 'contains' match, as @DSM suggests, it's probably nicer to write:

ds[['wiki' in s for s in ds.index]]
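
A minimal sketch contrasting the two approaches above, assuming a recent pandas version. The `sort_index()` call and the extra `'mediawiki'` label are additions made here for illustration only; they show that the slice matches prefixes while the list comprehension matches substrings anywhere in the label.

```python
import pandas as pd

# 'mediawiki' is a hypothetical extra label, added to show the difference.
ds = pd.Series({'wikipedia': 10, 'wikimedia': 22, 'wikitravel': 33,
                'google': 40, 'mediawiki': 5})

# Label slicing with bounds that are not in the index needs a sorted index.
ds_sorted = ds.sort_index()

# Prefix match: every label that sorts between 'wiki' and 'wikj'.
prefix_slice = ds_sorted.loc['wiki':'wikj']

# Substring match: a boolean list built from the index labels.
contains_rows = ds[['wiki' in s for s in ds.index]]

print(prefix_slice)
print(contains_rows)
```

The slice leaves out 'mediawiki' because that label does not start with 'wiki', whereas the comprehension keeps it.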
Andy Hayden
  • Heh, nice trick, that's true, but it only does a 'startswith' operation, not 'contains'. Or am I wrong? – ronszon May 17 '13 at 20:56
  • There's `ds.irow(Series(ds.index).str.contains("wiki"))`, but I think I prefer a simple `ds[['wiki' in x for x in ds.index]]`. BTW, I think there are some bugs lurking here: `list(ds.str)` seems to go on forever. – DSM May 17 '13 at 20:59
  • @DSM thought I'd pop it as [an issue](https://github.com/pydata/pandas/issues/3638) anyways. – Andy Hayden May 17 '13 at 21:27
  • @DSM ...[so come 11.1](https://github.com/pydata/pandas/pull/3645) `list(ds.str)` won't crash. Thanks! :) – Andy Hayden May 20 '13 at 11:08
  • plus one for @DSM ds[['wiki' in s for s in ds.index]] 'feels' the best/most flexible. – Brian Wylie Dec 07 '15 at 15:56

An alternative (to Andy Hayden's answer) is to use filter:

>>> ds.filter(like='wiki', axis=0)
wikimedia     22
wikipedia     10
wikitravel    33
dtype: int64
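
As a hedged sketch of how far filter can be pushed, assuming a recent pandas version: besides `like=`, `Series.filter` also accepts a `regex=` argument, which allows anchored or more selective matches on the index labels. The patterns and variable names below are illustrative choices, not part of the original answer.

```python
import pandas as pd

ds = pd.Series({'wikipedia': 10, 'wikimedia': 22, 'wikitravel': 33, 'google': 40})

# like= keeps labels that contain the substring anywhere.
by_substring = ds.filter(like='wiki', axis=0)

# regex= keeps labels where the pattern is found; '^' anchors it to the start.
by_prefix = ds.filter(regex=r'^wiki', axis=0)

# '$' anchors to the end, so this keeps labels ending in 'pedia' or 'media'.
by_suffix = ds.filter(regex=r'(pedia|media)$', axis=0)

print(by_substring)
print(by_prefix)
print(by_suffix)
```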
s_pike
Chris

From the original question:

"...index cries for 'contains' just like what the columns have".

I'm not sure when this was added (this is an old question), but you can now use contains on index.str, assuming your index holds strings:

>>> import pandas as pd
>>>
>>> ds = pd.Series({'wikipedia':10,'wikimedia':22,'wikitravel':33,'google':40})
>>> ds[ds.index.str.contains("wiki")]

wikipedia     10
wikimedia     22
wikitravel    33
dtype: int64
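
A short sketch of a few variations on the same idea, assuming a recent pandas version; the variable names are illustrative. By default `str.contains` treats the pattern as a regular expression, so `regex=False` is the safer choice for a literal substring, and the resulting boolean mask can be inverted to select everything else.

```python
import pandas as pd

ds = pd.Series({'wikipedia': 10, 'wikimedia': 22, 'wikitravel': 33, 'google': 40})

# Literal substring match (no regex interpretation of the pattern).
mask = ds.index.str.contains('wiki', regex=False)
wiki_rows = ds[mask]

# Invert the mask to keep the labels that do NOT contain 'wiki'.
other_rows = ds[~mask]

# Case-insensitive match.
any_case = ds[ds.index.str.contains('WIKI', case=False)]

# Prefix-only match, for when 'contains' is too broad.
prefix_rows = ds[ds.index.str.startswith('wiki')]

print(wiki_rows)
print(other_rows)
```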
s_pike