Select rows by partial string match in index

Question

Having a series like this:

import pandas as pd
ds = pd.Series({'wikipedia':10,'wikimedia':22,'wikitravel':33,'google':40})

google        40
wikimedia     22
wikipedia     10
wikitravel    33
dtype: int64

I would like to select the rows where 'wiki' is a part of the index label (a partial string label).

So far I have tried:

ds[ds.index.map(lambda x: 'wiki' in x)]

wikimedia     22
wikipedia     10
wikitravel    33
dtype: int64

and it does the job, but somehow the index cries for 'contains' just like what the columns have...

Any better way to do that?

Andy Hayden · Answer 1 · 2013-05-17T21:30:10.680

19

A somewhat cheeky way could be to use loc:

In [11]: ds.loc['wiki': 'wikj']
Out[11]:
wikimedia     22
wikipedia     10
wikitravel    33
dtype: int64

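This is essentially equivalent to ds[ds.index.map(lambda s: s.startswith('wiki'))], but it relies on the index being sorted; on an unsorted index, call ds.sort_index() first. Note that the slice would also include the label 'wikj' itself if it existed.

For a contains match, as @DSM suggests, it's probably nicer to write: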

ds[['wiki' in s for s in ds.index]]
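
To make the difference between the three approaches concrete, here is a minimal sketch. It assumes a recent pandas, where a Series built from a dict keeps insertion order, so the index is sorted explicitly before slicing. The variable names (sorted_ds, by_slice, by_prefix, by_contains) are only illustrative.

```python
import pandas as pd

ds = pd.Series({'wikipedia': 10, 'wikimedia': 22, 'wikitravel': 33, 'google': 40})

# Label slicing with endpoints that are not actual labels needs a sorted index.
sorted_ds = ds.sort_index()
by_slice = sorted_ds.loc['wiki':'wikj']

# Prefix match that does not depend on the order of the index.
by_prefix = ds[ds.index.str.startswith('wiki')]

# Substring match via a plain boolean list, as suggested by @DSM.
by_contains = ds[['wiki' in s for s in ds.index]]

print(by_slice)
print(by_prefix)
print(by_contains)
```

The slice returns rows in sorted order, while the two mask-based selections keep the original order of ds.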

edited May 17 '13 at 21:30

answered May 17 '13 at 20:37

Andy Hayden

Heh, nice trick, that's true, but it only does a 'startswith' operation, not 'contains'. Or am I wrong? – ronszon May 17 '13 at 20:56
There's `ds.irow(Series(ds.index).str.contains("wiki"))`, but I think I prefer a simple `ds[['wiki' in x for x in ds.index]]`. BTW, I think there are some bugs lurking here: `list(ds.str)` seems to go on forever. – DSM May 17 '13 at 20:59
@DSM thought I'd pop it as [an issue](https://github.com/pydata/pandas/issues/3638) anyways. – Andy Hayden May 17 '13 at 21:27
@DSM ...[so come 11.1](https://github.com/pydata/pandas/pull/3645) `list(ds.str)` won't crash. Thanks! :) – Andy Hayden May 20 '13 at 11:08
plus one for @DSM ds[['wiki' in s for s in ds.index]] 'feels' the best/most flexible. – Brian Wylie Dec 07 '15 at 15:56

score 18 · Answer 2 · edited Apr 19 '22 at 12:28

An alternative to Andy Hayden's answer is to use filter with the like argument, which keeps the labels that contain the given substring:

>>> ds.filter(like='wiki', axis=0)
wikimedia     22
wikipedia     10
wikitravel    33
dtype: int64
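
If a plain substring is not enough, filter also accepts a regex argument. The following is a small sketch of both options on the question's data; the result names are made up for illustration.

```python
import pandas as pd

ds = pd.Series({'wikipedia': 10, 'wikimedia': 22, 'wikitravel': 33, 'google': 40})

# like= keeps labels that contain the substring anywhere.
contains_wiki = ds.filter(like='wiki', axis=0)

# regex= keeps labels where the pattern is found; anchors narrow the match.
ends_with_media = ds.filter(regex='media$', axis=0)
starts_with_g = ds.filter(regex='^g', axis=0)

print(contains_wiki)
print(ends_with_media)
print(starts_with_g)
```

Only one of like, regex, or items can be passed per call, so combining conditions means either writing a single regex or chaining calls.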

edited Apr 19 '22 at 12:28

s_pike

answered Sep 20 '17 at 08:48

Chris

score 4 · Answer 3 · answered Dec 03 '21 at 10:35

From the original question:

"...index cries for 'contains' just like what the columns have".

I'm not sure when this was added (this is an old question), but you can now use contains on index.str assuming your index is a string type:

>>> import pandas as pd
>>>
>>> ds = pd.Series({'wikipedia':10,'wikimedia':22,'wikitravel':33,'google':40})
>>> ds[ds.index.str.contains("wiki")]

wikipedia     10
wikimedia     22
wikitravel    33
dtype: int64
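
A few options of str.contains may be worth knowing here. This is a sketch on slightly altered data (the labels 'Wikipedia' and 'wiki.travel' are invented to show the effect of case and regex), not part of the original question.

```python
import pandas as pd

# Hypothetical labels, changed from the question to exercise the options.
ds = pd.Series({'Wikipedia': 10, 'wikimedia': 22, 'wiki.travel': 33, 'google': 40})

# case=False ignores letter case.
any_case = ds[ds.index.str.contains('wiki', case=False)]

# regex=False treats the pattern literally, so '.' is not a wildcard.
literal_dot = ds[ds.index.str.contains('wiki.', regex=False)]

# ~ inverts the boolean mask to select everything that does not match.
not_wiki = ds[~ds.index.str.contains('wiki', case=False)]

print(any_case)
print(literal_dot)
print(not_wiki)
```

By default the pattern is interpreted as a regular expression and the match is case sensitive, so regex=False is the safer choice when the search string comes from user input.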
