Count number of rows when row contains certain text

Question

Probably a simple question, but I could not find a simple answer. For example, take the following column Status in a dataframe df1:

**Status**
Planned
Unplanned
Missing
Corrected

I would like to count the rows where the cell contains either Planned or Missing. I tried the following:

test1 = df1['Status'].str.contains('Planned|Missing').value_counts()

The column Status is of type object. What's wrong with my line of code?

score 15 · Accepted Answer · answered Jul 23 '15 at 09:23

You can just filter the df with your boolean condition and then call len:

In [155]:
len(df[df['Status'].str.contains('Planned|Missing')])

Out[155]:
2

Or index the result of your value_counts with True:

In [158]:
df['Status'].str.contains('Planned|Missing').value_counts()[True]

Out[158]:
2
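For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the example column from the question (the frame is constructed by hand here, which is an assumption about how the data looks). It also uses `.get(True, 0)` instead of `[True]`, because plain indexing raises a KeyError when no row matches. Treat it as an illustration rather than the only way to do it.

```python
import pandas as pd

# Hand-built copy of the example column from the question (assumed data).
df = pd.DataFrame({"Status": ["Planned", "Unplanned", "Missing", "Corrected"]})

# str.contains is case-sensitive by default, so "Unplanned" does not match "Planned".
mask = df["Status"].str.contains("Planned|Missing")

# Approach 1: filter the frame with the boolean mask and take its length.
n_filter = len(df[mask])

# Approach 2: look up the True bucket of value_counts.
# .get(True, 0) falls back to 0 when there is no True entry at all.
n_lookup = mask.value_counts().get(True, 0)

print(n_filter, n_lookup)
```

Both variables hold the same count; the second form is just a lookup into the Series the original line of code already produced.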

score 3 · Answer 2 · answered Jan 24 '19 at 11:38

pd.Series.str.contains, when coupled with na=False, guarantees you get a Boolean series. Note also that True / False act like 1 / 0 in numeric computations, so you can use pd.Series.sum directly:

count = df['Status'].str.contains('Planned|Missing', na=False).sum()

This avoids unnecessary and expensive dataframe indexing operations.
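To see why na=False matters, here is a small sketch with a missing value in the column (the sample data is made up for illustration). Without na=False the result may carry the missing value through instead of a plain True/False, depending on the pandas version and column dtype; with na=False the missing row simply counts as a non-match.

```python
import numpy as np
import pandas as pd

# Made-up sample with one missing entry (an assumption, not data from the question).
df = pd.DataFrame({"Status": ["Planned", np.nan, "Missing", "Corrected"]})

# na=False treats the missing row as "does not match", giving a pure boolean Series.
matches = df["Status"].str.contains("Planned|Missing", na=False)

# True counts as 1 and False as 0, so the sum is the number of matching rows.
count = int(matches.sum())

print(count)
```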

score 0 · Answer 3 · edited Jan 24 '19 at 15:15


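Try the following, which counts the rows whose Status is exactly Planned or Missing (an exact match on the value, not a substring search):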

df["Status"].value_counts()[['Planned','Missing']].sum()
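As a rough sketch of how this differs from the str.contains answers: selecting labels from value_counts only counts cells that equal those labels exactly. The sample below is made up, and the "Planned (late)" value is a hypothetical entry added to show the difference. The isin variant is an alternative swapped in here, not part of the original answer; it gives the same exact-match count without raising a KeyError when one of the labels never occurs.

```python
import pandas as pd

# Made-up sample; "Planned (late)" is a hypothetical value to show the difference.
df = pd.DataFrame({"Status": ["Planned", "Unplanned", "Missing", "Planned (late)"]})

# Exact-match count via value_counts, as in the answer above.
# Raises KeyError if either label is absent from the column.
exact_vc = int(df["Status"].value_counts()[["Planned", "Missing"]].sum())

# Exact-match count via isin; tolerates labels that do not occur.
exact_isin = int(df["Status"].isin(["Planned", "Missing"]).sum())

# Substring count, for comparison; also picks up "Planned (late)".
substring = int(df["Status"].str.contains("Planned|Missing", na=False).sum())

print(exact_vc, exact_isin, substring)
```

Which one is right depends on whether the cells hold the bare label or longer text that merely contains it.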

answered Jan 24 '19 at 11:24 by Scotty · edited Jan 24 '19 at 15:15 by Mark

Please, it is always beneficial to include some context explaining the source code – Carlos Cavero Jan 24 '19 at 11:48
