1

I've checked the docs (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.eq.html)

I'm thinking of something like the line below, but where I could pass re.I to ignore case, or use any other flag for that matter.

df.column.eq('Male').sum()
Clay Campbell

2 Answers

1

You can use the Series.str.contains function with the case=False argument and ^Male$ as the regex pattern (regex=True is already the default, so passing it is optional):

df['column'].str.contains('^Male$', case=False, regex=True).sum()

See the Series.str.contains documentation.

Also, see What do ^ and $ mean in a regular expression?
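Since the question asks about passing re.I or other flags, here is a minimal sketch of a few equivalent ways to count case-insensitive exact matches. The sample Series and variable names are made up for illustration and are not from the question; flags is a parameter of both Series.str.contains and Series.str.fullmatch.

```python
import re
import pandas as pd

# Hypothetical sample data, not taken from the question.
s = pd.Series(['Male', 'male', 'MALE', 'Female', 'female', 'Males'])

# Anchored pattern, made case-insensitive with case=False.
count_case = s.str.contains('^Male$', case=False).sum()

# Same match, passing a regex flag instead of case=False.
count_flags = s.str.contains('^Male$', flags=re.I).sum()

# fullmatch requires the whole string to match, so no anchors are needed.
count_full = s.str.fullmatch('Male', flags=re.I).sum()

# No regex at all: normalize the case, then compare with eq.
count_eq = s.str.lower().eq('male').sum()

print(count_case, count_flags, count_full, count_eq)
```

All four counts agree for this sample (3): 'Female' and 'Males' are excluded because the whole string has to match.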

Wiktor Stribiżew
  • So in df .contains, `regex=True` is the default, not necessary, that's what I was trying to find out, thanks for the documentation link – gseattle May 08 '22 at 00:14
1

Note that, as an alternative to setting case=False, you can allow for different casing of individual letters by using a character set in the regex (i.e. '^[Mm]ale$').

import pandas as pd

pupils = [1, 2, 3, 4, 5, 6, 7, 8]
test_outcomes = ['pass', 'fail', 'pass', 'fail', 'not passed', 'fail', 'fail', 'Pass']
test_results = pd.DataFrame(zip(pupils, test_outcomes), columns=['pupil', 'outcome'])
# Keep rows whose outcome starts with 'pass' or 'Pass' ('not passed' is excluded by ^)
passes = test_results[test_results['outcome'].str.contains('^[Pp]ass', regex=True)]
pupil outcome
1 pass
3 pass
8 Pass
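One caveat: a character set only relaxes the case of the letters it covers, and '^[Pp]ass' is anchored only at the start. A small sketch of how that differs from case=False with both anchors, using made-up values (hypothetical, not from the example above):

```python
import pandas as pd

# Hypothetical outcomes chosen to show the difference between the two patterns.
outcomes = pd.Series(['pass', 'Pass', 'PASS', 'passed', 'not passed', 'fail'])

# Character set: only the first letter may vary in case, and without a
# trailing $ anything that merely starts with 'pass'/'Pass' matches.
by_charset = outcomes[outcomes.str.contains('^[Pp]ass')].tolist()

# case=False with both anchors: any casing, but the whole string must be 'pass'.
by_flag = outcomes[outcomes.str.contains('^pass$', case=False)].tolist()

print(by_charset)
print(by_flag)
```

Here the character-set version keeps 'passed' but misses 'PASS', while the case=False version does the opposite.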
kaiinge