I've checked the docs (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.eq.html)
I'm thinking of something like the line below, but where I can use re.I to ignore case, or any other flag for that matter.
df.column.eq('Male').sum()
You can use the Series.str.contains function with the case=False argument, ^Male$ as the regex pattern, and the regex=True argument:
df['column'].str.contains('^Male$', case=False, regex=True).sum()
See the Series.str.contains documentation.
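Note that, as an alternative to setting case=False, you can allow either case for individual letters by using a character set in the regex (e.g. '^[Mm]ale$'). Unlike case=False, this only relaxes the letters you list, so '^[Mm]ale$' matches 'Male' and 'male' but not 'MALE'.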
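Since the question asks about re.I specifically: Series.str.contains also takes a flags argument that is passed through to the re module, and Series.str.fullmatch anchors the pattern at both ends so ^ and $ are not needed. A minimal sketch, assuming a made-up column of values (the sample data and variable names are hypothetical, not from the question):

```python
import re

import pandas as pd

# Hypothetical sample data standing in for df.column from the question.
df = pd.DataFrame({"column": ["Male", "male", "MALE", "Female", "female", "Males"]})

# Pass a re flag directly instead of case=False.
count_flags = int(
    df["column"].str.contains(r"^male$", flags=re.IGNORECASE, regex=True).sum()
)

# fullmatch requires the whole string to match, so no anchors are needed.
count_full = int(df["column"].str.fullmatch("male", flags=re.IGNORECASE).sum())

print(count_flags, count_full)
```

Other flags from the re module can be combined in the same flags argument with the | operator.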
import pandas as pd

pupils = [1, 2, 3, 4, 5, 6, 7, 8]
test_outcomes = ['pass', 'fail', 'pass', 'fail', 'not passed', 'fail', 'fail', 'Pass']
test_results = pd.DataFrame(zip(pupils, test_outcomes), columns=['pupil', 'outcome'])
# Keep only the rows whose outcome starts with 'pass' or 'Pass'
passes = test_results[test_results['outcome'].str.contains('^[Pp]ass', regex=True)]
pupil | outcome |
---|---|
1 | pass |
3 | pass |
8 | Pass |
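To get a count rather than the filtered rows, as in the original question, the boolean mask can be summed directly. A small sketch reusing the outcomes above; the na=False argument is an addition here (not part of the answer above) so that any missing values would count as non-matches:

```python
import pandas as pd

test_outcomes = ['pass', 'fail', 'pass', 'fail', 'not passed', 'fail', 'fail', 'Pass']
outcomes = pd.Series(test_outcomes)

# A boolean mask sums to the number of True entries.
# na=False (an assumption added here) treats missing values as non-matches.
n_passes = int(outcomes.str.contains('^[Pp]ass', regex=True, na=False).sum())

# Same count using case=False instead of a character set.
n_passes_ci = int(outcomes.str.contains('^pass', case=False, regex=True, na=False).sum())

print(n_passes, n_passes_ci)
```

Both counts agree on this data because 'not passed' does not start with 'pass'.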