0

Can you please help me understand how I can do a create a separate column in a dataframe and store the value based on a COUNTIFS result.

I am having three columns currently in the dataframe Column 1 - Name of the Employee Column 2 - Name of the Manager (Can get repeated) Column 3 - Whether the employee is a Male / Female

I want to add a fourth column where I want to show the number of male employees corresponding to the Manager in Column 2.

I am still trying to learn how to append a table in Stackoverflow. So please excuse my lengthy query.

  • Hi and welcome on SO. It will be great if you can have a look at [ask] and then try to produce a [mcve]. – rpanai Jul 14 '22 at 16:18
  • I would not advise appending a table to your question. Create the python code to insert some sample data into a dataframe and post that code! This way, you have a reproducible example (i.e., if you only post a table, others will still have to create code to turn it into a dataframe, which only slows down and discourages answers). You might also post your initial attempts at creating these counts and adding the new column... – topsail Jul 14 '22 at 16:21
  • A good resource for pandas questions is: [How to make good pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) which runs through a quick way to create you input and desired output in a [mcve] format – G. Anderson Jul 14 '22 at 16:24
  • add your model dataframe as well as your expected result – Berlin Benilo Jul 14 '22 at 16:24
  • Does this answer your question? [Pandas create new column with count from groupby](https://stackoverflow.com/questions/29836477/pandas-create-new-column-with-count-from-groupby) – G. Anderson Jul 14 '22 at 16:26

1 Answers1

0

Does this produces the desired result?

import pandas as pd

d = {
    'emp_name': ['foo', 'bar', 'abc'],
    'man_name': ['spam', 'spam', 'eggs'],
    'gender': ['F', 'M', 'F']
}

df = pd.DataFrame(d)

print(df)
"""
  emp_name man_name gender
0      foo     spam      F
1      bar     spam      M
2      abc     eggs      F
"""

out = df.groupby('man_name')['gender'].apply(lambda x: (x=='M').sum()).reset_index(name='count_if')

print(out)

"""
  man_name  count_if
0     eggs         0
1     spam         1
"""
SunStorm44
  • 99
  • 6