How would I go about getting a table of values of for values grouped by a condition while also accounting for nulls?

Question

So I'm trying to work on a few dataframes that are similar to this:

employee ID	salary	regioncode	writeups	jobcode	fired
1	10000	11	1	23F	0
2	20000	20	2	12D	0
3	20000	10	0	45T	1
...	...	...	...	...	...
1000	30000	5	0	603	1

This original dataframe is further divided into 2 other dfs: one containing only employees with writeups, and the other containing employees only with either a specific jobcode, or who got fired.

And i want to form tables on them based on aggregrate counts of demographics, (in this case regioncode), for example:

regioncode	total employees	employees fired	employees written up
10	200	35	20
20	150	27	15

Below is some code resembling what I'm doing to achieve this, with some explanation to the code

regioncode_table is the table I want to return like the above one
employees_all is the master df that resembles the first table I posted
employees_fired is the divided df that only contains employees with a certain jobcode or who got fired
employees_writeups is the divided df containing only employees with writeups

regioncode_table = []
# appending regioncodes rownames to the table
regioncode_table.append(employees_all.regioncode.value_counts().sort_index().index.tolist())


#adding total employees from a region
regioncode_table.append(employees_all.regioncode.value_counts(dropna=False).sort_index().tolist())

#adding total employees who got fired or have a certain jobcode
regioncode_table.append(employees_fired.regioncode.value_counts(dropna=False).sort_index().tolist())

#adding total employees who were written up at least once
regioncode_table.append(employees_writeups.regioncode.value_counts(dropna=False).sort_index().tolist())

This on the surface gives me a table consistent to the one I'm looking for, but as soon as it encounters a region with no employees getting writeups or fired, the values get shifted up by 1. I would like to figure out a way for the the table to display 0 values from a specific region as well, such as if regioncode 42 has 3 employees and none of them have been written up or fired, the regioncode table still accounts for them and puts 0s in the table, such as:

regioncode	total employees	employees fired	employees written up
42	3	0	0
23	15	0	1

So far in addition to the method I have already tried to implement, I have tried using the groupby method for dataframes but I cannot figure a way to account for the zeroes as I have mentioned. If anyone has any other solutions please do let me know. Sorry for any formatting errors, or coding errors, I am still pretty much a beginner and this is my first post on stackoverflow. Thanks in advance!

`df.groupby('regioncode').agg({'employee ID': 'nunique', 'fired': 'sum', 'writeups': 'sum'})` — mozway, Oct 27 '22 at 05:04

How would I go about getting a table of values of for values grouped by a condition while also accounting for nulls?

0 Answers0