How to count the data and mutate a new column for it in R

Question

For example my table is shown as below

Job	Gender
CEO	Male
Manager	Male
Manager	Female
Manager	Male
Supervisor	Female

Then I would like to organize it to something like below

Job	Male	Female
CEO	1	0
Manager	2	1
Supervisor	0	1

How can I make it right?

Welcome to SO. What have you tried so far and why didn't it work? — Limey, Aug 11 '22 at 09:34
what you want is called a cross-tabulation or contingency table. There are a lot of ways to do this in R, with `table`, `xtabs` etc. You should do some reading on these functions. It's also possible with dplyr etc but that seems like a long way around to get a simple thing. This site summarises quite well http://analyticswithr.com/contingencytables.html — George Savva, Aug 11 '22 at 09:35
Does this answer your question? [Count number of rows by group using dplyr](https://stackoverflow.com/questions/22767893/count-number-of-rows-by-group-using-dplyr) — Limey, Aug 11 '22 at 09:35

score 3 · Answer 1 · answered Aug 11 '22 at 09:44

Just pivot_wider() with values_fn = length:

library(tidyr)

df %>%
  pivot_wider(names_from = Gender, values_from = Gender, values_fn = length, values_fill = 0)

# # A tibble: 3 × 3
#   Job         Male Female
#   <chr>      <int>  <int>
# 1 CEO            1      0
# 2 Manager        2      1
# 3 Supervisor     0      1

score 1 · Accepted Answer · answered Aug 11 '22 at 09:34

1

You need to group_by the Job column, then count the Gender in each Job. After that, transform the dataframe into a "wide" format by expanding the count result.

library(tidyverse)

df %>% 
  group_by(Job) %>% 
  count(Gender) %>% 
  pivot_wider(names_from = Gender, values_from = n, values_fill = 0) %>% 
  ungroup()

# A tibble: 3 × 3
  Job         Male Female
  <chr>      <int>  <int>
1 CEO            1      0
2 Manager        2      1
3 Supervisor     0      1

Or more simply, a single table function.

table(df$Job, df$Gender)

             Female Male
  CEO             0    1
  Manager         1    2
  Supervisor      1    0

answered Aug 11 '22 at 09:34

benson23

16,369
9
19
38

Thank you~ It goes successful by using the second method. The first method keep showing **could not find function "pivot_wider"** even I included library **tidyverse** and **tidyr** – JPaul Aug 11 '22 at 09:53
@JPaul If you are sure you've installed `tidyr`, you can try `tidyr::pivot_wider()` – benson23 Aug 11 '22 at 09:54
@JPaul note that the second method returns a `table` object, not a dataframe. If you want a dataframe, go on with `as.data.frame.matrix(table(...))`. – Darren Tsai Aug 11 '22 at 09:58

Quinten · Answer 3 · 2022-08-11T09:47:07.983

0

Another option using group_by with count and spread like this:

library(dplyr)
library(tidyr)
df %>%
  group_by(Job, Gender) %>%
  count() %>%
  spread(Gender, n, fill = 0)
#> # A tibble: 3 × 3
#>   Job        Female  Male
#>   <chr>       <dbl> <dbl>
#> 1 CEO             0     1
#> 2 Manager         1     2
#> 3 Supervisor      1     0

^{Created on 2022-08-11 by the reprex package (v2.0.1)}

edited Aug 11 '22 at 09:47

answered Aug 11 '22 at 09:40

Quinten

35,235
5
20
53

score 0 · Answer 4 · answered Aug 11 '22 at 09:52

0

One possible way to solve your problem:

xtabs(~ Job + Gender, data=df)

            Gender
Job          Female Male
  CEO             0    1
  Manager         1    2
  Supervisor      1    0

answered Aug 11 '22 at 09:52

B. Christian Kamgang

5,098
5
10

How to count the data and mutate a new column for it in R

4 Answers4