Count number of filtered values and create new columns

Question

I have a data frame called 'data' that contains multiple columns:

Grade	EMPID	PayBand
A	12345	15001-20000
c	64859	30001-35000
A	61245	20001-25000
D	75134	45001-50000
D	78451	40001-45000
C	31645	30001-35000
A	62513	20001-25000
D	91843	25001-30000
D	91648	35001-40000

I need R code to create a data frame that counts the number of each Grade within each PayBand that looks like this. E.g:

PayBand	A	C	D
15001-20000	1	0	0
20001-25000	2	0	0
25001-30000	0	0	1
30001-35000	0	2	0
35001-40000	0	0	1
40001-45000	0	0	1
45001-50000	0	0	1

I am unsure how to create the new dataframe and the new columns that are based off the first dataframe. Any help is much appreciated.

See this similar question from yesterday: [How to reshape long format data and count the values](https://stackoverflow.com/questions/74616896/how-to-reshape-long-format-data-and-count-the-values) — zephryl, Nov 30 '22 at 12:15

score 1 · Accepted Answer · answered Nov 30 '22 at 12:14

With table:

table(df[c(3, 1)])

Or with pivot_wider:

tidyr::pivot_wider(df, -EMPID, 
                   names_from = "Grade", values_from = "Grade", 
                   values_fn = length, values_fill = 0)

#   PayBand         A     C     D
# 1 15001-20000     1     0     0
# 2 30001-35000     0     2     0
# 3 20001-25000     2     0     0
# 4 45001-50000     0     0     1
# 5 40001-45000     0     0     1
# 6 25001-30000     0     0     1
# 7 35001-40000     0     0     1

Thanks so much. I have decided to use your first option but then need to convert it to a dataframe. Much appreciated. — atm1984, Nov 30 '22 at 13:15

score 1 · Answer 2 · answered Nov 30 '22 at 12:25

1

Good simple answer from Maël. Here are some related alternatives, including some options for conversion of the result to a data frame. atm1984, please make it easier next time with a reproducible example.

> df <- data.frame(Grade = c('A', 'C', 'A', 'D'),
+                  PayBand = c('15001-20000', '30001-35000', '20001-25000', '450
01-50000'))

> table(df$PayBand, df$Grade)

              A C D
  15001-20000 1 0 0
  20001-25000 1 0 0
  30001-35000 0 1 0
  45001-50000 0 0 1

> as.data.frame(table(df$PayBand, df$Grade))
          Var1 Var2 Freq
1  15001-20000    A    1
2  20001-25000    A    1
3  30001-35000    A    0
4  45001-50000    A    0
5  15001-20000    C    0
6  20001-25000    C    0
7  30001-35000    C    1
8  45001-50000    C    0
9  15001-20000    D    0
10 20001-25000    D    0
11 30001-35000    D    0
12 45001-50000    D    1

> as.data.frame.matrix(table(df$PayBand, df$Grade))
            A C D
15001-20000 1 0 0
20001-25000 1 0 0
30001-35000 0 1 0
45001-50000 0 0 1

answered Nov 30 '22 at 12:25

sashahafner

435
1
7

Hi, thank you for your options for converting to a dataframe, much appreciated. I'm fairly new to stackoverflow - what do you suggest I do next time to give a reproducible example? Thanks – atm1984 Nov 30 '22 at 13:17
You are welcome. Regarding a reproducible example, it was really just the data object that was missing from your question. If you already have a data frame like the one shown in your question you can use `dput()` to get something you can share with copy/paste. See here for some details and more information: – sashahafner Nov 30 '22 at 14:24
Regarding your original question, you might also check out the data.table package. – sashahafner Nov 30 '22 at 14:26
Great, I will try to use that dput() next time I ask a question. Thanks – atm1984 Nov 30 '22 at 14:29

Count number of filtered values and create new columns

2 Answers2