One large frequency table for multiple variables

Question

I have a dataframe with 4 nominal variables with 3 levels each (A, B, C). I want to make a frequency table of height 4 and width 3, where each row contains the count of the levels for that variable.

df <- data.frame(var1=c('B', 'A', 'C', 'A', 'B', 'B', 'C'),
                 var2=c('A', 'A', 'A', 'A', 'B', 'B', 'C'),
                 var3=c('A', 'A', 'A', 'A', 'B', 'B', 'C'),
                 var4=c('A', 'A', 'A', 'A', 'B', 'B', 'B'))

head(df,10)
    var1 var2 var3 var4
1    B    A    A    A
2    A    A    A    A
3    C    A    A    A
4    A    A    A    A
5    B    B    B    B
6    B    B    B    B
7    C    C    C    B

The result should be something like this:

      A B C
var1  2 3 2
var2  4 2 1
var3  4 2 1 
var4  4 3 0

Is there an easy way to do this?

score 7 · Answer 1 · answered Mar 17 '23 at 13:58

7

With stack:

t(table(stack(df)))

      values
ind    A B C
  var1 2 3 2
  var2 4 2 1
  var3 4 2 1
  var4 4 3 0

answered Mar 17 '23 at 13:58

Maël

45,206
3
29
67

score 5 · Accepted Answer · answered Mar 17 '23 at 13:56

5

With some help from this answer, we can combine outputs from the table command, transpose it, and make it as a data frame with row.names coming from the column names of df.

as.data.frame.matrix(t(table(unlist(df), row.names(df)[col(df)])), 
                     row.names = colnames(df))

     A B C
var1 2 3 2
var2 4 2 1
var3 4 2 1
var4 4 3 0

answered Mar 17 '23 at 13:56

benson23

16,369
9
19
38

1

Wow, that worked! I would never have come up with that myself... thanks a lot! – Erjen Mar 17 '23 at 14:07

score 1 · Answer 3 · answered Mar 17 '23 at 14:00

Using the tidyverse package, which is great at data manipulations like these:

library(tidyverse)

df %>% 
  pivot_longer(cols = starts_with("var"),
               names_to = "col",
               values_to = "val") %>% 
  mutate(count = 1) %>% 
  group_by(col) %>% 
  summarise(A = sum(count[val == "A"]),
            B = sum(count[val == "B"]),
            C = sum(count[val == "C"]))

Maël · Answer 4 · 2023-03-17T14:12:09.427

1

In tidyr:

library(tidyr)
df %>%
  pivot_longer(everything()) %>% 
  pivot_wider(names_from = "value", values_from = "value", 
              values_fn = length, values_fill = 0, names_sort = TRUE)

  name A B C
1 var1 2 3 2
2 var2 4 2 1
3 var3 4 2 1
4 var4 4 3 0

edited Mar 17 '23 at 14:12

answered Mar 17 '23 at 14:01

Maël

45,206
3
29
67

score 1 · Answer 5 · answered Mar 17 '23 at 16:11

1

Using tabyl

library(dplyr)
library(janitor)
library(tidyr)
pivot_longer(df, everything()) %>% 
  tabyl(name, value)

-output

 name A B C
 var1 2 3 2
 var2 4 2 1
 var3 4 2 1
 var4 4 3 0

answered Mar 17 '23 at 16:11

akrun

874,273
37
540
662

score 1 · Answer 6 · answered Mar 18 '23 at 08:12

1

We could use xtabs with stack:

xtabs(~ind+values, data=stack(df))

      values
ind    A B C
  var1 2 3 2
  var2 4 2 1
  var3 4 2 1
  var4 4 3 0

answered Mar 18 '23 at 08:12

TarJae

72,363
6
19
66

One large frequency table for multiple variables

6 Answers6