How to count occurrences in a cross table in R?

Question

How can I create a cross table in R (RStudio) that counts occurrences?

I have this sample input:

Technology <- c("A", "A", "B", "C", "C", "C")
Development <- c(1, 0, 1, 1, 1, 1)
Production <- c(1, 1, 0, 0, 0, 1)
Sales <- c(0, 0, 1, 1, 0, 1)
DF <- data.frame(Technology, Development, Production, Sales)

I want to know which technology is used most often in each domain (Development, Production, Sales).

The result should look like the one in the picture.

Rui Barradas · Accepted Answer · 2022-04-18T07:55:20.757

Problems like this are often data format problems, and the solution is to reshape from wide to long format first; see this question.
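To make "long format" concrete, here is a minimal base R sketch that uses stack() rather than reshape(). This is a swapped-in alternative, not the approach of the answer itself, and the names long and tab are only illustrative. It yields one row per technology/domain pair, which xtabs can then sum.

```r
Technology <- c("A", "A", "B", "C", "C", "C")
Development <- c(1, 0, 1, 1, 1, 1)
Production <- c(1, 1, 0, 0, 0, 1)
Sales <- c(0, 0, 1, 1, 0, 1)
DF <- data.frame(Technology, Development, Production, Sales)

# stack() puts all indicator columns into one `values` column and
# records the source column name in the factor `ind`.
# Technology (length 6) is recycled across the 3 stacked columns.
long <- data.frame(Technology = DF$Technology, stack(DF[-1]))

nrow(long)  # 18 rows: 6 observations x 3 domains

# Summing the 0/1 values per domain and technology gives the counts
tab <- xtabs(values ~ ind + Technology, data = long)
tab
```

Because the indicators are 0/1, summing them is the same as counting the rows where the value is 1.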

Here is a base R solution with reshape and cross tabulation with xtabs.

Technology <- c("A", "A", "B", "C", "C", "C")
Development <- c(1, 0, 1, 1, 1, 1)
Production <- c(1, 1, 0, 0, 0, 1)
Sales <- c(0, 0, 1, 1, 0, 1)
DF <- data.frame(Technology, Development, Production, Sales)

reshape(
  DF,
  direction = "long",
  varying = list(names(DF[-1])),
  v.names = "Active",
  times = names(DF[-1]),
  timevar = "Phase"
) |>
  (\(x) xtabs(Active ~ Phase + Technology, x))()
#>              Technology
#> Phase         A B C
#>   Development 1 1 3
#>   Production  2 0 1
#>   Sales       0 1 2

Created on 2022-04-18 by the reprex package (v2.0.1)

And a tidyverse solution.

suppressPackageStartupMessages({
  library(magrittr)
  library(tidyr)
})

DF %>%
  pivot_longer(-Technology) %>%
  xtabs(value ~ name + Technology, .)
#>              Technology
#> name          A B C
#>   Development 1 1 3
#>   Production  2 0 1
#>   Sales       0 1 2

Created on 2022-04-18 by the reprex package (v2.0.1)
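The question also asks which technology is used most often in each domain. A possible way to read that off the counts is sketched below, assuming a tie should resolve to the first technology in alphabetical order. The table is rebuilt with rowsum() to keep the snippet self-contained, and the names tab and top are illustrative.

```r
Technology <- c("A", "A", "B", "C", "C", "C")
Development <- c(1, 0, 1, 1, 1, 1)
Production <- c(1, 1, 0, 0, 0, 1)
Sales <- c(0, 0, 1, 1, 0, 1)
DF <- data.frame(Technology, Development, Production, Sales)

# Sum the 0/1 indicators per technology, then transpose so that
# domains are rows and technologies are columns.
tab <- t(rowsum(DF[-1], group = DF$Technology))

# For each domain (row), find the column with the largest count.
# which.max() returns the first maximum, so ties go to the earlier column.
top <- colnames(tab)[apply(tab, 1, which.max)]
names(top) <- rownames(tab)
top
#> Development  Production       Sales 
#>         "C"         "A"         "C"
```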

What do you think about the desired output? – TarJae, Apr 18 '22 at 09:02
@TarJae Your output is better. – Rui Barradas, Apr 18 '22 at 10:54

score 1 · Answer 2 · answered Apr 18 '22 at 08:57

Here is a tidyverse approach, to get your desired output:

First we group by Technology and summarise with across to sum each indicator column,
then we prepare the row names with paste and apply column_to_rownames from tibble,
and finally we transpose the result with t().

library(dplyr)
library(tibble)
DF %>% 
  group_by(Technology) %>% 
  summarise(across(c(Development, Production, Sales), sum)) %>% 
  mutate(Technology = paste("Technology", Technology, sep = " ")) %>% 
  column_to_rownames("Technology") %>% 
  t()

            Technology A Technology B Technology C
Development            1            1            3
Production             2            0            1
Sales                  0            1            2
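For comparison, the same three steps (sum per group, label, transpose) can be sketched in base R with aggregate(). This is an illustrative alternative rather than part of the answer above, and the names agg and m are assumptions chosen for the example.

```r
Technology <- c("A", "A", "B", "C", "C", "C")
Development <- c(1, 0, 1, 1, 1, 1)
Production <- c(1, 1, 0, 0, 0, 1)
Sales <- c(0, 0, 1, 1, 0, 1)
DF <- data.frame(Technology, Development, Production, Sales)

# Step 1: sum every indicator column within each Technology group
agg <- aggregate(. ~ Technology, data = DF, FUN = sum)

# Step 2: drop the grouping column and transpose, so domains become rows
m <- t(as.matrix(agg[-1]))

# Step 3: label the columns the same way the paste() step does
colnames(m) <- paste("Technology", agg$Technology)
m
```

The result is a plain numeric matrix, so it can be indexed by name, for example m["Production", "Technology A"].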

That's a great solution. Reminds me of a SQL query :) This makes it intuitive for me as well. — Mec-Eng, Apr 18 '22 at 17:11

score 0 · Answer 3 · answered Jun 03 '22 at 14:42

Since you asked for a crosstable, you can also use the package crosstable for that:

library(crosstable)
crosstable(DF, by=Technology) %>%
  as_flextable()

However, in your case you don't care about proportions and only need the counts where each variable is 1, so you might want to run this instead:

library(dplyr)
crosstable(DF, by=Technology, percent_pattern="{n}") %>% 
    filter(variable==1) %>% select(-variable) %>% 
    as_flextable()

More info about the package at https://danchaltiel.github.io/crosstable/.
