
Please read the whole question, and keep in mind that dplyr functions use non-standard evaluation (NSE). That makes it somewhat problematic to use spread or dcast alongside dplyr commands inside a function. My question is not about converting long to wide format in the ordinary case, but about doing so inside a function that contains dplyr commands.

Suppose the following data frame df:

set.seed(1357)
df <- data.frame(
     "id" = 1:100,
     "grp" = sample(c("A","B","C","D"), 100, replace=TRUE, prob=c(0.25, 0.25, 0.30, 0.20)),
     "var1" = sample(c("X","Y"), 100, replace=TRUE, prob=c(0.45, 0.55)),
     "var2" = sample(c("X","Y"), 100, replace=TRUE, prob=c(0.15, 0.85)),
     "var3" = sample(c("X","Y"), 100, replace=TRUE, prob=c(0.50, 0.50)),
     "var4" = sample(c("X","Y"), 100, replace=TRUE, prob=c(0.40, 0.60)),
     "var5" = sample(c("X","Y"), 100, replace=TRUE, prob=c(0.10, 0.90)),
     "var6" = abs(rnorm(100)))

I want to produce a frequency table of grp against each of the other variables var1 to var5 via lapply. For this purpose I wrote a function named fnct.freq.

library(dplyr)
library(tidyr)
library(reshape2)

fnct.freq <- function(x){
     x <- enquo(x)
     freqtab <- df %>% 
          select(grp, !!x) %>%
          group_by(grp, !!x) %>%
          summarise(n = n()) %>% 
          mutate(freq = n / sum(n))
     freqtab$nfreq <- paste0(freqtab$n, " (", round(freqtab$freq,2), ")")
     freqtab <- freqtab[,-c(3,4)]
     # ???? here I want to convert from long to wide format
     # ???? with some code like this:
     # dcast(freqtab, noquote(x) ~ grp, value.var="nfreq")  # does not work
     return(freqtab)
}
fnct.freq(var2)

The result it produces is:

> fnct.freq(var2)
# A tibble: 8 x 3
# Groups:   grp [4]
  grp   var2  nfreq   
  <fct> <fct> <chr>   
1 A     X     1(0.06) 
2 A     Y     16(0.94)
3 B     X     2(0.07) 
4 B     Y     26(0.93)
5 C     X     3(0.08) 
6 C     Y     33(0.92)
7 D     X     4(0.21) 
8 D     Y     15(0.79)

As you can see, the result is a frequency table in long format, but I want it in wide format, using a package such as reshape2 or tidyr. The result should be:

  grp   X_nfreq    Y_nfreq
1   A  1 (0.06)  16 (0.94)
2   B  2 (0.07)  26 (0.93)
3   C  3 (0.08)  33 (0.92)
4   D  4 (0.21)  15 (0.79)

Later I want to apply the function to the other variables of the data frame:

lapply(c("var1","var2","var3","var4","var5"), FUN=fnct.freq)
Fateta
  • If your resulting (long-format) data frame is called `dd` then simply spread, i.e. `spread(dd, var2, nfreq)` – Sotos Sep 12 '19 at 13:44
  • `dcast(fnct.freq(var2), grp ~ var2, value.var="nfreq")` – maydin Sep 12 '19 at 13:46
  • So put it inside the function. Instead of `return(freqtab)` do `return(spread(freqtab, var2, nfreq))`... – Sotos Sep 13 '19 at 06:58
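
The comments above hard-code `var2` in the reshaping call, which does not carry over to the `lapply` call over column names given as strings. One possible sketch that does is shown here, not a definitive answer. It assumes tidyr 1.0.0 or later for `pivot_wider` (in older versions `spread(key, nfreq)` should fill the same role). The function name `fnct.freq.wide`, its arguments `data` and `x`, and the helper column `key` are hypothetical names introduced only for this illustration. The column name is passed as a string and turned into a symbol with `sym()`, so the function no longer relies on `enquo()`.

```r
library(dplyr)
library(tidyr)

# Hypothetical variant of fnct.freq: `data` is the data frame and
# `x` is the name of the column to tabulate, given as a string.
fnct.freq.wide <- function(data, x) {
  x_sym <- sym(x)  # convert the string into a symbol for unquoting with !!
  data %>%
    count(grp, !!x_sym) %>%      # one row per grp/x combination, count in n
    group_by(grp) %>%            # so that sum(n) is computed within each grp
    mutate(
      nfreq = paste0(n, " (", round(n / sum(n), 2), ")"),
      key   = paste0(!!x_sym, "_nfreq")  # future column names, e.g. "X_nfreq"
    ) %>%
    ungroup() %>%
    select(grp, key, nfreq) %>%
    pivot_wider(names_from = key, values_from = nfreq)
}
```

With the `df` from the question, this could be called as `fnct.freq.wide(df, "var2")`, or over all variables as `lapply(c("var1","var2","var3","var4","var5"), function(v) fnct.freq.wide(df, v))`. A combination of `grp` and level that never occurs in the data ends up as `NA` in the wide table.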
