I am trying to turn the following code, which works properly, into a function.

result_check <- data %>% 
  group_by(column1, target)  %>%
  summarise(Unique_Elements = n()) %>%
  dcast(column1 ~ target, value.var="Unique_Elements")

For example, if we take the following dataset:

column1 target
  AA      YES
  BB      NO
  BC      NO
  AA      YES

The code aggregates the dataset by the target variable, like this:

column1    YES   NO
   AA       2    0
   BB       0    1
   BC       0    1  

This is how I construct the function:

aggregate_per_group <- function(column) {
data %>% 
  group_by(column, target)  %>%
  summarise(Unique_Elements = n()) %>%
  dcast(column ~ target, value.var="Unique_Elements")}

But I get: Error: unknown variable to group by : column. I know it's a basic question, but any clues as to why I am losing the argument in group_by?

I have tried the "group_by_" implementation, as well as require("dplyr"), but neither seems to be related to the problem.

Prometheus
    Possible duplicate of [Easy way to convert long to wide format with counts in R](http://stackoverflow.com/questions/34417973/easy-way-to-convert-long-to-wide-format-with-counts-in-r) – Ronak Shah Dec 26 '16 at 10:34
    Thanks for the feedback. Actually, as I explained, my problem is with the construction of the function, not the reshaping per se. So I don't think it's a duplicate of the question you pointed to. – Prometheus Dec 26 '16 at 10:39

1 Answer

We can use table from base R:

table(data)
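As a minimal sketch, the call above can be tried on the example dataset from the question. The data frame below is reconstructed from that example, and the name `counts` is only illustrative. It assumes `data` holds just the two columns `column1` and `target`, since `table()` cross-tabulates every column it is given.

```r
# Example data reconstructed from the question (assumed to have only these two columns)
data <- data.frame(
  column1 = c("AA", "BB", "BC", "AA"),
  target  = c("YES", "NO", "NO", "YES"),
  stringsAsFactors = FALSE
)

# Cross-tabulate column1 against target; the result is a contingency table
counts <- table(data)
print(counts)

# If a regular data frame is preferred, the two-way table can be converted
counts_df <- as.data.frame.matrix(counts)
print(counts_df)
```

Individual cells can then be looked up by name, for example `counts["AA", "YES"]`.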

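If we are interested in a function, then use group_by_ (the standard-evaluation variant of group_by, which accepts column names as strings) along with spread from tidyr. The plain group_by looks for a column literally named column, which is why the original function fails: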

aggregate_per_group <- function(column) {
     data %>% 
        group_by_(column, "target")  %>%
        summarise(Unique_Elements = n()) %>%
        spread(target, Unique_Elements, fill = 0)
 }

library(dplyr)
library(tidyr)
aggregate_per_group("column1")
#  column1    NO   YES
# *   <chr> <dbl> <dbl>
#1      AA     0     2
#2      BB     1     0
#3      BC     1     0

If we need the dcast from reshape2:

library(reshape2)
aggregate_per_group <- function(column) {
    data %>% 
       group_by_(column, "target")  %>%
       summarise(Unique_Elements = n()) %>%
       dcast(data = ., paste(column,  '~ target'), 
              value.var="Unique_Elements", fill = 0)
 }

aggregate_per_group("column1")
#   column1 NO YES
#1      AA  0   2
#2      BB  1   0
#3      BC  1   0
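
The underscore verbs such as group_by_ have been deprecated since dplyr 0.7, so on a current installation the same idea might look like the sketch below. It assumes dplyr 1.0 or later and tidyr 1.1 or later. Passing the data frame as an explicit `df` argument is a choice made here for illustration, not part of the original function.

```r
library(dplyr)
library(tidyr)

# Example data reconstructed from the question
data <- data.frame(
  column1 = c("AA", "BB", "BC", "AA"),
  target  = c("YES", "NO", "NO", "YES"),
  stringsAsFactors = FALSE
)

# `column` is a string naming the grouping column;
# `df` is a hypothetical extra argument so the function does not rely on a global
aggregate_per_group <- function(df, column) {
  df %>%
    # all_of() selects columns by the names stored in a character vector
    group_by(across(all_of(c(column, "target")))) %>%
    summarise(Unique_Elements = n(), .groups = "drop") %>%
    # Spread the target values into columns, filling missing combinations with 0
    pivot_wider(names_from = target,
                values_from = Unique_Elements,
                values_fill = 0)
}

result <- aggregate_per_group(data, "column1")
print(result)
```

Here `across(all_of(...))` plays the role that the string arguments to group_by_ play above, and `pivot_wider` replaces `spread`.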
akrun