I want to summarize data, create dynamically named columns, and store each result in a different data frame:

data is something like:

col1  col2  col3
A     1      200
B     1      300
A     2      400

k=c("A","B","C")
for(i in k)
  {
    group_data <- group_by(data[data$col1==i,], col2)
    summary_i<- summarize(group_data ,paste("var",k[i],sep="_") = n())
   }

Expected output:

Three data frames named summary_A, summary_B and summary_C, containing the variables var_A, var_B and var_C respectively.

LyzandeR
  • please provide `data` and exact expected result – HubertL Jul 07 '17 at 19:33
  • It's likely that you don't *really* want this because it makes things very difficult to work with in R. Having a bunch of different, similarly named variables lying around isn't fun. Generally you are better off working with related collections in lists. Better ideas here: https://stackoverflow.com/questions/17499013/how-do-i-make-a-list-of-data-frames – MrFlick Jul 07 '17 at 19:34
  • Can't we use assign or paste command inside summarize? –  Jul 07 '17 at 19:52
  • Not on the left side of an equals sign. That's just not how R works. The character value `"col1"` is very different from the symbol `col1`. Those are not interchangeable. – MrFlick Jul 07 '17 at 21:41
  • Thanks for clarifying. –  Jul 07 '17 at 21:56
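
To illustrate the point made in the comments: `paste()` cannot appear on the left of `=` inside `summarize()`, but a string can be used as a name once the summary exists. The following is only a minimal sketch, assuming dplyr is installed; the `data` frame is a reconstruction of the sample in the question, and the placeholder column name `count` is introduced here purely for illustration.

```r
library(dplyr)

# Sample data reconstructed from the question (an assumption about its types)
data <- data.frame(
  col1 = c("A", "B", "A"),
  col2 = c(1L, 1L, 2L),
  col3 = c(200L, 300L, 400L),
  stringsAsFactors = FALSE
)

key <- "A"
group_data <- group_by(data[data$col1 == key, ], col2)

# Summarize under a fixed placeholder name first ...
result <- summarize(group_data, count = n())

# ... then rename the column, which does accept a character string
names(result)[names(result) == "count"] <- paste("var", key, sep = "_")

# assign() also takes the variable name as a string
assign(paste("summary", key, sep = "_"), result)

print(summary_A)
```

This creates a variable `summary_A` with a column `var_A`, but the caveat about many similarly named variables still applies.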

1 Answer

As correctly pointed out by @MrFlick, there are better ways to manage your problem.
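Anyway, here is a working version of your code (note that it names the results by index, summary_1 to summary_3 with columns var_1 to var_3, rather than by the values in k):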

data <- structure(list(col1 = structure(c(1L, 2L, 1L), .Label = c("A", 
"B"), class = "factor"), col2 = c(1L, 1L, 2L), col3 = c(200L, 
300L, 400L)), .Names = c("col1", "col2", "col3"), class = "data.frame", row.names = c(NA, 
-3L))

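library(dplyr)

k <- c("A", "B", "C")
for (i in seq_along(k)) {
  # Keep only the rows whose col1 matches the i-th key, grouped by col2
  group_data <- group_by(data[data$col1 == k[i], ], col2)
  vark <- paste("var", i, sep = "_")
  # Build the whole assignment as text so both names can be generated dynamically
  eval(parse(text = paste("summary_", i, " <- summarize(group_data, ", vark, " = n())", sep = "")))
}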

print(summary_1)
# A tibble: 2 x 2
#    col2 var_1
#   <int> <int>
# 1     1     1
# 2     2     1

print(summary_2)
# A tibble: 1 x 2
#    col2 var_2
#   <int> <int>
# 1     1     1

print(summary_3)
# A tibble: 0 x 2
# ... with 2 variables: col2 <int>, var_3 <int>
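
For comparison, the list-based approach suggested in the comments might look like the sketch below. It is not part of the original answer and makes some assumptions: it relies on the `!!name := value` syntax, which needs dplyr 0.7.0 or later, and the names `summaries`, `key` and `var_name` are hypothetical. It names the results by the values in `k` instead of by index.

```r
library(dplyr)

# Sample data reconstructed from the question (an assumption about its types)
data <- data.frame(
  col1 = c("A", "B", "A"),
  col2 = c(1L, 1L, 2L),
  col3 = c(200L, 300L, 400L),
  stringsAsFactors = FALSE
)

k <- c("A", "B", "C")

# One summary per key, collected in a list instead of separate variables
summaries <- lapply(k, function(key) {
  var_name <- paste("var", key, sep = "_")
  data %>%
    filter(col1 == key) %>%
    group_by(col2) %>%
    # !! unquotes the string so it can serve as the column name on the left of :=
    summarize(!!var_name := n())
})
names(summaries) <- paste("summary", k, sep = "_")

print(summaries$summary_A)
```

Each element is then available as `summaries$summary_A`, `summaries$summary_B` and so on, and the whole collection can be processed with `lapply()` without `eval(parse())`.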
Marco Sandri