I have data like the following:

#    am     qsec        vs am     gear     carb
# 1:  1 17.36000 0.5384615  1 4.384615 2.923077
# 2:  1 17.02000 1.0000000  1 4.000000 2.000000
# 3:  0 18.18316 0.3684211  0 3.210526 2.736842
# 4:  0 17.82000 0.0000000  0 3.000000 3.000000

and I would like to produce

 #    variable          0          1
 # 1:     qsec 18.1831579 17.3600000
 # 2:     qsec 17.8200000 17.0200000
 # 3:       vs  0.3684211  0.5384615
 # 4:       vs  0.0000000  1.0000000
 # 5:       am  0.0000000  1.0000000
 # <snip>

where the am groups in the input data are used as columns in the output data.

I can do this in multiple steps (shown below under "data out"), but I would like a more idiomatic data.table approach. How can I reshape this data with data.table to produce the expected output?

My attempt, with data to reproduce:

library(data.table)
data = setDT(mtcars[7:11])

# data in
tdat = data[, lapply(.SD, function(y) {
    unlist(lapply(c(mean, median), function(f) f(y)))
  }),
  by = "am", .SDcols = seq_along(data)
]


# data out  
m = melt(tdat, id.vars="am")
m[, r:=duplicated(interaction(am, variable))+0L]      
dcast(m, variable + r ~ am, value.var = "value")[, r:=NULL][]

I asked a similar question, but applying the solution Akrun gave in the comments returns:

dcast( melt(tdat, id.var=1), variable~am, value.var='value')
#Aggregate function missing, defaulting to 'length'
#   variable 0 1
#1:     qsec 2 2
#2:       vs 2 2
#3:       am 2 2
#4:     gear 2 2
#5:     carb 2 2

1 Answer

This can be solved using data.table's rowid() function:

library(data.table)
m <- melt(tdat, id.vars="am")
dcast(m, variable + rowid(am) ~ am)[, am := NULL][]
    variable          0          1
 1:     qsec 18.1831600 17.3600000
 2:     qsec 17.8200000 17.0200000
 3:       vs  0.3684211  0.5384615
 4:       vs  0.0000000  1.0000000
 5:       am  0.0000000  1.0000000
 6:       am  0.0000000  1.0000000
 7:     gear  3.2105260  4.3846150
 8:     gear  3.0000000  4.0000000
 9:     carb  2.7368420  2.9230770
10:     carb  3.0000000  2.0000000
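
To see why the extra key makes this work, here is a minimal sketch on a toy long-format table. The values are rounded from the question's output, and both the helper column `r` and the use of `rowid(am, variable)` instead of `rowid(am)` are illustrative choices, not part of the code above. `rowid()` numbers the rows within each group, so every `variable`/`am` cell gets a unique identifier and `dcast()` has nothing to aggregate:

```r
library(data.table)

# Toy long-format table (assumed, rounded values): two rows per
# (am, variable) pair, mirroring the mean/median pairs in the question.
m <- data.table(
  am       = c(1, 1, 0, 0, 1, 1, 0, 0),
  variable = rep(c("qsec", "vs"), each = 4),
  value    = c(17.36, 17.02, 18.18, 17.82, 0.54, 1, 0.37, 0)
)

# rowid() counts occurrences within each combination of its arguments.
# `r` is a hypothetical helper column name.
m[, r := rowid(am, variable)]
m$r
# 1 2 1 2 1 2 1 2

# With `r` on the left-hand side, each output cell holds exactly one
# value, so dcast() does not fall back to length().
wide <- dcast(m, variable + r ~ am, value.var = "value")
wide[, r := NULL][]
```

Without `r` in the formula, each `variable`/`am` cell would hold two values, which is what triggers the "Aggregate function missing" message shown in the question.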

Data

library(data.table)
tdat <- fread(
"# i    am     qsec        vs am     gear     carb
# 1:  1 17.36000 0.5384615  1 4.384615 2.923077
# 2:  1 17.02000 1.0000000  1 4.000000 2.000000
# 3:  0 18.18316 0.3684211  0 3.210526 2.736842
# 4:  0 17.82000 0.0000000  0 3.000000 3.000000", 
  drop = 1:2, colClasses = list(integer = c(3, 6))
)

Alternatively, the sample dataset can be produced in a more concise way without doubling the am column:

setDT(mtcars[7:11])[, lapply(.SD, function(y) c(mean(y), median(y))), by = am]
   am     qsec        vs     gear     carb
1:  1 17.36000 0.5384615 4.384615 2.923077
2:  1 17.02000 1.0000000 4.000000 2.000000
3:  0 18.18316 0.3684211 3.210526 2.736842
4:  0 17.82000 0.0000000 3.000000 3.000000