Formatting colnames to be read by cbind

Question

Say I have a data frame called df such that colnames(df) yields:

"A"           "B"          "C"                 "D"           "E"                 "F"

I would like to aggregate the data in the following way:

aggregate(cbind(`C`,`D`,`E`,`F`)~A+B, data = df, FUN = sum)

Of course I could do it "manually", but my real data has a very large number of columns, so I am trying to change the colnames(df)[3:6] output to yield:

`C`,`D`,`E`,`F`

instead. So far I have tried to use toString(colnames(df)[3:6]) which yields:

"C, D, E, F"

But this is not read properly by cbind.

Any suggestions?

Try `aggregate(cbind(C,D,E,F)~A+B, data = df, FUN = sum)` – ThomasIsCoding, Feb 23 '21 at 17:14
Or `aggregate(. ~ A + B, data = df, FUN = sum)` – akrun, Feb 23 '21 at 17:16

jay.sf · Accepted Answer · 2021-02-23T17:36:39.023


Instead of the cbind you could also use a matrix created from the subsetted data frame.

aggregate(as.matrix(df[names(df)[3:6]])~A+B, data=df, FUN=sum)
#       A     B     C     D     E     F
# 1  0.36 -0.11  2.02  2.29 -0.13 -2.66
# 2 -0.56  0.40 -0.09  1.30 -0.28 -0.28
# 3  1.37  0.63  1.51 -0.06 -1.39  0.64

Or, to answer your question literally, try:

(ev <- sprintf("cbind(%s)", toString(names(df)[3:6])))
# [1] "cbind(C, D, E, F)"

I don't think the backticks are needed. Are they?

And then, of course:

aggregate(eval(parse(text=ev))~A+B, data=df, FUN=sum)
#       A     B     C     D    E     F
# 1 -2.44 -1.78  1.90 -1.76 0.46 -0.61
# 2  1.32 -0.17 -0.43  0.46 0.70  0.50
# 3 -0.31  1.21 -0.26 -0.64 1.04 -1.72
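
On the backticks: they only matter when a column name is not a syntactically valid R name (for example, one containing a space). Below is a minimal sketch of building the backtick-quoted string the question literally asks for. The data frame `d`, its column names and its values are made up for illustration.

```r
# Toy data (hypothetical values) with non-syntactic column names.
d <- data.frame(A = c("x", "x", "y"), B = c("u", "u", "v"),
                `TOTAL DOLS` = c(1, 2, 3), `TOTAL UNITS` = c(10, 20, 30),
                check.names = FALSE)

cols <- names(d)[3:4]

# Wrap each name in backticks, then join the pieces with commas.
ev_bt <- sprintf("cbind(%s)", paste0("`", cols, "`", collapse = ", "))
ev_bt
# [1] "cbind(`TOTAL DOLS`, `TOTAL UNITS`)"

# The string parses to a cbind() call that can be evaluated against the data.
m <- eval(parse(text = ev_bt), d)
colnames(m)
# [1] "TOTAL DOLS"  "TOTAL UNITS"
```

Without the backticks, `parse()` would fail on a name such as TOTAL DOLS because of the space. The quoted string can be used in place of `ev` in the `eval(parse(text = ...))` call above.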

Data:

df <- structure(list(A = c(-2.44, 1.32, -0.31), B = c(-1.78, -0.17, 
1.21), C = c(1.9, -0.43, -0.26), D = c(-1.76, 0.46, -0.64), E = c(0.46, 
0.7, 1.04), F = c(-0.61, 0.5, -1.72)), class = "data.frame", row.names = c(NA, 
-3L))

edited Feb 23 '21 at 17:36

answered Feb 23 '21 at 17:17


Thanks a lot for the answer! It seems like the backticks are indeed needed because they are imported from the csv as part of the column name. Would there be an easy way to get them back, or just remove them from the colnames? – Weierstraß Ramirez Feb 23 '21 at 17:46
@WeierstraßRamirez Yeah, I'd recommend getting rid of them using `names(df) <- gsub("\`", "", names(df))`. – jay.sf Feb 23 '21 at 17:55
@WeierstraßRamirez Actually, I'm curious what your data looks like. Could you create a small example as I did? Just use `dput` as described [here under "Copying original data"](https://stackoverflow.com/a/5963610/6574038), which is the usual way to share data on Stack Overflow. – jay.sf Feb 23 '21 at 18:01
So actually, I think the problem is that R adds the backticks since I have compound colnames in my data, this is from the actual colnames(df): [1] "CATEGORY" "PARENT" "BRAND" "MEDIA" "PROPERTY" "TOTAL DOLS" "TOTAL UNITS" [8] "2011 DOLS" "2011 UNITS" "2012 DOLS" "2012 UNITS" "2013 DOLS" "2013 UNITS" "2014 DOLS" [15] "2014 UNITS" – Weierstraß Ramirez Feb 23 '21 at 21:39
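
With column names containing spaces like these, one way to sidestep quoting altogether is the data-frame method of `aggregate()`, which takes the value columns and the grouping columns as data frames selected by name instead of a formula. This is a minimal sketch: treating CATEGORY and PARENT as the grouping columns is an assumption (the thread does not say), and the values are made up.

```r
# Toy data (hypothetical values) using a few of the column names quoted above.
d <- data.frame(CATEGORY = c("a", "a", "b"),
                PARENT = c("p", "p", "q"),
                `TOTAL DOLS` = c(1, 2, 3),
                `TOTAL UNITS` = c(10, 20, 30),
                check.names = FALSE)

group_cols <- c("CATEGORY", "PARENT")          # assumed grouping columns
value_cols <- setdiff(names(d), group_cols)    # everything else gets summed

# Columns are picked by name as plain strings, so spaces need no backticks.
res <- aggregate(d[value_cols], by = d[group_cols], FUN = sum)
res
#   CATEGORY PARENT TOTAL DOLS TOTAL UNITS
# 1        a      p          3          30
# 2        b      q          3          30
```

Because nothing is parsed as R code here, the approach scales to any number of columns without building a `cbind(...)` string.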
