Operation on multiple(70) columns by another column in R

Question

For the following data, I want to replace each column with its values divided by the corresponding len (i.e. A/len, B/len, C/len, ...).

The ... stands for more columns, up to 70. Since there are so many columns, how should I proceed?

 A    B    C     D    E     F   ...   len

 2    4    5     7    8     8          5
 5    8    3     1    0     4          6
 8    9    3     9    6     2          12
 2    6    2     6    7     8          10
 1    2    4     2    9     5          20

possible duplicate of [How to apply same function to every specified column in a data.table](http://stackoverflow.com/questions/16846380/how-to-apply-same-function-to-every-specified-column-in-a-data-table) Seems pretty much the same to me...and I prefer the `set` answer, so... — Frank, Apr 06 '15 at 20:37

Rich Scriven · Accepted Answer · 2015-04-06T19:48:22.477

7

If your data frame df is exactly as you show, you can simply do

df[-ncol(df)] / df$len
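
To see why this works: dividing a data frame by a vector whose length equals the number of rows applies the division down each column, so every row is divided by its own len. A minimal sketch on a cut-down version of the question's data (only columns A and B are kept here for brevity):

```r
# Cut-down copy of the question's data: two value columns plus len
df <- data.frame(
  A   = c(2, 5, 8, 2, 1),
  B   = c(4, 8, 9, 6, 2),
  len = c(5, 6, 12, 10, 20)
)

# Drop the last column (len), then divide every remaining column by len
res <- df[-ncol(df)] / df$len

# Each column of the result equals that column divided element-wise by len
all.equal(res$A, df$A / df$len)
all.equal(res$B, df$B / df$len)
```

Note that this relies on len being the last column; the result no longer contains len itself.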

If you have other columns to exclude, and you want them all included in the result, you can do something like

with(df, cbind(ID, df[!names(df) %in% c("ID", "len")]/len, len))
#   ID         A        B    C         D    E         F len
# 1  1 0.4000000 0.800000 1.00 1.4000000 1.60 1.6000000   5
# 2  2 0.8333333 1.333333 0.50 0.1666667 0.00 0.6666667   6
# 3  3 0.6666667 0.750000 0.25 0.7500000 0.50 0.1666667  12
# 4  4 0.2000000 0.600000 0.20 0.6000000 0.70 0.8000000  10
# 5  5 0.0500000 0.100000 0.20 0.1000000 0.45 0.2500000  20
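
If the columns to leave alone can be identified by name, a variation on the same idea is to assign the result back into the selected columns, which keeps the original column order without `cbind`. A sketch, assuming those columns are called `ID` and `len` as in the example data (`cols` is just an illustrative name):

```r
# Example data from the answer
df <- read.table(text = "ID A B C D E F len
1 2 4 5 7 8 8 5
2 5 8 3 1 0 4 6
3 8 9 3 9 6 2 12
4 2 6 2 6 7 8 10
5 1 2 4 2 9 5 20", header = TRUE)

# Every column except the ones that should not be divided
cols <- setdiff(names(df), c("ID", "len"))

# Overwrite only those columns; ID and len stay where they are
df[cols] <- df[cols] / df$len
```

With 70 columns the same three lines apply unchanged, since `cols` is computed from the names rather than typed out.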

Also, as suggested by David in the comments, you can use data.table

library(data.table)
x <- c(1L, ncol(df))
setDT(df)[, names(df)[-x] := lapply(.SD, "/", df$len), .SDcols = -x]

which results in

#    ID         A        B    C         D    E         F len
# 1:  1 0.4000000 0.800000 1.00 1.4000000 1.60 1.6000000   5
# 2:  2 0.8333333 1.333333 0.50 0.1666667 0.00 0.6666667   6
# 3:  3 0.6666667 0.750000 0.25 0.7500000 0.50 0.1666667  12
# 4:  4 0.2000000 0.600000 0.20 0.6000000 0.70 0.8000000  10
# 5:  5 0.0500000 0.100000 0.20 0.1000000 0.45 0.2500000  20

where df is

df <- read.table(text = "ID A    B    C     D    E     F   len
1  2    4    5     7    8     8    5
2  5    8    3     1    0     4    6
3  8    9    3     9    6     2   12
4  2    6    2     6    7     8   10
5  1    2    4     2    9     5   20", header = TRUE)

edited Apr 06 '15 at 19:48

answered Apr 06 '15 at 19:08

Rich Scriven


You can also update by reference using `data.table` (though it's a bit cumbersome), because the question was tagged with [tag:data.table]: `setDT(df)[, names(df)[-ncol(df)] := lapply(.SD, function(x) x / df$len), .SDcols = -"len"]` – David Arenburg Apr 06 '15 at 19:11
And what if I have a column "Id" before "A" and don't want the division applied to it? – rach Apr 06 '15 at 19:14
Ohh, that's so simple and nice! – rach Apr 06 '15 at 19:20
No, wait: this just drops the whole "Id" column from my table, whereas I want to keep it. – rach Apr 06 '15 at 19:33
@RachanaBagde Then you can use my middle chunk of code, the one starting with `with` – Rich Scriven Apr 06 '15 at 19:34
Richard, you can modify with `cbind` my proposal too, just like you did with `with`... – David Arenburg Apr 06 '15 at 19:42
Regarding David's suggestion, I was told to try `set` when I made a similar answer: `for (j in 1:(ncol(df)-1)) set(df, j = j, value = df[[j]] / df$len)` http://stackoverflow.com/a/16846530/1191259 – Frank Apr 06 '15 at 20:05
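
A runnable sketch of the `set` loop from Frank's comment, adapted (as an assumption about what is wanted) to pick columns by name so that `ID` and `len` are left untouched. `cols` is an illustrative name, and the example data from the answer is reused:

```r
library(data.table)

# Example data from the answer
df <- read.table(text = "ID A B C D E F len
1 2 4 5 7 8 8 5
2 5 8 3 1 0 4 6
3 8 9 3 9 6 2 12
4 2 6 2 6 7 8 10
5 1 2 4 2 9 5 20", header = TRUE)

# Convert to a data.table by reference
setDT(df)

# Columns to divide: everything except ID and len
cols <- setdiff(names(df), c("ID", "len"))

# Replace each selected column in place, one at a time
for (col in cols) {
  set(df, j = col, value = df[[col]] / df$len)
}
```

Note that `set()` takes the table as its first argument and modifies it by reference, so no reassignment of `df` is needed after the loop.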
