R: Apply function on specific columns preserving the rest of the dataframe

Question

I'd like to learn how to apply functions to specific columns of my data frame without dropping the other columns. For example, I'd like to multiply some specific columns by 1000 and leave the others as they are.

Using the sapply function for example like this:

    a<-as.data.frame(sapply(table.xy[,1], function(x){x*1000}))

I get a new data frame with the first column multiplied by 1000, but without the other columns that I didn't use in the operation. So I tried this:

    a<-as.data.frame(sapply(table.xy, function(x) if (colnames=="columnA") {x/1000} else {x}))

but this one didn't work.

My workaround was to give both data frames an extra column with IDs and later merge the old data frame with the newly created one to get a complete one. But I think there must be a better solution. Is there?

score 7 · Accepted Answer · answered Nov 15 '12 at 13:07

If you only want to do a computation on one or a few columns, you can use transform or simply index them manually:

    # With transform:
    df <- data.frame(A = 1:10, B = 1:10)
    df <- transform(df, A = A * 1000)

    # Manually:
    df <- data.frame(A = 1:10, B = 1:10)
    df$A <- df$A * 1000
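
When many columns need the same treatment, typing each name can be avoided by indexing a whole set of columns at once. The following is a minimal sketch, assuming every selected column is numeric; the data frame and its `id` column are made up for illustration and are not from the question.

```r
# Illustrative data: one identifier column plus several numeric columns
df <- data.frame(id = letters[1:5], A = 1:5, B = 6:10, C = 11:15)

# Select the target columns by name (everything except "id");
# a positional range such as 2:4 would work the same way
cols <- setdiff(names(df), "id")

# Arithmetic on a data frame is applied column by column,
# and assigning back into df[cols] leaves the other columns untouched
df[cols] <- df[cols] * 1000
```

Because the result is assigned back into the original data frame, the column order and the untouched columns stay as they were, so no merge step is needed.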

Sacha Epskamp

How do I do this if I have a lot of columns (n=30)? Typing all the names would be too much work... – Joschi Nov 20 '12 at 09:49
Yes, actually I always did my calculations on data frames like this: `a<-as.data.frame(sapply(df[,2:42], function(x){x*1000}))` but then the first column of my data frame df is not in the newly created data frame (a)... so I have to use a workaround and merge the first column of the old data frame into the new one. So this is okay, but I thought there might be an easier way... – Joschi Nov 20 '12 at 10:17
I tried just the last option and worked smoothly. Thanks – Juano Mar 13 '22 at 17:03

score 3 · Answer 2 · answered Aug 29 '16 at 18:12

The following code will apply the desired function to only the columns you specify. I'll create a simple data frame as a reproducible example.

    (df <- data.frame(x = 1, y = 1:10, z = 11:20))
    (df <- cbind(df[1], apply(df[2:3], 2, function(x) {x * 1000})))

Basically, use cbind() to reattach the columns you don't want the function to run on to the result of apply(), which runs the desired function on the target columns.
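
One thing to keep in mind is that apply() converts its input to a matrix, so it suits columns of a single type. As a possible alternative, the sketch below wraps lapply() in a small helper that modifies the chosen columns in place and keeps the column order. The helper name `apply_to_columns` is hypothetical and not part of base R.

```r
# Hypothetical helper (not a base R function): run f on the given
# columns of a data frame and return the whole data frame
apply_to_columns <- function(data, cols, f) {
  # lapply() returns a list with one element per selected column;
  # assigning it back replaces only those columns
  data[cols] <- lapply(data[cols], f)
  data
}

df <- data.frame(x = 1, y = 1:10, z = 11:20)
df2 <- apply_to_columns(df, c("y", "z"), function(v) v * 1000)
```

Here `cols` can be a vector of names or positions, so the target columns do not have to be adjacent or sit at the end of the data frame.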

derelict

Sorry for the comment, but thanks! Was looking for something like this and the ``cbind()`` worked perfectly. – Gainz Jul 24 '19 at 13:57

score 3 · Answer 3 · answered Aug 16 '19 at 19:26

In dplyr we would use mutate_at, in which you can select or exclude (by preceding the variable name with a "-" minus sign) specific variables. You can just name a function:

    library(dplyr)
    df <- df %>% mutate_at(vars(columnA), scale)

or create your own:

    df <- df %>% mutate_at(vars(columnA, columnC), function(x) x * 1000)  # replace x * 1000 with your own computation
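
In dplyr 1.0.0 and later, mutate_at is superseded by across() used inside mutate(). A minimal sketch of the same idea, assuming dplyr 1.0.0 or newer is installed; the example data frame is made up for illustration:

```r
library(dplyr)

# Illustrative data with the column names used above
df <- data.frame(columnA = 1:3, columnB = 4:6, columnC = 7:9)

# across() picks the columns; the lambda is applied to each of them,
# and columnB is passed through unchanged
df <- df %>% mutate(across(c(columnA, columnC), ~ .x * 1000))
```

Selection helpers such as `-columnB` or `where(is.numeric)` can be used in place of the explicit column list.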
