R - Bulk column value replacement

Question

In R I have a data.frame and I'd like to do a bulk update.

My table looks like

Col1  Col2  Col3
A      123   456
A      789   012
B      345   678
B      789   012

I want to scan over the table and replace A with "Apple" and B with "Banana"

In my case, the list of replacements is quite long (~30 items) so I have them both in lists like:

old<-c('A','B')
new<-c('Apple','Banana')

Thanks everyone. I ended up using mapvalues() form plyr which worked a tread. — Exie, Jul 12 '15 at 10:33

score 4 · Accepted Answer · answered Jul 07 '15 at 22:48

4

I like working with named vectors:

   df <- data.frame(Col1=c('A','A','B','B'), 
                    Col2=c(123,789,345,789), 
                    Col3=c(456,012,678,012))
oldv <- c('A','B')
newv <- c('Apple','Banana')
names(newv) <- oldv
df$Col1 <- newv[ df$Col1 ]

yields

  > df
    Col1 Col2 Col3
1  Apple  123  456
2  Apple  789   12
3 Banana  345  678
4 Banana  789   12

answered Jul 07 '15 at 22:48

Mike Wise

22,131
8
81
104

2

Yes, made it to 1800 :) – Mike Wise Jul 07 '15 at 22:49
I _knew_ I was forgetting something like that. It's even generalizable to multidimensional situations. – IRTFM Jul 07 '15 at 22:49
I have used this on like 4 questions in the last few weeks after I picked it up last month in the first chapter of Hadley's Advanced R book... I guess it is not that well known. – Mike Wise Jul 07 '15 at 22:51
2

I'm quite sure I have used it on SO before, and I know that I learned it on rhelp – IRTFM Jul 07 '15 at 22:51
Beware that `df$Col1` is a factor in this example - you need `newv[ as.character(df$Col1) ]` to be safe. E.g., try it with `levels(df$Col1) <- c("B","A")` and see how `newv[df$Col1]` and `newv[as.character(df$Col1)]` give differing results – thelatemail Jul 08 '15 at 00:37

Steven Beaupré · Answer 2 · 2015-07-07T23:07:50.860

You could also use lookup() from qdapTools:

ref <- data.frame(old = c("A", "B"),
                  new = c("Apple", "Banana"))

library(qdapTools)
df$Col1 <- lookup(df, ref)

Alternatively, you could use the %l% operator if you prefer the syntax:

df$Col1 <- df %l% ref

Or using base R:

df$Col1 <- ref$new[match(df$Col1, ref$old)]

Which gives:

#    Col1 Col2 Col3
#1  Apple  123  456
#2  Apple  789   12
#3 Banana  345  678
#4 Banana  789   12

score 1 · Answer 3 · answered Jul 07 '15 at 22:42

This feels a bit hamfisted since I try to avoid for-loops on aesthetic grounds:

dat$Col1 <- as.character(dat$Col1)
for ( i in seq_along(old) ) {dat$Col1 [ dat$Col1 == old[i] ] <- new[i] }

> dat
    Col1 Col2 Col3
1  Apple  123  456
2  Apple  789   12
3 Banana  345  678
4 Banana  789   12

R - Bulk column value replacement

3 Answers3