27

Complicated title but here is a simple example of what I am trying to achieve:

d <- data.frame(v1 = c(1,2,3,4,5,6,7,8), 
                v2 = c("A","E","C","B","B","C","A","E"))

m <- data.frame(v3 = c("D","E","A","C","D","B"), 
                v4 = c("d","e","a","c","d","b"))

Each value in d$v2 should be replaced by the corresponding value in m$v4, found by looking up the d$v2 value in m$v3.

The resulting data frame d should look like:

v1    v4
1      a
2      e
3      c
4      b
5      b
6      c
7      a
8      e

I tried different stuff and the closest I came was: d$v2 <- m$v4[which(m$v3 %in% d$v2)]

I would like to avoid for-loops again. It must be possible somehow :-)

user969113

3 Answers

20

You could try:

merge(d,m, by.x="v2", by.y="v3")
  v2 v1 v4
1  A  1  a
2  A  7  a
3  B  4  b
4  B  5  b
5  C  3  c
6  C  6  c
7  E  2  e
8  E  8  e

Edit

Here is another approach, to preserve the order:

data.frame(v1 = d$v1, v4 = m[match(d$v2, m$v3), "v4"])
  v1 v4
1  1  a
2  2  e
3  3  c
4  4  b
5  5  b
6  6  c
7  7  a
8  8  e
johannes
  • 1
    I tried merge but it changes the order and I want to keep the order of data frame d. Could of course add a column "order" to d first and after applying merge resort on that column but then I need to drop it again. It's a bit over the top I guess, isn't it? – user969113 Jul 17 '12 at 20:25
  • 1
    yeah match does the job! fantastic. the rest is easy to understand. there are so many of these things such as `match` `which` `%in%` `is.element` etc and also so many combinations that it's sometimes just difficult to find the right one.. umm :) – user969113 Jul 17 '12 at 20:39
  • The second solution by johannes worked, not quite what I wanted still better than merge. – tcratius Dec 05 '18 at 13:46
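
For completeness, the workaround described in the first comment (add a helper column, merge, sort back, drop the helper) could look roughly like the sketch below. The column name `ord` is made up for illustration, and `stringsAsFactors = FALSE` is set only to keep the columns as plain character vectors.

```r
d <- data.frame(v1 = c(1,2,3,4,5,6,7,8),
                v2 = c("A","E","C","B","B","C","A","E"),
                stringsAsFactors = FALSE)
m <- data.frame(v3 = c("D","E","A","C","D","B"),
                v4 = c("d","e","a","c","d","b"),
                stringsAsFactors = FALSE)

# Hypothetical helper column that records the original row order of d
d$ord <- seq_len(nrow(d))

# merge() sorts the result by the key column, so the original order is lost here
out <- merge(d, m, by.x = "v2", by.y = "v3")

# Restore the original order and keep only the wanted columns
out <- out[order(out$ord), c("v1", "v4")]
rownames(out) <- NULL
```

Note that `merge()` drops rows of d that have no match in m unless `all.x = TRUE` is passed. The `match()` approach needs no helper column, which is why it is the shorter route here.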
11

You could use a standard left join.

Loading the data:

d <- data.frame(v1 = c(1,2,3,4,5,6,7,8), v2 = c("A","E","C","B","B","C","A","E"), stringsAsFactors = FALSE)
m <- data.frame(v3 = c("D","E","A","C","D","B"), v4 = c("d","e","a","c","d","b"), stringsAsFactors = FALSE)

Renaming the columns of m so that both data frames share the join column "v2":

colnames(m) <- c("v2", "v4")

Left joining, which keeps the row order of data frame d:

library(dplyr)
left_join(d, m, by = "v2")

Output:

  v1 v2 v4
1  1  A  a
2  2  E  e
3  3  C  c
4  4  B  b
5  5  B  b
6  6  C  c
7  7  A  a
8  8  E  e
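
As a variation, the renaming step can probably be skipped by naming both key columns in `by`. This is only a sketch and assumes the dplyr package is installed; the name `res` is just for illustration.

```r
library(dplyr)

d <- data.frame(v1 = c(1,2,3,4,5,6,7,8),
                v2 = c("A","E","C","B","B","C","A","E"),
                stringsAsFactors = FALSE)
m <- data.frame(v3 = c("D","E","A","C","D","B"),
                v4 = c("d","e","a","c","d","b"),
                stringsAsFactors = FALSE)

# Join d$v2 against m$v3 directly; rows of d keep their original order
res <- left_join(d, m, by = c("v2" = "v3"))
```

The result keeps the columns v1, v2 and v4. If a key appeared several times in m$v3 and also occurred in d, the join would return one row per match, so it is worth checking that the lookup table has unique keys for the values you actually use.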
Esben Eickhardt
11

This will give you the desired output:

d$v2 <- m$v4[match(d$v2, m$v3)]

The match() function returns, for each value in d$v2, the position of its first occurrence in m$v3. Once you have those indices, use them to pick the corresponding elements of m$v4 and assign them to column v2 of data frame d.
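
To make the two steps visible, here is the same thing split up, using the data from the question. The intermediate name `idx` is only for illustration, and `stringsAsFactors = FALSE` is set so the columns are plain character vectors.

```r
d <- data.frame(v1 = c(1,2,3,4,5,6,7,8),
                v2 = c("A","E","C","B","B","C","A","E"),
                stringsAsFactors = FALSE)
m <- data.frame(v3 = c("D","E","A","C","D","B"),
                v4 = c("d","e","a","c","d","b"),
                stringsAsFactors = FALSE)

# Step 1: for every element of d$v2, find its first position in m$v3
idx <- match(d$v2, m$v3)
idx
# [1] 3 2 4 6 6 4 3 2

# Step 2: use those positions to pick the replacement values from m$v4
d$v2 <- m$v4[idx]
d$v2
# [1] "a" "e" "c" "b" "b" "c" "a" "e"
```

A value of d$v2 that does not occur in m$v3 gets NA from match(), so it ends up as NA in the result rather than raising an error.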

Nilesh Thakkar
Swati Gupta