0

I have two data frames

columns df1= a, b, c, d

columns df2= e, f, c, d

I want to say that if df1$a == df2$e and at the same time df1$b == df2$f I want to print in a new column df1$x the values of the corresponding df2$c.

So I wrote this command df1$x <- ifelse(df1$a %in% df2$e & df1$b %in% df2$f, df2$c, ifelse (NULL))

But the values in df1x are not correct. I think I have to do a while loop, but I have no idea from where to start.

Nicoletta

  • 1
    Can you provide example input data and expected output? See [how to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – IceCreamToucan Jan 24 '20 at 13:47
  • You can't put NULL as an element in a vector. You'll have to make it `NA`. That said, the code you provided has several typos (e.g., df$1 instead of df1$a), and there's no reason to use `%in%` - just use `==`. – iod Jan 24 '20 at 13:51

1 Answers1

1

To merge two data frames by multiple columns you can use merge and corresponding by.x and by.y. Both of those accept vector of column names. Parameter all.x means all data from first df1 will be preserved.

df1 <-  merge(df1, df2[,c("e","f","c")], by.x=c("a","b"), by.y=c("e","f"), all.x=T)

colnames(df1) <- c("a","b","c","d","x")

For more information see ?merge. For more complex operation you can check out package dplyr and its function left_join.

Result on random data

df1 <- data.frame(a = sample(1:5), b= sample(1:5), c=runif(5), d = runif(5))
df2 <- data.frame(e = df1$a, f = df1$b, c = runif(5), d = runif(5))
df2$e[3] <- 7

Output

> df1
  a b          c          d
1 5 4 0.76677063 0.92123552
2 4 1 0.93524320 0.09275425
3 3 2 0.01121468 0.12035981
4 1 5 0.72992427 0.87711572
5 2 3 0.11680937 0.93696597
> df2$e[3] <- 7
> df2
  e f         c         d
1 5 4 0.6251662 0.1549575
2 4 1 0.8464672 0.8702837
3 7 2 0.5394273 0.4290171
4 1 5 0.4061817 0.9072905
5 2 3 0.3376456 0.4291463
> merge(df1, df2[,c("e","f","c")], by.x=c("a","b"), by.y=c("e","f"), all.x=T)
  a b          c          d         x
1 1 5 0.72992427 0.87711572 0.4061817
2 2 3 0.11680937 0.93696597 0.3376456
3 3 2 0.01121468 0.12035981        NA
4 4 1 0.93524320 0.09275425 0.8464672
5 5 4 0.76677063 0.92123552 0.6251662
jyr
  • 690
  • 6
  • 20