
I have two data frames:

x <- data.frame(numbers=c('1','2','3','4','5','6','7','8','9'), coincidence="NA")

and

y <- data.frame(numbers=c('1','3','10'))

How can I check whether the observations in y (1, 3 and 10) also exist in x, and fill the column x["coincidence"] accordingly (for example with YES/NO or TRUE/FALSE)?

I would do the same in Excel with a formula combining IFERROR and VLOOKUP, but I don't know how to do the same with R.

Note: I am open to converting the data frames to tables or using libraries. The data frame with the numbers to check (y) will never have more than 10-20 observations, while the other one (x) will never have more than 1K observations. Therefore, I could also iterate with an if statement, if necessary.

Jaap
agustin
    Try `x$coincidence <- x$numbers %in% y$numbers` – Pierre L Jan 14 '16 at 17:00
  • see also the classic [merge / join Q&A](http://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right) (especially when you want to add values from one dataframe to another) – Jaap Jan 14 '16 at 17:02

2 Answers


We can create a vector matching the desired output with a set membership test that returns logical TRUE and FALSE values where appropriate. The %in% operator is a binary operator that checks whether each value on its left-hand side occurs in the set of values on its right-hand side:

x$coincidence <- x$numbers %in% y$numbers
x
#   numbers coincidence
# 1       1        TRUE
# 2       2       FALSE
# 3       3        TRUE
# 4       4       FALSE
# 5       5       FALSE
# 6       6       FALSE
# 7       7       FALSE
# 8       8       FALSE
# 9       9       FALSE
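
Since the question also mentions YES/NO labels, here is a minimal sketch of how the logical result could be mapped to text. It assumes the columns are stored as character (hence stringsAsFactors = FALSE); the names found and missing_in_x are only illustrative and do not come from the question.

```r
# Sample data from the question, kept as character columns
x <- data.frame(numbers = as.character(1:9), stringsAsFactors = FALSE)
y <- data.frame(numbers = c("1", "3", "10"), stringsAsFactors = FALSE)

# %in% returns one logical value per element of x$numbers
found <- x$numbers %in% y$numbers

# Map TRUE/FALSE to the labels asked for in the question
x$coincidence <- ifelse(found, "YES", "NO")

# Optional: values of y that have no match in x (illustrative helper)
missing_in_x <- setdiff(y$numbers, x$numbers)
```

With data of this size (at most about 1K rows in x), the vectorized test should be more than fast enough, so no explicit loop is needed.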
Pierre L

Do numbers have to be factors, as you've set them up? (They're not numbers, but character.) If not, it's easy:

x <- data.frame(numbers=c('1','2','3','4','5','6','7','8','9'), coincidence="NA", stringsAsFactors=FALSE)
y <- data.frame(numbers=c('1','3','10'), stringsAsFactors=FALSE)

x$coincidence[x$numbers %in% y$numbers] <- TRUE


> x
  numbers coincidence
1       1        TRUE
2       2          NA
3       3        TRUE
4       4          NA
5       5          NA
6       6          NA
7       7          NA
8       8          NA
9       9          NA

If they need to be factors, then you'll need to either set common levels or use as.character().
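
A minimal sketch of the as.character() route, assuming both columns really are factors. The data are the ones from the question, and the factors are built explicitly so the example behaves the same regardless of the stringsAsFactors default.

```r
# Both columns stored as factors, with different level sets
x <- data.frame(numbers = factor(as.character(1:9)))
y <- data.frame(numbers = factor(c("1", "3", "10")))

# as.character() compares the labels; as.numeric() on a factor
# would return the internal integer codes instead
x$coincidence <- as.character(x$numbers) %in% as.character(y$numbers)
```

Here coincidence is a plain logical column, so rows without a match hold FALSE rather than the string "NA".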

Phiala