
I have a dataset that contains only one variable ("text"), and a second dataset made up of a subset of that variable plus a new variable called "code".

library(tibble)  # provides tibble()

dat1 <- tibble(text = c("book", "chair", "banana", "cherry"))
dat2 <- tibble(text = c("banana", "cherry"), code = c(1, NA))

What I would like is a for loop that yields the value of "code" for every row (i) where dat1$text matches a value in dat2$text, and 0 otherwise. The ultimate goal is a vector c(0,0,1,NA) that I could then add back to the first dataset.

However, I don't know how to select the row corresponding to i in the for loop to get the value of "code" that I need for this vector. And even if I knew how to extract these values, I'm not sure the whole thing would work, let alone maintain the order that I need (c(0,0,1,NA)).

for (i in dat2$text) {
  ifelse(i==dat1$text, print(dat[...,2]), print(0))
}

Does anyone know how to fix that?

Dr. Fabian Habersack

2 Answers


We can match the text column of dat1 against the text column of dat2, then use 0 where there is no match and the corresponding code value otherwise.

inds <- match(dat1$text, dat2$text)
dat1$out <- ifelse(is.na(inds), 0, dat2$code[inds])

dat1
# A tibble: 4 x 2
#  text     out
#  <chr>   <dbl>
#1 book       0
#2 chair      0
#3 banana     1
#4 cherry    NA
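Since the question asks for a for loop, the same lookup can also be written out row by row. This is only a sketch under a few assumptions: it uses base data frames as stand-ins for the tibbles, the name `out` is arbitrary, and it assumes each text value appears at most once in dat2 (match() returns only the first hit).

```r
# Stand-ins for the question's data (base data frames instead of tibbles).
dat1 <- data.frame(text = c("book", "chair", "banana", "cherry"))
dat2 <- data.frame(text = c("banana", "cherry"), code = c(1, NA))

# Pre-allocate the result with the default used for "no match".
out <- rep(0, nrow(dat1))

# Loop over the row positions of dat1 so that its order is preserved.
for (i in seq_len(nrow(dat1))) {
  j <- match(dat1$text[i], dat2$text)  # first matching row in dat2, or NA
  if (!is.na(j)) {
    out[i] <- dat2$code[j]             # may itself be NA, as for "cherry"
  }
}

out
# [1]  0  0  1 NA

dat1$code <- out
```

Looping over positions in dat1 rather than over the values of dat2$text is what keeps the result aligned with dat1. The vectorised match() call does the same work without the loop.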
Ronak Shah

We can do a join

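library(dplyr)
library(tidyr)  # replace_na() comes from tidyr, not dplyr

dat2 %>%
   mutate(code = replace_na(code, 0)) %>%  # NA codes in dat2 become 0 before the join
   right_join(dat1, by = "text")           # rows of dat1 without a match get code = NA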
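Note that replacing NA before the join turns the code for "cherry" into 0 and leaves "book" and "chair" as NA, which is the reverse of the c(0,0,1,NA) requested in the question. One possible variant, sketched here as an assumption rather than as the answerer's method, starts from dat1 with a left join and uses a helper flag (the column name `matched` and the result name `res` are made up for illustration) to tell "no match" apart from a genuine NA in dat2.

```r
library(dplyr)

dat1 <- tibble(text = c("book", "chair", "banana", "cherry"))
dat2 <- tibble(text = c("banana", "cherry"), code = c(1, NA))

res <- dat1 %>%
  # Flag rows that exist in dat2; after the join the flag is NA for
  # rows of dat1 that had no partner in dat2.
  left_join(mutate(dat2, matched = TRUE), by = "text") %>%
  # Only unmatched rows get 0; a matched NA code is kept as NA.
  mutate(code = if_else(is.na(matched), 0, code)) %>%
  select(-matched)

res$code
# [1]  0  0  1 NA
```

Starting from dat1 with left_join() keeps the rows in the order of dat1, so the resulting column lines up with the original dataset.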
akrun