
I have several lists. The first one (list1) has id, name, and age; the others (list2, list3, ...) have ids and a test value (unique).

list 1:

id  age name    bio-test    
1   40  danny
2   16  nora            
3   35  james
4   21  ben

list 2 (bio-test):

id  test passed year   
1   100   yes   1
5   80    yes   n/a      
4   55    no    2

I am trying to add the test value from list2 to each id in list1 (not every id has a test value).

This is part of the code:

for (i in 1:length(list1)) { 
list1$test1value <- list2$test[match(list1$id[i], list2$id[i]),
nomatch = NA_integer_, incomparables = NULL)] }

But instead of looking up the test value by id, it just copied the first test value from list2 into 200 cells, and the other 3000 are N/A.

What is wrong?

anat
  • changed it. thank you. – anat Dec 15 '16 at 18:11
  • Here is a [related question](http://stackoverflow.com/questions/41149718/overwriting-a-row-with-a-matched-id-value-in-the-same-dataframe/41150472#41150472) from yesterday. My answer uses `match` to fill in rows of missing values. Your problem is more easily solved with `merge`, something like `merge(df1, df2, by="id", all=TRUE)`. – lmo Dec 15 '16 at 18:17
  • merge isn't good for my purposes; I don't want to create a different file and merge the two. I just want to copy one value from each of list2, list3, ... into the row of list1 that has the same id. – anat Dec 15 '16 at 18:26
  • copies will be made regardless and the `merge` method is straightforward to apply. You can reassign to the original data.frame: `df1 <- merge(df1, df2, by="id", all=TRUE)` for example. – lmo Dec 15 '16 at 18:29
  • If list2 has 4000 ids and list1 has 5000 ids, it will delete the remaining 1000. I need to add a test value to list1 if it exists, and N/A if not. As I understand it, match is the closest equivalent to VLOOKUP in Excel. Do you know what's wrong with my match code? – anat Dec 15 '16 at 18:33
  • `merge` will add the NA's if you have the `all=TRUE` argument. Please review the linked question before announcing that the function will not work. – Pierre L Dec 15 '16 at 19:07

1 Answer


First, you have typos in your example. Second, the assignment to 'list1$test1value' needs an '[i]' added to it; without it, every iteration overwrites the whole column. There should also be no '[i]' on list2$id, since you want to search the entire vector for the lookup.

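# Assuming list1 is a data frame: iterate over its rows
# (length(list1) would count its columns, not its rows).
for (i in seq_len(nrow(list1))) { 
  list1$test1value[i] <- list2$test[match(list1$id[i], list2$id,
                             nomatch = NA_integer_, incomparables = NULL)] }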

The code works, but there is no reason for any loops here. You are showing a lack of understanding of how R operates. The code below does exactly the same thing much faster.

list1$test1value <- list2$test[match(list1$id, list2$id)]

R is built so that you do not have to hold its hand and instruct it how to go through each element of the vector. match is vectorized: it looks up each element of the first vector in the second one and returns the positions in the same order, so the result lines up with the rows of list1. Ids with no match get NA.
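To make the vectorized lookup concrete, here is a minimal sketch on the sample data from the question. It assumes list1 and list2 are data frames (the question calls them lists) and rebuilds them by hand; the intermediate variable pos is introduced only for illustration.

```r
# Sample data from the question, rebuilt as data frames (an assumption).
list1 <- data.frame(id = c(1, 2, 3, 4),
                    age = c(40, 16, 35, 21),
                    name = c("danny", "nora", "james", "ben"))
list2 <- data.frame(id = c(1, 5, 4),
                    test = c(100, 80, 55),
                    passed = c("yes", "yes", "no"))

# For every id in list1, match() gives its position in list2$id, or NA.
pos <- match(list1$id, list2$id)
pos
# [1]  1 NA NA  3

# Indexing with NA yields NA, so ids without a test stay missing.
list1$test1value <- list2$test[pos]
list1$test1value
# [1] 100  NA  NA  55
```

Id 5 exists only in list2, so it is simply never looked up, and list1 keeps its four rows.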

I will close this as a duplicate because, as others suggested, merge is perfect for this.

merge(list1, list2[c("id", "test")], all.x=TRUE)
#  id age  name test
#1  1  40 danny  100
#2  2  16  nora   NA
#3  3  35 james   NA
#4  4  21   ben   55
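Since the question mentions further tables (list2, list3, ...), the following sketch shows one way either approach might be extended. It again assumes data frames; list3, its values, and the column names test1value and test2value are hypothetical and invented purely for illustration.

```r
# Sample data from the question, rebuilt as data frames (an assumption).
list1 <- data.frame(id = c(1, 2, 3, 4),
                    age = c(40, 16, 35, 21),
                    name = c("danny", "nora", "james", "ben"))
list2 <- data.frame(id = c(1, 5, 4), test = c(100, 80, 55))
# Hypothetical second lookup table, not part of the original question.
list3 <- data.frame(id = c(2, 4), test = c(70, 90))

# all.x = TRUE keeps every row of list1 and fills unmatched tests with NA.
merged <- merge(list1, list2[c("id", "test")], by = "id", all.x = TRUE)
nrow(merged)  # same number of rows as list1

# Alternative: one vectorized match() per lookup table.
# The loop runs over tables, not over rows.
lookups <- list(test1value = list2, test2value = list3)
for (col in names(lookups)) {
  tbl <- lookups[[col]]
  list1[[col]] <- tbl$test[match(list1$id, tbl$id)]
}
list1
```

Both variants leave the ids that appear only in a lookup table (such as id 5) out of the result. Using all = TRUE instead of all.x = TRUE in merge would keep them as extra rows.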
Pierre L