
I'm trying to match values between two data frames based on one column, and then assign the corresponding value to one of the data frames. Here is what the data look like:

head(test1)
    zip dat.citystate
1   80914              
2   32920              
3   80914              
4   80914              
5   80914              
6   80914                

head(test2)
    zip          citystate
1 35004          Moody, AL
2 35005     Adamsville, AL
3 35006          Adger, AL
4 35007      Alabaster, AL
5 35010 Alexander City, AL
6 35011 Alexander City, AL

Every value in the dat.citystate column of test1 is an empty string "".

I want to loop through test1 and find the city-state value from test2 that shares the same zip code, and add that to the corresponding row for test1. Here is my for loop:

for (i in 1:nrow(test1)){
  test1$dat.citystate[i] <- test2$citystate[test2$zip == test1$zip[i]]
}

However, I get the following error:

Error: replacement has length zero

I've looked everywhere but can't figure out what this means or where the error is coming from. Any suggestions?

gh39
  • I think `merge` or a `dplyr` join is a better solution than a loop. Have a look at `?merge` and try it on your data. – neilfws Dec 02 '20 at 22:19
  • Can you edit your question so people can copy/paste the data to make it easier to write a solution? – Mario Niepel Dec 02 '20 at 22:26
  • `1:nrow(test1)` will fail you when you inadvertently have zero rows: the logical presumption is that the `for` loop will not run, but unfortunately `1:0` is length 2, not length 0, so it will run, and it will error. A safer alternative is `seq_len(nrow(test1))`. – r2evans Dec 03 '20 at 01:18
  • Isn't this a `merge` operation? I'd think that `merge(test1, test2, by="zip", all.x=TRUE)` would answer this. (See https://stackoverflow.com/q/1299871/3358272 and https://stackoverflow.com/a/6188334/3358272 for merge/join logic.) – r2evans Dec 03 '20 at 01:20
  • I will add data next time, sorry! Thanks for the tip on `seq_len(nrow(test1))` also. I tried merge but got a blank column instead. The solution posted below seems to have worked. – gh39 Dec 03 '20 at 14:48

1 Answer


Base R solution (in your sample data you currently have no matches):

test1$dat.citystate <- with(test2, citystate[match(test1$zip, zip)])
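
For context on the error: when a zip in test1 has no counterpart in test2, the logical subset inside the loop returns a zero-length vector, and assigning that to a single element raises "replacement has length zero". match() returns NA for those rows instead. A minimal sketch of the difference, using made-up toy frames `a` and `b` (hypothetical names and values, not your data):

```r
# Toy data (hypothetical): the zip 80914 in `a` has no counterpart in `b`
a <- data.frame(zip = c(35004L, 80914L, 35005L), dat.citystate = "",
                stringsAsFactors = FALSE)
b <- data.frame(zip = c(35004L, 35005L),
                citystate = c("Moody, AL", "Adamsville, AL"),
                stringsAsFactors = FALSE)

# Logical subsetting with no match yields a zero-length vector,
# which is what the loop tried to assign into a single element
no_hit <- b$citystate[b$zip == 80914L]
length(no_hit)

# match() gives the row position in `b` for each zip in `a`, or NA if absent
idx <- match(a$zip, b$zip)
a$dat.citystate <- b$citystate[idx]
a
```

Rows without a match end up as NA rather than stopping the whole assignment, so you can inspect them afterwards with is.na().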

Data:

test1 <- structure(list(zip = c(80914L, 32920L, 80914L, 80914L, 80914L, 
80914L), dat.citystate = c("Moody, AL", NA, "Moody, AL", "Moody, AL", 
"Moody, AL", "Moody, AL")), row.names = c(NA, -6L), class = "data.frame")

test2 <- structure(list(`zip citystate` = c("80914 Moody, AL", "35005 Adamsville, AL", 
"35006 Adger, AL", "35007 Alabaster, AL", "35010 Alexander City, AL", 
"35011 Alexander City, AL"), zip = c(80914, 35005, 35006, 35007, 
35010, 35011), citystate = c("Moody, AL", "Adamsville, AL", "Adger, AL", 
"Alabaster, AL", "Alexander City, AL", "Alexander City, AL")), row.names = c(NA, 
-6L), class = "data.frame")
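
The comments suggested merge() and seq_len() as alternatives. Here is a rough sketch of both on made-up toy frames `a` and `b` (hypothetical names and values, not your data). One guess, not confirmed anywhere in this thread, about the blank column you saw: if test1 still carries its empty dat.citystate column, merge() keeps it next to the new citystate column.

```r
# Toy data (hypothetical), same shape as the question's frames
a <- data.frame(zip = c(35005L, 80914L, 35004L), stringsAsFactors = FALSE)
b <- data.frame(zip = c(35004L, 35005L),
                citystate = c("Moody, AL", "Adamsville, AL"),
                stringsAsFactors = FALSE)

# Left join: keep every row of `a`; unmatched zips get NA in citystate.
# merge() may reorder rows, so do not rely on the original row order.
joined <- merge(a, b, by = "zip", all.x = TRUE)
joined

# If a loop is preferred, guard against empty frames and empty lookups
a$citystate <- NA_character_
for (i in seq_len(nrow(a))) {
  hit <- b$citystate[b$zip == a$zip[i]]
  if (length(hit) > 0) a$citystate[i] <- hit[1]  # keep the first match only
}
a
```

The guarded loop keeps the original row order, while merge() is usually the shorter option when the order does not matter.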
hello_friend