I'm working with lists of spatial data for 20+ different sites (difficult to reproduce here; sorry in advance). I have three data frames associated with each site; each has a 'sample_ID' column and some other shared columns names.
What I'm trying to do seems very simple: if the 'sample_ID' values match for two data frames and the column names match, replace the value in DF 1 with that of DF 2 and DF 3 three. Example:
# DF 1:
SAMPLE_ID CLASS_ID CLASS VALUE
1 0 0 5
2 0 0 5
3 0 0 3
4 0 0 6
5 0 0 6
6 0 0 3
# DF 2
SAMPLE_ID REF_VAL CLASS_ID CLASS
1 33 2 cloud
2 45 3 water
3 NA 3 water
4 NA 4 forest
# DF 3
SAMPLE_ID CLASS_ID CLASS STRATA
5 3 NA 20
6 3 water 19
Desired output:
# DF 1:
SAMPLE_ID CLASS_ID CLASS VALUE
1 2 cloud 5
2 3 water 5
3 3 water 3
4 4 forest 6
5 3 NA 6
6 3 water 3
All I can think to do is some sort of match
indexing, like:
List1$CLASS_ID <- List2$CLASS_ID[match(List1$SAMPLE_ID, List2$SAMPLE_ID)
List1$CLASS_ID <- List3$CLASS_ID[match(List1$SAMPLE_ID, List3$SAMPLE_ID)
But this doesn't work; for one, it produces NAs in the nomatch
values (attempted a nested match
within the nomatch =
but that didn't work either), but more importantly I really need to streamline this by referencing all the matching column names rather than going one at a time since the actual data has 10+ columns that need replacement. Also important, I need the blank NA values to transfer over as well.
Any thoughts?