(This question is not answered at the linked Vlookup thread)
I'm looking for a way to replace values in one dataframe (DF2) with values from another (DF1), where DF2 contains duplicate entries, but I want to keep those duplicates.
As a made-up example:
Let's say I've got 2 dataframes. One, called DF1, contains correct numbers for hotel umbrellas across different dates.
We have line items for Hilton_A on 5/20, 5/25, 6/01, and associated umbrella #'s. Same with Hilton_B and Hilton_C.
Here's dput for DF1, the reference dataframe:
structure(list(Date = structure(c(15852, 15859, 15852, 15859,
15852, 15859, 15852), class = "Date"), Hotel = structure(c(1L,
1L, 2L, 2L, 3L, 3L, 4L), .Label = c("Hilton_A", "Hilton_B", "Hilton_C",
"Hilton_D"), class = "factor"), Umbrellas = c(9340L, 6401L, 9089L,
7716L, 5542L, 5565L, 8158L), datename = c("2013-05-27_Hilton_A",
"2013-06-03_Hilton_A", "2013-05-27_Hilton_B", "2013-06-03_Hilton_B",
"2013-05-27_Hilton_C", "2013-06-03_Hilton_C", "2013-05-27_Hilton_D"
)), .Names = c("Date", "Hotel", "Umbrellas", "datename"), row.names = c(NA,
-7L), class = "data.frame")
DF2 contains information for a bunch of other hotels on different dates, as well as information for the Hiltons in DF1. The problem is, the umbrella #'s in DF2 are wrong for the Hiltons, and I need to replace them with the #'s from DF1.
Here's the dput for DF2, with the incorrect Hilton numbers, along with some other data that I don't want to touch:
structure(list(Date = structure(c(15845, 15852, 15859, 15852,
15859, 15845, 15859, 15845, 15845, 15852, 15845, 15845, 15882
), class = "Date"), Hotel = structure(c(1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 3L, 4L, 5L, 6L, 7L), .Label = c("Hilton_A", "Hilton_B",
"Hilton_C", "Hilton_D", "RedRoof_A", "RedRoof_D", "Sheraton_D"
), class = "factor"), Umbrellas = c(263L, 287L, 258L, 110L, 234L,
212L, 265L, 542L, 81L, 51L, 162L, 232L, 493L), datename = c("2013-05-20_Hilton_A",
"2013-05-27_Hilton_A", "2013-06-03_Hilton_A", "2013-05-27_Hilton_A",
"2013-06-03_Hilton_A", "2013-05-20_Hilton_B", "2013-06-03_Hilton_B",
"2013-05-20_Hilton_B", "2013-05-20_Hilton_C", "2013-05-27_Hilton_D",
"2013-05-20_RedRoof_A", "2013-05-20_RedRoof_D", "2013-06-26_Sheraton_D"
)), .Names = c("Date", "Hotel", "Umbrellas", "datename"), row.names = c(NA,
-13L), class = "data.frame")
Normally this would work:
DF2$Umbrellas<- replace(DF2$Umbrellas, DF2$datename%in% DF1$datename, DF1$Umbrellas)
(where "datename" is just a concatenation of hotel and date, since the same hotel has information across multiple dates (so we can "unique=ify" the list))
But DF2 actually has multiple observations for each hotel and date that I want to keep (i.e., Hilton_A on 5/27 shows up 2 times in DF2).
So when I try to replace the Umbrella #'s from DF1 into DF2, I get the error message:
Warning message:
In replace(DF2$Umbrellas, DF2$hoteldatename %in% DF1$hoteldatename , :
number of items to replace is not a multiple of replacement length
And the numbers are all wrong.
Does anybody know what's happening here and how I might go about getting the number in DF1 to replace all of the applicable observations in DF2?