I have an existing data frame to which I would like to add updated observations. Each updated observation is identified by an ID and a time point variable. I've tried removing the outdated observations from the existing data frame and then using merge() to combine it with a data frame containing just the updated observations, but I get duplicated columns. Is there an elegant way to do this (particularly using dplyr)?
Here's an example of what I'd like to do. Let's say I have a data frame called practice:
practice
ID  Time    score 1  score 2
1   hour 1  3        7
1   hour 2  4        2
2   hour 1  3        4
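To make the example reproducible, the data frame above can be built roughly as follows. This is only a sketch: the column names score1 and score2 are my stand-ins for "score 1" and "score 2" (names with spaces are awkward in R), and the column types are assumed.

```r
# Build the example data frame shown above.
# score1 / score2 are assumed stand-ins for "score 1" / "score 2".
practice <- data.frame(
  ID     = c(1, 1, 2),
  Time   = c("hour 1", "hour 2", "hour 1"),
  score1 = c(3, 4, 3),
  score2 = c(7, 2, 4),
  stringsAsFactors = FALSE
)

print(practice)
```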
Let's say I want to change the score 1 variable for the third observation (for which ID==2 and Time=="hour 1") from 3 to 5.
What I've tried is making a new dataframe, called practice1:
ID  Time    score 1  score 2
1   hour 1  3        7
1   hour 2  4        2
This removes the third observation. I then created another data frame containing only the corrected observation, called practice2:
ID  Time    score 1  score 2
2   hour 1  5        4
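In code, the two intermediate data frames could be created along these lines. Again a sketch under assumptions: score1 and score2 stand in for "score 1" and "score 2", and practice2 holds the corrected value of 5 for score 1 described earlier.

```r
# Original example data (score1 / score2 are assumed column names).
practice <- data.frame(
  ID     = c(1, 1, 2),
  Time   = c("hour 1", "hour 2", "hour 1"),
  score1 = c(3, 4, 3),
  score2 = c(7, 2, 4),
  stringsAsFactors = FALSE
)

# practice1: everything except the outdated observation (ID 2, "hour 1").
outdated  <- practice$ID == 2 & practice$Time == "hour 1"
practice1 <- practice[!outdated, ]

# practice2: the corrected observation, with score1 changed from 3 to 5.
practice2 <- data.frame(
  ID     = 2,
  Time   = "hour 1",
  score1 = 5,
  score2 = 4,
  stringsAsFactors = FALSE
)
```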
I've then tried to do something like this:
practice3 <- merge(practice2, practice1, by = "ID", all = TRUE)
However, I get duplicated columns, and when I try to include multiple variables in the by argument of merge(), I get this error:
Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column
Could this be due to the longitudinal nature of the data, where the same ID appears at more than one time point?
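For reference, here is a sketch of the result I am after, written with dplyr's rows_update() (available from dplyr 1.0.0), which overwrites rows of one data frame with matching rows from another, keyed on the columns given in by. I have not confirmed this is the best approach for my real data; the column names score1 and score2 are assumed stand-ins for "score 1" and "score 2".

```r
library(dplyr)

# Original example data (score1 / score2 are assumed column names).
practice <- data.frame(
  ID     = c(1, 1, 2),
  Time   = c("hour 1", "hour 2", "hour 1"),
  score1 = c(3, 4, 3),
  score2 = c(7, 2, 4),
  stringsAsFactors = FALSE
)

# Corrected observation: score1 for ID 2 at "hour 1" becomes 5.
practice2 <- data.frame(
  ID     = 2,
  Time   = "hour 1",
  score1 = 5,
  score2 = 4,
  stringsAsFactors = FALSE
)

# Overwrite rows in practice whose (ID, Time) key matches a row in practice2.
# Each key in practice2 must be unique and, by default, must exist in practice.
updated <- rows_update(practice, practice2, by = c("ID", "Time"))

print(updated)
```

Because both data frames have identical columns, the update works row-wise rather than column-wise, so no .x/.y duplicate columns are created.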
Thanks