I would like to know how to append new data (rows) from one data frame, df2, to an existing data frame, df1, based on a unique value in each table. So I have an existing data frame, df1, that has historical data and each row has a unique value. I then pull data from the web and put it into a new data frame, df2. The new data frame also includes a unique value which may or may not match a unique value in df1.
I would like to take all rows in df2 that have a unique value that does not exist in df1, and append those rows to df1. My initial thoughts were to use code similar to this:
ifelse(any(df1$unique_val==df2$unique_val), df1 <- df1, df1 <- rbind(df2, df1))
But then I realized that I need a more one-to-one match than an "any" match. I know how I would do this in SQL with a UNION and WHERE clause, but I'm not sure how to make it work in R. The only related items I could find researching were appending all data from two data frames or appending a new column to an existing data frame.
The following example shows what I am looking for and why I am not looking to "join" these two data frames"
df1 = data.frame(numb = c(1:6), rand = c(rep("Toaster",6)))
df1$unique_val <- paste0(df1$numb, df1$rand)
> df1
numb rand unique_val
1 1 Toaster 1Toaster
2 2 Toaster 2Toaster
3 3 Toaster 3Toaster
4 4 Toaster 4Toaster
5 5 Toaster 5Toaster
6 6 Toaster 6Toaster
df2 = data.frame(numb = c(5:7), rand = c(rep("Toaster",2), c(rep("Radio",1))))
df2$unique_val <- paste0(df2$numb, df2$rand)
> df2
numb rand unique_val
1 5 Toaster 5Toaster
2 6 Toaster 6Toaster
3 7 Radio 7Radio
As you can see, row 3 in df2 is the only new row (a row that does not have a matching unique_val in df1). I would like to add this new row to df1. Note: it's not always the same row that is new in df2.
I used each of the joins from this post, merge/join data frames as follows:
merge(df1,df2, by = "unique_val")
merge(df1,df2, by = "unique_val", all = TRUE)
merge(df1,df2, by = "unique_val", all.x = TRUE)
merge(df1,df2, by = "unique_val", all.y = TRUE)
I also tried the anti_join from dplyr:
anti_join(df1,df2, by = "unique_val")
Rbind gives me the following:
rbind(df1,df2)
numb rand conc
1 1 Toaster 1Toaster
2 2 Toaster 2Toaster
3 3 Toaster 3Toaster
4 4 Toaster 4Toaster
5 5 Toaster 5Toaster
6 6 Toaster 6Toaster
7 5 Toaster 5Toaster
8 6 Toaster 6Toaster
9 7 Radio 7Radio
None of which give me the desired output of the following:
numb rand conc
1 1 Toaster 1Toaster
2 2 Toaster 2Toaster
3 3 Toaster 3Toaster
4 4 Toaster 4Toaster
5 5 Toaster 5Toaster
6 6 Toaster 6Toaster
7 7 Radio 7Radio
I'm looking to rbind these data frames, not join them.