I'm still working through the lessons on DataCamp for R, so please forgive me if this question seems naïve.
Consider the following (very contrived) sample:
library(dplyr)
library(tibble)
type <- c("Dog", "Cat", "Cat", "Cat")
name <- c("Ella", "Arrow", "Gabby", "Eddie")
pets <- tibble(name, type)
name <- c("Ella", "Arrow", "Dog")
type <- c("Dog", "Cat", "Calvin")
favorites <- tibble(name, type)
anti_join(favorites, pets, by = "name")
setdiff(favorites, pets, by = "name")
Both of these return exactly the same data:
> anti_join(favorites, pets, by = "name")
# A tibble: 1 × 2
  name  type
  <chr> <chr>
1 Dog   Calvin
> setdiff(favorites, pets, by = "name")
# A tibble: 1 × 2
  name  type
  <chr> <chr>
1 Dog   Calvin
The documentation for each of them seems to indicate only a subtle difference: that setdiff returns distinct rows (duplicates removed), but anti_join does not. From my testing, this doesn't appear to make any difference here.
Can someone explain to me the true differences between these two, and perhaps provide a better example that illustrates the differences more clearly? (This is an area where DataCamp hasn't been particularly helpful.)
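For what it's worth, here is a minimal sketch of a variation where the two results appear to diverge. It rests on two assumptions drawn from the documentation: that anti_join matches only on the columns named in by, and that setdiff compares entire rows and drops duplicates. The favorites2 tibble is a made-up variation on favorites, not part of the sample above, and since setdiff has no documented by argument, the call here omits it.

```r
library(dplyr)
library(tibble)

pets <- tibble(
  name = c("Ella", "Arrow", "Gabby", "Eddie"),
  type = c("Dog", "Cat", "Cat", "Cat")
)

# Hypothetical variation on `favorites`: "Ella" now has a different type
# than in `pets`, and the "Dog"/"Calvin" row appears twice.
favorites2 <- tibble(
  name = c("Ella", "Arrow", "Dog", "Dog"),
  type = c("Cat", "Cat", "Calvin", "Calvin")
)

# anti_join matches on the `by` column only: "Ella" and "Arrow" both exist
# in pets$name, so they are dropped; both copies of the "Dog" row are kept.
by_key <- anti_join(favorites2, pets, by = "name")

# setdiff compares whole rows and removes duplicates: ("Ella", "Cat") is not
# a row of pets, so it is kept; the duplicated "Dog" row collapses to one.
by_row <- setdiff(favorites2, pets)

print(by_key)
print(by_row)
```

If those assumptions hold, by_key contains the "Dog" row twice, while by_row contains one "Ella"/"Cat" row and one "Dog"/"Calvin" row.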