
I have two data frames: one containing observations of what people bought in a grocery store, and one containing a list of participants:

df1 <- data.frame(Person = sample(1:5, size = 10, replace = TRUE), Object = sample(letters[1:5], size = 10, replace = TRUE))

df2 <- data.frame(Participant = c(1, 3, 5))

Example:

df1:

Person Object
1 a
2 a
1 c
5 d
4 e
1 b
2 a
3 b
2 c
5 d

df2:

Participant
1
3
5

I would like to create a subset of df1 containing only the observations whose df1$Person value appears in df2$Participant.

Desired outcome:

df1.2:

Person Object
1 a
1 c
5 d
1 b
3 b
5 d

I tried using:

participant <- df2$Participant
df1.2 <- subset(df1, Person == participant)

And also

df1.2 <- df1 %>% filter(Person == df2$Participant)

Since the two vectors don't have the same length, neither attempt works.

Any ideas?

Matias V.
  • You should replace `filter(Person == df2$Participant)` with `filter(Person %in% df2$Participant)`. – islem Dec 01 '22 at 11:22
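
A minimal sketch of the `%in%` suggestion above. It uses a fixed copy of the example data from the question instead of `sample()`, so the result is reproducible. The idea is that `==` compares the two vectors element by element and recycles the shorter one, whereas `%in%` tests each `Person` value for membership in `df2$Participant`. The helper name `keep` is only illustrative.

```r
# Fixed copy of the example data from the question (no sampling)
df1 <- data.frame(
  Person = c(1, 2, 1, 5, 4, 1, 2, 3, 2, 5),
  Object = c("a", "a", "c", "d", "e", "b", "a", "b", "c", "d")
)
df2 <- data.frame(Participant = c(1, 3, 5))

# %in% returns one TRUE/FALSE per row of df1, whatever the length of df2
keep <- df1$Person %in% df2$Participant

# Base R subsetting; dplyr's filter(df1, Person %in% df2$Participant)
# selects the same rows
df1.2 <- df1[keep, ]
df1.2
```

With this data, `df1.2` should contain only the rows for persons 1, 3 and 5, in their original order.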

1 Answer

library(dplyr)

df1 %>% 
  inner_join(df2, by = c("Person" = "Participant"))
Vinícius Félix
  • It doesn't seem to work: I actually have more variables in both dataframes, and it adds some variables of the 1st one to the 2nd one and gives me 0 observations. – Matias V. Dec 01 '22 at 10:18
  • It should probably have been a `semi_join`. – harre Dec 01 '22 at 11:02
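
Following the `semi_join` comment, here is a hedged sketch of how that could look. It assumes dplyr is installed and reuses a fixed copy of the question's example data. The `Group` column in `df2` is hypothetical, added only to mimic the "more variables in both dataframes" situation. Unlike `inner_join`, `semi_join` only filters the rows of `df1` and does not bring over any columns from `df2`.

```r
library(dplyr)

# Fixed copy of the example data from the question
df1 <- data.frame(
  Person = c(1, 2, 1, 5, 4, 1, 2, 3, 2, 5),
  Object = c("a", "a", "c", "d", "e", "b", "a", "b", "c", "d")
)
# Group is a hypothetical extra column, not part of the original question
df2 <- data.frame(Participant = c(1, 3, 5), Group = c("x", "y", "z"))

# Keep the rows of df1 that have a match in df2; no columns from df2 are added
df1.2 <- df1 %>%
  semi_join(df2, by = c("Person" = "Participant"))

df1.2
```

Here `df1.2` should keep only the `Person` and `Object` columns, which is the difference from `inner_join` that matters when both data frames carry extra variables.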