I have a dataframe of three-word and two-word phrases, along with the count of each phrase found in a text. Here is some dummy data:
trig <- c("took my dog", "took my cat", "took my hat", "ate my dinner", "ate my lunch")
trig_count <- c(3, 2, 1, 3, 1)
big <- c("took my", "took my", "took my", "ate my", "ate my")
big_count <- c(6,6,6,4,4)
df <- data.frame(trig, trig_count, big, big_count)
df$trig <- as.character(df$trig)
df$big <- as.character(df$big)
trig trig_count big big_count
1 took my dog 3 took my 6
2 took my cat 2 took my 6
3 took my hat 1 took my 6
4 ate my dinner 3 ate my 4
5 ate my lunch 1 ate my 4
I would like to write a function that takes as input any two-word phrase and returns the rows in the df if there is a match, and "no match" if there is no match.
I've tried variations of this:
match_test <- function(x){
ifelse(x %in% df$big==T, df[df$big==x,], "no match")
}
It works fine for a two-word phrase that isn't in the df, for instance:
match_test("looked for")
returns
"no match"
But for phrases that do have a match, it doesn't work. For instance:
match_test("took my")
returns
"took my dog" "took my cat" "took my hat"
When what I am looking for is this:
trig trig_count big big_count
1 took my dog 3 took my 6
2 took my cat 2 took my 6
3 took my hat 1 took my 6
What is it about %in% that I am not understanding? Or is it something else? Would really appreciate your guidance.
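For reference, here is a minimal sketch of one way the desired behaviour might be obtained. The likely culprit is `ifelse()` rather than `%in%`: `ifelse()` returns a value shaped like its test, and the test here has length one, so only the first element of the data frame (its first column) comes back. A plain `if`/`else` returns the whole object. The function name `match_rows` and its `data` argument are hypothetical, introduced only for illustration.

```r
# Rebuild the dummy data from the question so the sketch is self-contained.
df <- data.frame(
  trig       = c("took my dog", "took my cat", "took my hat",
                 "ate my dinner", "ate my lunch"),
  trig_count = c(3, 2, 1, 3, 1),
  big        = c("took my", "took my", "took my", "ate my", "ate my"),
  big_count  = c(6, 6, 6, 4, 4),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the matching rows, or "no match" if none exist.
match_rows <- function(x, data = df) {
  hits <- data[data$big == x, ]  # logical row filter keeps every column
  # if/else returns whichever object is chosen, whole;
  # ifelse() would instead be shaped like its length-one test.
  if (nrow(hits) > 0) hits else "no match"
}

match_rows("took my")     # the three "took my" rows, all four columns
match_rows("looked for")  # "no match"
```

Note that the return type differs between the two cases (a data frame versus a character string), so calling code would need to check which one it received.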