I have a keyword (e.g. 'green') and some text ("I do not like them Sam I Am!").
I'd like to see how many of the characters in the keyword ('g', 'r', 'e', 'e', 'n') occur in the text (in any order).
In this example the answer is 3 - the text doesn't have a G or R but has two Es and an N.
The complication is that once a character in the text has been matched with a character in the keyword, it can't be used to match a different character in the keyword.
For example, if my keyword was 'greeen', the number of "matching characters" is still 3 (one N and two Es) because there are only two Es in the text, not 3 (to match the third E in the keyword).
How can I write this in R? This is just ticking something at the edge of my memory - I feel like it's a common problem but just worded differently (sort of like sampling with no replacement, but "matches with no replacement"?).
E.g.
keyword <- strsplit('greeen', '')[[1]]
text <- strsplit('idonotlikethemsamiam', '')[[1]]
# how many characters in keyword have matches in text,
# with no replacement?
# Attempt 1: sum(keyword %in% text)
# PROBLEM: returns 4 (all three Es match, but there are only two Es in text)
More examples of expected input/outputs (keyword, text, expected output):
- 'green', 'idonotlikethemsamiam', 3 (E, E, N)
- 'greeen', 'idonotlikethemsamiam', 3 (E, E, N)
- 'red', 'idonotlikethemsamiam', 2 (E and D)
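
One way to read "matches with no replacement" is as the size of a multiset intersection: for each distinct character in the keyword, take the smaller of its count in the keyword and its count in the text, then add those up. Below is a minimal base R sketch of that idea. The function name `count_matches` is made up for illustration, and it assumes both inputs are single strings that have already been lower-cased and stripped of spaces and punctuation, as in the examples above.

```r
# Hypothetical helper (name is illustrative, not from any package):
# counts keyword characters that can be matched in the text, where each
# text character may be used at most once.
count_matches <- function(keyword, text) {
  kw <- strsplit(keyword, "")[[1]]
  tx <- strsplit(text, "")[[1]]

  # Only characters that appear in the keyword can contribute to the total
  chars <- unique(kw)

  # Occurrences of each distinct keyword character in the keyword and in the text
  kw_counts <- vapply(chars, function(ch) sum(kw == ch), integer(1))
  tx_counts <- vapply(chars, function(ch) sum(tx == ch), integer(1))

  # A character can be matched at most min(keyword count, text count) times
  sum(pmin(kw_counts, tx_counts))
}

count_matches("green",  "idonotlikethemsamiam")  # 3
count_matches("greeen", "idonotlikethemsamiam")  # 3
count_matches("red",    "idonotlikethemsamiam")  # 2
```

Unlike `sum(keyword %in% text)`, which only asks whether each keyword character appears anywhere in the text, the `pmin` step caps each character's contribution at the number of copies the text actually has.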