I am trying to write some R code to determine whether the letters in a small string are contained in a larger string. The accuracy would then be returned as a proportion (or percentage) of matched letters.
I found the following on StackOverflow (check if all characters of one string exist in another string in r), but the code provided computes the score as the number of unique letters the two strings share divided by the number of unique letters in the first string, i.e. it does not account for repeated letters.
s1 <- "ABBDEFGHIZ"
s2 <- "ABBDEFGHIJ"
compare <- function(s1, s2) {
  c1 <- unique(strsplit(s1, "")[[1]])
  c2 <- unique(strsplit(s2, "")[[1]])
  length(intersect(c1, c2)) / length(c1)
}
compare(s1,s2)
[1] 0.8888889
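Ideally, the function would return 0.9, since 9 of the 10 letters in s1 (counting the repeated B) are matched in s2. Instead it returns 8/9, because unique() collapses the two Bs into one before the comparison.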
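To make the intended counting concrete, here is a rough sketch of what I am after. It assumes that each repeated letter in the smaller string has to be matched by its own copy in the larger string, and that position is ignored. The name compare_counts is just a placeholder of mine, and I have only checked it on a couple of small examples, so there may well be a more idiomatic way.

```r
# Hypothetical helper (not from the linked answer): fraction of the letters
# in `small`, repeats included, that can be matched one-for-one in `large`.
compare_counts <- function(small, large) {
  c1 <- strsplit(small, "")[[1]]
  c2 <- strsplit(large, "")[[1]]
  letters_seen <- unique(c1)

  # How many times each distinct letter of `small` occurs in each string
  n1 <- vapply(letters_seen, function(ch) sum(c1 == ch), integer(1))
  n2 <- vapply(letters_seen, function(ch) sum(c2 == ch), integer(1))

  # A letter is matched at most as often as it occurs in both strings
  sum(pmin(n1, n2)) / length(c1)
}

compare_counts("ABBDEFGHIZ", "ABBDEFGHIJ")
# [1] 0.9
```

With this counting, a string such as "AAB" compared against "AB" scores 2/3, because only one of the two As can be matched.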
Any advice would be appreciated.