I am looking to calculate the percent match for a string in R. For example:
x <- "asdf"
y <- "fdjk"
I would like this to return 0.5 (i.e. 2 of the 4 characters match, irrespective of order). Any thoughts are greatly appreciated.
You can split a string into its individual characters with strsplit:
char.x <- strsplit(x, "")[[1]]
char.x
# [1] "a" "s" "d" "f"
char.y <- strsplit(y, "")[[1]]
char.y
# [1] "f" "d" "j" "k"
Now you can use intersect and length to compute your metric. The exact formula is unclear because your post doesn't specify, for instance, how to handle duplicate characters; the version below counts each distinct character once:
length(intersect(char.x, char.y)) /
max(length(unique(char.x)), length(unique(char.y)))
# [1] 0.5
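If you need this more than once, the steps above can be wrapped in a function. Here is a minimal sketch; the names pct_match_unique and pct_match_counts are hypothetical helpers made up for illustration, not base R functions. The second one is just one possible answer to the duplicate question: it assumes a repeated character should count as many times as it appears in both strings, and it divides by the length of the longer string. Adjust either choice if your definition differs.

```r
# Hypothetical helper: same formula as above, each distinct character counts once.
pct_match_unique <- function(x, y) {
  cx <- unique(strsplit(x, "")[[1]])
  cy <- unique(strsplit(y, "")[[1]])
  length(intersect(cx, cy)) / max(length(cx), length(cy))
}

# Hypothetical helper (assumed definition): duplicates count as often as
# they occur in BOTH strings; the denominator is the longer string's length.
pct_match_counts <- function(x, y) {
  cx <- strsplit(x, "")[[1]]
  cy <- strsplit(y, "")[[1]]
  chars <- union(cx, cy)
  # Tabulate both strings over the same set of characters so the counts align
  tx <- as.vector(table(factor(cx, levels = chars)))
  ty <- as.vector(table(factor(cy, levels = chars)))
  sum(pmin(tx, ty)) / max(length(cx), length(cy))
}

pct_match_unique("asdf", "fdjk")
# [1] 0.5
pct_match_counts("asdf", "fdjk")
# [1] 0.5
pct_match_unique("aab", "abb")
# [1] 1
pct_match_counts("aab", "abb")
# [1] 0.6666667
```

The two versions agree on your example because no character repeats; they only diverge on inputs like "aab" vs "abb".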