I have a vector of strings and I would like to hash each element individually to an integer modulo n.
This SO post suggests an approach using digest and strtoi, but when I try it I get NA as the returned value:
library(digest)
strtoi(digest("cc", algo = "xxhash32"), 16L)
So the above approach will not work, as it cannot even produce an integer, let alone one modulo n.
What's the best way to hash a large vector of strings to integers modulo n for some n? Efficient solutions are more than welcome as the vector is large.
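A likely cause of the NA: strtoi returns NA when the parsed value does not fit in R's 32-bit signed integer, and an 8-digit hex string from xxhash32 can exceed .Machine$integer.max. A minimal sketch of one workaround, assuming it is acceptable to keep only the first 7 hex digits (28 bits) of each hash before taking the modulus; hash_mod is a hypothetical helper name, not part of digest, and serialize = FALSE is my assumption so that the raw string is hashed rather than its serialized form:

```r
library(digest)

# Hypothetical helper: map each string to an integer in 0..(n - 1).
hash_mod <- function(x, n) {
  # Hash each element separately; serialize = FALSE hashes the string itself.
  hex <- vapply(x, digest, character(1),
                algo = "xxhash32", serialize = FALSE, USE.NAMES = FALSE)
  # Keep 7 hex digits (at most 0xFFFFFFF) so strtoi cannot overflow to NA.
  strtoi(substr(hex, 1L, 7L), 16L) %% n
}

res <- hash_mod(c("cc", "dd", "cc"), 10L)
```

Identical strings get identical values, and every result lies between 0 and n - 1. This loops over the elements with vapply, so it may not be the fastest option for a very large vector.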