I am trying to build a content-based filtering system that classifies products by assigning them features such as { salty: 0, sweet: 0.5, bitter: 0.7 }. In other words, I give every product an n-dimensional vector, in this case [0, 0.5, 0.7].
For a given product I would now like to find "similar" products by calculating the distance between the vectors. For example, for the two products [0.2, 0.2, 0.8] and [0.4, 0.9, 0.9] the Euclidean distance is sqrt(0.2² + 0.7² + 0.1²) = sqrt(0.54), roughly 0.73, which should be their "score" (lower is better).
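To make the scoring concrete, here is a minimal Python sketch of the calculation I have in mind. The function name euclidean_distance and the variable names are placeholders I made up for illustration; nothing here is Elasticsearch-specific.

```python
import math


def euclidean_distance(a, b):
    """Return the Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    # Sum the squared per-dimension differences, then take the square root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# The two example products from above (hypothetical variable names).
product_a = [0.2, 0.2, 0.8]
product_b = [0.4, 0.9, 0.9]

score = euclidean_distance(product_a, product_b)
print(f"score: {score:.4f}")  # lower means more similar
```

Ideally the search engine would compute this score for every candidate product and return the results sorted by ascending distance.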
How do I do this with Elasticsearch? Is Elasticsearch the right tool for such a task?
Note that the real problem has a lot more than 3 dimensions.