http://spark.apache.org/docs/latest/mllib-feature-extraction.html#word2vec
In the Spark implementation of word2vec, when the number of iterations or the number of data partitions is greater than one, the reported cosine similarity is sometimes greater than 1.
As far as I know, cosine similarity should always satisfy -1 <= cos <= 1. Does anyone know why this happens?
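
For reference, this is the definition I am assuming, as a minimal plain-Python sketch (not Spark's code). The function names `cosine_similarity` and `half_normalized` are my own, hypothetical ones. The second function only illustrates one way a "similarity" can exceed 1, namely when the dot product is divided by just one of the two norms; I am not claiming this is what Spark actually does.

```python
import math


def cosine_similarity(a, b):
    """Standard definition: dot(a, b) / (|a| * |b|), bounded by [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def half_normalized(a, b):
    """Hypothetical faulty variant: divides by only one norm (assumption, for illustration)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / norm_b


a = [3.0, 4.0]
b = [6.0, 8.0]  # same direction as a, different length

print(cosine_similarity(a, b))  # properly normalized
print(half_normalized(a, b))    # not bounded by 1 unless a has unit length
```

With both norms in the denominator the result cannot leave [-1, 1] (up to floating-point rounding), so a value clearly above 1 suggests that at least one vector was not normalized when the similarity was computed.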