10

Is there any Python library with parallel version of t-SNE algorithm? Or does the multicore/parallel t-SNE algorithm exist?

I'm trying to reduce dimension (300d -> 2d) of all word2vecs in my vocabulary using t-SNE.

Problem: the size of vocabulary is about 130000 and it takes too long to proceed t-SNE for them.

Anton Karazeev
  • 604
  • 7
  • 13

1 Answers1

9

Yes there is a parallel version of the barnes-hutt implementation of t-SNE. https://github.com/DmitryUlyanov/Multicore-TSNE

There is also now a new implementation of tSNE that uses a Fast-Fourier transform funciton to significantly speed up the convolution step. It also uses the ANNOY library to perform the nearest neighbours search, the default tree-based method is also there and both take advantage of parallel processing.

Original code is available here: https://github.com/KlugerLab/FIt-SNE

and an R package version here: https://github.com/JulianSpagnuolo/FIt-SNE

JulianS
  • 188
  • 7
  • This implementation will likely disappoint anyone looking to parallize the bulk of tsne. The tsne algorithm has a few steps. One of the first steps is to compute nearest neighbors--this generally doesn't take very long and can be parallelized. The implementation pointed to here parallelizes that nearest neighbor calculation. The latter step is gradient descent--this is usually where most computation takes place. This isn't easily parallelized, and runs in serial in the implementation pointed to in this answer. – conradlee Aug 15 '17 at 13:42
  • 1
    I disagree up to a point - the benchmarks provided by the python implementation (linked above) and the R implementation (https://github.com/jkrijthe/Rtsne/issues/16#issuecomment-313717443) indicate a decent decrease in the amount of time taken to run the algorithm. Even if the speed up is only apparent in a certain step of tSNE, I will take that over no speed up. – JulianS Aug 23 '17 at 22:08
  • I can confirm the Multicore-TSNE implementation being significantly faster. Unfortunately this library only supports 3 (perplexity, n_iter, angle) parameters and ignores others without giving a warning. – Timomo Oct 17 '17 at 10:29