I am using t-SNE to make a 2D projection for visualization from a higher-dimensional dataset (30 dimensions in this case), and I have a question about the perplexity hyperparameter.
It's been a while since I used t-SNE, and I had previously only used it on smaller datasets (<1000 data points), where the advised perplexity of 5-50 (van der Maaten and Hinton) was sufficient to display the underlying data structure.
Currently, I am working with a dataset of 340,000 data points. Since perplexity controls the balance between local and non-local structure in the embedding, I suspect that a dataset this large requires a perplexity much higher than 50 (especially if the data is not well separated in the higher-dimensional space).
Does anyone have any experience with setting the optimal perplexity on datasets with a larger number of data points (>100k)?
I would be really interested to hear your experiences and which methods you use to determine the optimal perplexity (or optimal perplexity range).
An interesting article suggests that the optimal perplexity follows a simple power law (~N^0.5), where N is the number of data points. I would be interested to know what others think about that.
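To make the question concrete, here is a minimal sketch of what that rule would imply for my dataset sizes. It assumes the relationship is exactly perplexity = N^0.5 with a prefactor of 1; the article only gives the approximate scaling, so the prefactor and the function name are my own assumptions, not something taken from the article or from any t-SNE library.

```python
def power_law_perplexity(n_points, exponent=0.5):
    """Heuristic perplexity ~ N**exponent.

    Assumption: prefactor of 1; the article only states the scaling (~N^0.5).
    """
    if n_points < 1:
        raise ValueError("n_points must be a positive integer")
    return n_points ** exponent


# Compare a small dataset, the 100k threshold, and my 340,000-point dataset.
for n in (1_000, 100_000, 340_000):
    print(f"N = {n:>7,}: suggested perplexity ~ {power_law_perplexity(n):.0f}")
```

Under that assumption, 1,000 points gives roughly 32 (inside the usual 5-50 range), while 340,000 points gives roughly 583, which is well outside it.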
Thanks for your help