In machine learning and statistics, dimensionality reduction or dimension reduction is the process of reducing the number of random variables under consideration, and can be divided into feature selection and feature extraction.
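The distinction is easy to see in code. A minimal sketch, assuming scikit-learn and its bundled iris data: feature selection keeps a subset of the original variables, while feature extraction derives new ones from all of them.

# Feature selection vs. feature extraction (illustrative parameters).
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)

# Selection: keep 2 of the 4 original columns, ranked by an ANOVA F-test.
X_selected = SelectKBest(f_classif, k=2).fit_transform(X, y)

# Extraction: build 2 new columns as linear combinations of all 4.
X_extracted = PCA(n_components=2).fit_transform(X)

print(X_selected.shape, X_extracted.shape)  # (150, 2) (150, 2)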
Questions tagged [dimensionality-reduction]
422 questions
25
votes
5 answers
Plot PCA loadings and loading labels in a biplot in sklearn (like R's autoplot)
I saw this tutorial in R w/ autoplot. They plotted the loadings and loading labels:
autoplot(prcomp(df), data = iris, colour = 'Species',
loadings = TRUE, loadings.colour = 'blue',
loadings.label = TRUE, loadings.label.size =…
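A rough matplotlib/scikit-learn equivalent of the autoplot biplot, as a sketch: scaling the arrows by the square roots of the eigenvalues is one common loading convention, not the only one.

import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

iris = load_iris()
pca = PCA(n_components=2)
scores = pca.fit_transform(iris.data)
# Loadings: eigenvectors scaled by the square roots of the eigenvalues.
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)

plt.scatter(scores[:, 0], scores[:, 1], c=iris.target, alpha=0.5)
for name, (lx, ly) in zip(iris.feature_names, loadings):
    plt.arrow(0, 0, lx, ly, color='blue', head_width=0.05)
    plt.text(lx * 1.1, ly * 1.1, name, color='blue')
plt.xlabel('PC1'); plt.ylabel('PC2')
plt.show()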

O.rka
- 29,847
- 68
- 194
- 309
21
votes
6 answers
What is dimensionality in word embeddings?
I want to understand what is meant by "dimensionality" in word embeddings.
When I embed a word in the form of a matrix for NLP tasks, what role does dimensionality play? Is there a visual example which can help me understand this concept?
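A toy illustration, assuming nothing beyond numpy: an embedding maps each word to a dense vector, and the dimensionality is simply that vector's length (commonly 50-300 in trained embeddings; the 5-word vocabulary and random values here are hypothetical).

import numpy as np

vocab = ['cat', 'dog', 'car', 'truck', 'apple']
embedding_dim = 3                                 # the "dimensionality"
rng = np.random.default_rng(0)
E = rng.normal(size=(len(vocab), embedding_dim))  # lookup table, shape (5, 3)

cat_vector = E[vocab.index('cat')]                # one word -> one 3-d vector
print(cat_vector.shape)                           # (3,)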

manoveg
- 423
- 1
- 3
- 13
21
votes
6 answers
How to efficiently find k-nearest neighbours in high-dimensional data?
So I have about 16,000 75-dimensional data points, and for each point I want to find its k nearest neighbours (using Euclidean distance; currently k=2, if that makes it easier).
My first thought was to use a kd-tree for this, but as it turns out they…
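For this problem size, a sketch along the following lines may already be enough: at ~16,000 points in 75 dimensions, scikit-learn's brute-force search with vectorized Euclidean distances avoids the degradation kd-trees show in high dimensions.

import numpy as np
from sklearn.neighbors import NearestNeighbors

X = np.random.rand(16000, 75)       # stand-in for the real data

# Ask for k+1 neighbours because each point is its own nearest neighbour.
nn = NearestNeighbors(n_neighbors=3, algorithm='brute', metric='euclidean')
nn.fit(X)
dist, idx = nn.kneighbors(X)
neighbours = idx[:, 1:]             # drop self, keep the k=2 true neighbours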

Benno
- 5,288
- 5
- 42
- 60
17
votes
2 answers
Does attention make sense for Autoencoders?
I am struggling with the concept of attention in the context of autoencoders. I believe I understand the usage of attention with regard to seq2seq translation - after training the combined encoder and decoder, we can use both encoder and…
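For concreteness, a minimal sketch (PyTorch, illustrative only) of a sequence autoencoder whose decoder attends over the encoder states with dot-product attention:

import torch
import torch.nn as nn

class AttnAutoencoder(nn.Module):
    def __init__(self, feat=8, hidden=16):
        super().__init__()
        self.enc = nn.GRU(feat, hidden, batch_first=True)
        self.dec = nn.GRU(feat, hidden, batch_first=True)
        self.out = nn.Linear(2 * hidden, feat)   # decoder state + context

    def forward(self, x):                        # x: (batch, seq, feat)
        enc_out, h = self.enc(x)                 # enc_out: (batch, seq, hidden)
        dec_out, _ = self.dec(torch.zeros_like(x), h)
        # Dot-product attention of each decoder step over all encoder steps.
        scores = torch.bmm(dec_out, enc_out.transpose(1, 2))
        context = torch.bmm(torch.softmax(scores, dim=-1), enc_out)
        return self.out(torch.cat([dec_out, context], dim=-1))

x = torch.randn(4, 10, 8)
loss = nn.functional.mse_loss(AttnAutoencoder()(x), x)

Note the trade-off this makes explicit: unrestricted attention gives the decoder direct access to every encoder state, which weakens the compression bottleneck autoencoders are usually built around - that tension is the crux of the question.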

user3641187
- 405
- 5
- 10
16
votes
4 answers
Choosing a subset of farthest points from a given set of points
Imagine you are given a set S of n points in 3 dimensions. The distance between any 2 points is the simple Euclidean distance. You want to choose a subset Q of k points from this set such that they are farthest from each other. In other words, there is no other…
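This objective (max-min dispersion) is NP-hard in general; a common greedy heuristic is farthest-point sampling, sketched below, which repeatedly adds the point farthest from everything chosen so far.

import numpy as np

def farthest_points(S, k):
    S = np.asarray(S, dtype=float)
    chosen = [0]                              # start from an arbitrary point
    d = np.linalg.norm(S - S[0], axis=1)      # distance to the chosen set
    for _ in range(k - 1):
        nxt = int(np.argmax(d))               # farthest from the current set
        chosen.append(nxt)
        d = np.minimum(d, np.linalg.norm(S - S[nxt], axis=1))
    return S[chosen]

Q = farthest_points(np.random.rand(100, 3), k=5)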

Shital Shah
- 63,284
- 17
- 238
- 185
16
votes
1 answer
How to compare predictive power of PCA and NMF
I would like to compare the output of an algorithm under two different preprocessings: NMF and PCA.
To get a somewhat comparable result, instead of just choosing the same number of components for both PCA and NMF, I would like to pick the…
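One way to put the two on a comparable footing (a sketch of one possible criterion, not necessarily the asker's): match PCA and NMF by relative reconstruction error rather than by component count.

import numpy as np
from sklearn.decomposition import PCA, NMF

X = np.abs(np.random.rand(200, 50))           # NMF needs non-negative data

def recon_error(model, X):
    Z = model.fit_transform(X)
    return np.linalg.norm(X - model.inverse_transform(Z)) / np.linalg.norm(X)

for n in (5, 10, 20):
    print(n,
          recon_error(PCA(n_components=n), X),
          recon_error(NMF(n_components=n, init='nndsvda', max_iter=500), X))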

Phil D
- 161
- 1
- 5
13
votes
5 answers
Supervised Dimensionality Reduction for Text Data in scikit-learn
I'm trying to use scikit-learn to do some machine learning on natural language data. I've got my corpus transformed into bag-of-words vectors (which take the form of a sparse CSR matrix) and I'm wondering if there's a supervised dimensionality…
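One supervised option that works directly on a sparse CSR matrix is chi-squared feature selection; a sketch with random placeholder data standing in for the corpus:

import numpy as np
from scipy.sparse import random as sparse_random
from sklearn.feature_selection import SelectKBest, chi2

X = sparse_random(100, 5000, density=0.01, format='csr', random_state=0)
y = np.random.randint(0, 2, size=100)         # class labels

# chi2 accepts sparse non-negative input and uses y, i.e. it is supervised.
X_reduced = SelectKBest(chi2, k=500).fit_transform(X, y)
print(X_reduced.shape)                        # (100, 500)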

follyroof
- 3,430
- 2
- 28
- 26
11
votes
4 answers
t-SNE predictions in R
Goal: I aim to use t-SNE (t-distributed Stochastic Neighbor Embedding) in R for dimensionality reduction of my training data (with N observations and K variables, where K>>N) and subsequently to come up with the t-SNE representation for my test…
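Standard t-SNE defines no out-of-sample mapping, which is the crux here. If a Python detour is acceptable, the openTSNE library provides an embedding object that can place new points into an existing map; a sketch with random stand-in data:

import numpy as np
from openTSNE import TSNE

X_train = np.random.rand(200, 50)
X_test = np.random.rand(20, 50)

embedding_train = TSNE(n_components=2).fit(X_train)
X_test_2d = embedding_train.transform(X_test)   # embeds the test points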

DAW
- 251
- 3
- 8
11
votes
2 answers
LDA ignoring n_components?
When I am trying to work with LDA from Scikit-Learn, it keeps giving me only one component, even though I am asking for more:
>>> from sklearn.lda import LDA
>>> x = np.random.randn(5,5)
>>> y = [True, False, True, False, True]
>>> for i in…
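The likely cause: LDA can produce at most min(n_classes - 1, n_features) discriminant components, so with a binary target only one component exists no matter what n_components requests. A sketch with the modern import path (sklearn.lda has since been removed):

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

x = np.random.randn(5, 5)
y = [True, False, True, False, True]

# Two classes -> at most one discriminant direction, by construction.
lda = LinearDiscriminantAnalysis(n_components=1).fit(x, y)
print(lda.transform(x).shape)   # (5, 1)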

Andrew Latham
- 5,982
- 14
- 47
- 87
10
votes
1 answer
Parallel version of t-SNE
Is there any Python library with a parallel version of the t-SNE algorithm?
Or does a multicore/parallel t-SNE algorithm exist at all?
I'm trying to reduce the dimensionality (300d -> 2d) of all the word2vec vectors in my vocabulary using t-SNE.
Problem: the size of vocabulary…
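Two options that exist today (a sketch; check your installed versions): scikit-learn's own TSNE accepts n_jobs since 0.22, and openTSNE or MulticoreTSNE parallelize more aggressively.

import numpy as np
from sklearn.manifold import TSNE

X = np.random.rand(1000, 300)                 # stand-in for the word2vec matrix
X_2d = TSNE(n_components=2, n_jobs=-1).fit_transform(X)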

Anton Karazeev
- 604
- 7
- 13
8
votes
2 answers
PCA for dimensionality reduction before Random Forest
I am working on a binary-class random forest with approximately 4500 variables. Many of these variables are highly correlated, and some of them are just quantiles of an original variable. I am not quite sure whether it would be wise to apply PCA for…
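Whether PCA helps here is an empirical question, but a sketch of the safe way to test it: put PCA inside a Pipeline so the projection is fit only on training folds, then cross-validate against the raw features.

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X = np.random.rand(300, 50)                   # stand-in for the 4500 variables
y = np.random.randint(0, 2, size=300)

pipe = Pipeline([
    ('pca', PCA(n_components=0.95)),          # keep 95% of the variance
    ('rf', RandomForestClassifier(n_estimators=200, random_state=0)),
])
print(cross_val_score(pipe, X, y, cv=5).mean())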

Rita A. Singer
- 167
- 3
- 7
7
votes
4 answers
How to use the reduced data - the output of principal component analysis
I am finding it hard to link the theory with the implementation. I would appreciate help in knowing where my understanding is wrong.
Notation: matrices are in bold capital letters and vectors in bold lowercase letters.
X is a dataset of n observations, each of …
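A sketch of the standard workflow the theory maps onto: the "reduced data" are the scores, i.e. the centered observations projected onto the principal axes, and those scores feed any downstream model.

import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(100, 20)                   # n observations x p features

pca = PCA(n_components=5).fit(X)
Z = pca.transform(X)                          # scores (100, 5): the reduced data
X_approx = pca.inverse_transform(Z)           # reconstruction back in 20-d

# Equivalently: subtract the mean, then project onto the loading matrix W.
W = pca.components_.T
assert np.allclose(Z, (X - pca.mean_) @ W)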

SKM
- 959
- 2
- 19
- 45
7
votes
2 answers
Rotation argument for scikit-learn's factor analysis
One of the hallmarks of factor analysis is that it allows for non-orthogonal latent variables.
In R, for example, this feature is accessible via the rotation parameter of factanal.
Is there any such provision for sklearn.decomposition.FactorAnalysis?…
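Since scikit-learn 0.24 there is such a provision, though a limited one: FactorAnalysis accepts rotation='varimax' or 'quartimax', both of which are orthogonal rotations; for oblique rotations (e.g. promax, as factanal offers) you would still rotate the loadings yourself. A sketch:

import numpy as np
from sklearn.decomposition import FactorAnalysis

X = np.random.rand(100, 10)
fa = FactorAnalysis(n_components=3, rotation='varimax').fit(X)
print(fa.components_.shape)                   # (3, 10)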

TheChymera
- 17,004
- 14
- 56
- 86
7
votes
2 answers
scikit KernelPCA unstable results
I'm trying to use KernelPCA for reducing the dimensionality of a dataset to 2D (both for visualization purposes and for further data analysis).
I experimented with computing KernelPCA using an RBF kernel at various values of gamma, but the result is…
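Two stabilizers worth trying, as a sketch: fix random_state (it only matters when the 'arpack' or 'randomized' eigensolver is used) or force the deterministic dense solver. Sign flips of individual components remain possible either way, since eigenvectors are only defined up to sign.

import numpy as np
from sklearn.decomposition import KernelPCA

X = np.random.rand(200, 10)
kpca = KernelPCA(n_components=2, kernel='rbf', gamma=0.1,
                 eigen_solver='dense', random_state=0)
X_2d = kpca.fit_transform(X)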

fferri
- 18,285
- 5
- 46
- 95
6
votes
2 answers
With a PyTorch LSTM, can I have a different hidden_size than input_size?
I have:
def __init__(self, feature_dim=15, hidden_size=5, num_layers=2):
    super(BaselineModel, self).__init__()
    self.num_layers = num_layers
    self.hidden_size = hidden_size
    self.lstm =…
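Yes: nn.LSTM decouples the two sizes, with input_size fixing the per-step feature width and hidden_size chosen independently. A sketch using the question's dimensions:

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=15, hidden_size=5, num_layers=2, batch_first=True)
x = torch.randn(8, 30, 15)      # (batch, seq_len, feature_dim)
out, (h, c) = lstm(x)
print(out.shape)                # torch.Size([8, 30, 5])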

Shamoon
- 41,293
- 91
- 306
- 570