Questions tagged [nmf]

Non-negative matrix factorization (NMF or NNMF), also called non-negative matrix approximation, is a group of algorithms in multivariate analysis and linear algebra in which a matrix V is factorized into (usually) two matrices W and H, with the property that all three matrices have no negative elements.

NMF is a technique to approximate a matrix V as a product, V ≈ WH. The dimensions of V, W, and H are m×n, m×p, and p×n respectively, where usually p << n. W can be thought of as a weight matrix for hidden variables. Because p can be very small, NMF can also be viewed as a dimensionality reduction technique.
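As a rough illustration of the factorization described above, here is a minimal NumPy sketch that uses the multiplicative update rules for the Frobenius-norm objective (often attributed to Lee and Seung). The function name `nmf_multiplicative`, the iteration count, the `eps` guard, and the random test matrix are all assumptions made for this example; library implementations such as scikit-learn's use different solvers and defaults.

```python
import numpy as np

def nmf_multiplicative(V, p, n_iter=200, eps=1e-10, seed=0):
    """Approximate a non-negative V (m x n) as W (m x p) @ H (p x n).

    Hypothetical helper for illustration only: plain multiplicative
    updates that minimize the Frobenius norm of V - W @ H.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    # Strictly positive random start so the updates can move every entry.
    W = rng.random((m, p)) + eps
    H = rng.random((p, n)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H
        # stay non-negative; eps avoids division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Build a small non-negative matrix of rank 2 and factorize it with p = 2.
rng = np.random.default_rng(42)
V = rng.random((6, 2)) @ rng.random((2, 5))
W, H = nmf_multiplicative(V, p=2)

# Relative reconstruction error of the approximation V ≈ W @ H.
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(W.shape, H.shape, rel_error)
```

Here W has shape m×p and H has shape p×n, matching the dimensions given above; a smaller p gives a more compact, but usually less accurate, representation of V.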

NMF is widely applicable in real-world cases where V cannot have negative values, such as recommender systems.

Because this is an advanced topic, questions with this tag should state the mathematics clearly and include information about the specific application.

77 questions
16
votes
1 answer

How to compare predictive power of PCA and NMF

I would like to compare the output of an algorithm on data preprocessed in two different ways: NMF and PCA. To get a somewhat comparable result, instead of just choosing the same number of components for both PCA and NMF, I would like to pick the…
12
votes
1 answer

Using scikit-learn NMF with a precomputed set of basis vectors (Python)

I want to use scikit-learn NMF (from here) (or any other NMF if it does the job, actually). Specifically, I have an input matrix (which is an audio magnitude spectrogram), and I want to decompose it. I already have the W matrix pre-computed. How do…
pavlos163
  • 2,730
  • 4
  • 38
  • 82
8
votes
1 answer

Very Large and Very Sparse Non Negative Matrix factorization

I have a very large and very sparse matrix (531K x 315K); the total number of cells is ~167 billion. The non-zero values are all 1s. The total number of non-zero values is around 45K. Is there an efficient NMF package to solve my problem? I know there…
mgokhanbakal
  • 1,679
  • 1
  • 20
  • 26
6
votes
1 answer

Reconstructing new data using sklearn NMF components Vs inverse_transform does not match

I fit a scikit-learn NMF model on my training data. Now I perform an inverse transform of new data using: result_1 = model.inverse_transform(model.transform(new_data)). Then I compute the inverse transform of my data manually taking the…
swathis
  • 336
  • 5
  • 17
5
votes
1 answer

Numpy gives "TypeError: can't multiply sequence by non-int of type 'float'"

The problematic part is: self.H = np.multiply(self.H, np.divide(np.matmul(preprocessing.normalize(self.W).T, np.multiply(self.X, np.power(self.A, self.beta - 2)))), np.matmul(self.W.T, np.power(self.A, self.beta - 1)) + self.sparsity) A, W, H…
Skywalker
  • 582
  • 1
  • 5
  • 16
5
votes
1 answer

SKLearn NMF Vs Custom NMF

I am trying to build a recommendation system using non-negative matrix factorization. Using scikit-learn NMF as the model, I fit my data, resulting in a certain loss (i.e., reconstruction error). Then I generate recommendations for new data using the…
swathis
  • 336
  • 5
  • 17
5
votes
1 answer

Is there good library to do nonnegative matrix factorization (NMF) fast?

I have a sparse matrix whose shape is 570000*3000. I tried nimfa to do NMF (using the default nmf method, and setting max_iter to 65). However, I found nimfa very slow. Has anyone used a faster library to do NMF?
Hanfei Sun
  • 45,281
  • 39
  • 129
  • 237
4
votes
1 answer

NMF as a clustering method in Python Scikit

I am working on implementing a Python script for clustering text data with NMF. In my work I am using the Scikit NMF implementation; however, as I understand it, NMF in Scikit is more like a classification method than a clustering method. I have developed a simple…
rafmat24
  • 83
  • 2
  • 8
4
votes
1 answer

probability distribution of topics using NMF

I use the following code to do the topic modeling on my documents: from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer tfidf_vectorizer = TfidfVectorizer(tokenizer=tokenize, max_df=0.85, min_df=3, ngram_range=(1,5)) tfidf =…
4
votes
2 answers

Scikit-learn non-negative matrix factorization (NMF) for sparse matrix

I am using scikit-learn's non-negative matrix factorization (NMF) on a sparse matrix where the zero entries are missing data. I was wondering whether scikit-learn's NMF implementation treats zero entries as 0 or as missing data. Thank you!
Alex
  • 317
  • 3
  • 9
3
votes
3 answers

How to use sklearn's Matrix factorization to predict new users' recommendation scores

I'm trying to apply sklearn.decomposition.NMF to a matrix R that contains data on how users rated items, in order to predict user ratings for items that they have not yet seen. The matrix's rows are users, its columns are items, and its values are scores, with 0…
3
votes
1 answer

Fast NMF in R on sparse matrices

I'm looking for a fast NMF implementation for sparse matrices in R. The R NMF package consists of a number of algorithms, none of which impress in terms of computational time. NNLM::nnmf() seems state of the art in R at the moment, specifically the…
zdebruine
  • 3,687
  • 6
  • 31
  • 50
3
votes
0 answers

Rcpp port of a linear algebra function in R

The following is an objective function for symmetrical non-negative matrix factorization that I'm trying to port into Rcpp: fit_H <- function(W,H, num.iter){ for(i in 1:num.iter){ H <- 0.5*(H*(1+(crossprod(W,H)/tcrossprod(H,crossprod(H))))) …
zdebruine
  • 3,687
  • 6
  • 31
  • 50
3
votes
0 answers

R png()/pdf() doesn't work when running script but works if executing step by step

I'm creating a script to cluster my data in a server. I need to save the text output and the images as well. The text output works just fine but when I try to use the png() + plot() + dev.off() thing to save the plots, no image is created. [ADDED…
TheDuckman
  • 31
  • 4
3
votes
1 answer

Reconstruction error on test set for NMF (aka NNMF) in scikit-learn

I am performing topic extraction on natural language data using NMF (aka NNMF) from scikit-learn. I am trying to optimize the number of clusters (aka components). In order to do this, I need to calculate the reconstruction error. However, using…
user179041
  • 136
  • 1
  • 7