
I have a large similarity matrix with 444 columns. I want to plot a heatmap or corrplot to compare different similarity matrices, but I can't plot all the columns at once. I would like to take a random sample of columns and plot a heatmap for just those, without recomputing the similarities for the sampled columns, because some of my similarity functions take a long time to run. How can I take a random sample of columns from a similarity matrix (it has the same structure as a correlation matrix) and plot a heatmap for them?

v_toria
  • Please consider producing a [minimal reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). I presume your similarity matrix is square and symmetric (444x444), in which case you could try the `sample` function to select random indices of the matrix. How many random samples (and of what size) would you like to draw? – TWL Mar 06 '14 at 11:18
  • I want to draw a matrix of about 20x20. It can be triangular, but it has to keep the correlation-matrix structure. So you think it would be more efficient to compute a new similarity matrix for that random sample and plot it than to take a sample from the existing 444x444 matrix? – v_toria Mar 06 '14 at 12:10
  • Both options are feasible, but the solution depends on the similarity measure that you are using, the dataset, and your end goal. These are unclear at the moment. Before we recommend anything, please refine your question and provide a sample dataset. – TWL Mar 06 '14 at 14:51
  • The columns of the data matrix correspond to different products and the rows to different users. The matrix is filled with users' ratings for items and is very sparse. Similarity between matrix columns is measured by cosine, correlation-based, and adjusted cosine similarity measures, so I actually have three 444x444 matrices. I want to take a random sample of products, the same for all similarity matrices, plot heatmaps, and compare the results. – v_toria Mar 06 '14 at 19:59
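
The subsetting idea discussed in the comments can be sketched as follows. This is only an illustration, written in plain Python rather than R, and the helper names (`sample_indices`, `submatrix`, `toy_similarity`) and the toy matrices are assumptions made for the example, not anything from the question. The point it shows: draw one set of 20 indices, then apply those same indices to both the rows and the columns of each precomputed 444x444 matrix, so nothing is recomputed and each block keeps the symmetric, unit-diagonal structure.

```python
import random


def sample_indices(n, k, seed=None):
    """Draw k distinct indices from range(n); sorted so the original order is kept."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n), k))


def submatrix(matrix, idx):
    """Select the same indices on rows and columns, so the block stays square."""
    return [[matrix[i][j] for j in idx] for i in idx]


def toy_similarity(n, seed):
    """Hypothetical stand-in for a precomputed similarity matrix:
    symmetric, with ones on the diagonal."""
    rng = random.Random(seed)
    m = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            m[i][j] = m[j][i] = rng.uniform(-1.0, 1.0)
    return m


n = 444
# Placeholders for the three matrices mentioned in the comments.
matrices = {
    "cosine": toy_similarity(n, 1),
    "correlation": toy_similarity(n, 2),
    "adjusted_cosine": toy_similarity(n, 3),
}

# One draw of indices, reused for every matrix so the heatmaps are comparable.
idx = sample_indices(n, 20, seed=42)
subsets = {name: submatrix(m, idx) for name, m in matrices.items()}
```

Each 20x20 block in `subsets` can then be passed to whatever heatmap function is in use. In R the same idea would presumably be `idx <- sample(ncol(S), 20)` followed by `S[idx, idx]` for each matrix `S`, reusing the same `idx` throughout.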

0 Answers