''' I made a clustermap of thousands of genes using seaborn. Because I'm interested in only a few genes, I'd like to display only those genes of interest as ytick labels. I'm trying to work it out using the iris dataset. Please find my code below. I'm not sure how to place the samples of interest at their correct positions. Thank you in advance for your help.

'''

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

iris = sns.load_dataset('iris')
samples = ['sample_'+str(x) for x in list(iris.index)] # create sample IDs that line up with the internal index
iris.insert(0,'Sample_ID',samples) 
samples_of_interest = ['sample_41','sample_34','sample_114','sample_55'] #samples to be visible on ytick

sns.clustermap(iris.iloc[:,1:-1],yticklabels=samples_of_interest) # Not giving the expected result: the samples of interest are not placed at their correct positions

plt.show()
plt.close()
Amilovsky

1 Answer

Here's why your code wasn't working:

See this about the yticklabels argument in the documentation:

If list-like, plot these alternate labels as the xticklabels.

So when you pass only a few tick labels, seaborn simply uses those names as the tick labels, with no knowledge of which rows they belong to. One way around this is to build sample_labels, which holds a label for every row but sets the uninteresting ones to None. You can then follow this answer to rotate the tick labels:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

iris = sns.load_dataset('iris')
samples = ['sample_'+str(x) for x in list(iris.index)]
iris.insert(0,'Sample_ID',samples) 
samples_of_interest = ['sample_41','sample_34','sample_114','sample_55']

sample_labels = [i if i in samples_of_interest else None
                 for i in iris['Sample_ID'] ]

cm=sns.clustermap(iris.iloc[:,1:-1], yticklabels=sample_labels)
plt.setp(cm.ax_heatmap.yaxis.get_majorticklabels(), rotation=0)


But this is still not ideal, because tick marks are drawn at every position. I'm sure there is a way to edit this, but instead...

Here's a method I like more:

Get the new order of the samples from the ClusterGrid (the object returned by clustermap), then manually set the y-tick positions and labels (with credit to this post):

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

iris = sns.load_dataset('iris')

samples_of_interest = [41, 34, 114, 55]
sample_names = ['Sample ' + str(i) for i in samples_of_interest]

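cm = sns.clustermap(iris.iloc[:, :-1])  # note the iloc slice has changed: there is no Sample_ID column here

# reordered_ind lists the original row indices in the order they are drawn
reorder = cm.dendrogram_row.reordered_ind
# heatmap cells are centered at half-integer positions, hence the + 0.5
new_positions = [reorder.index(i) + 0.5 for i in samples_of_interest]
cm.ax_heatmap.yaxis.set_ticks(new_positions)
cm.ax_heatmap.yaxis.set_ticklabels(sample_names)

Wrapping these set_... calls in plt.setp(...) is not needed: given only the returned objects and no properties to set, plt.setp just prints a list of settable properties, which does not affect the plot.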
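Since the real data is indexed by gene names rather than integers, the same lookup can be sketched for named rows. This is only an illustration: the helper name positions_for_labels, the gene names, and the toy ordering are made up, and the toy list stands in for cm.dendrogram_row.reordered_ind. It builds the lookup once as a dict, which avoids repeated list.index scans when there are thousands of rows.

```python
def positions_for_labels(row_labels, reordered_ind, labels_of_interest):
    """Return heatmap y positions for the rows named in labels_of_interest.

    row_labels: labels in the original (pre-clustering) row order,
        e.g. list(df.index).
    reordered_ind: original row indices in the order they are drawn,
        e.g. cm.dendrogram_row.reordered_ind.
    """
    # Map each label to the row at which it is drawn after clustering.
    drawn_row = {row_labels[orig]: pos for pos, orig in enumerate(reordered_ind)}
    # Heatmap cells are centered at half-integer coordinates.
    return [drawn_row[label] + 0.5 for label in labels_of_interest]


# Toy data (hypothetical): four genes and a made-up clustering order.
genes = ["geneA", "geneB", "geneC", "geneD"]
toy_order = [2, 0, 3, 1]  # drawn order is geneC, geneA, geneD, geneB
print(positions_for_labels(genes, toy_order, ["geneD", "geneA"]))  # [2.5, 1.5]
```

The returned positions can then be passed to cm.ax_heatmap.yaxis.set_ticks, with the labels of interest passed to set_ticklabels in the same order.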

Tom