I want to cluster the elements of a matrix by similarity, but my code produces the same dendrogram even when I change the values in the matrix. (In this case the positions of the matrix elements in the heatmap change, but the dendrogram doesn't.) Do you know how I can fix the code?

Please run the code as it is. Then change both occurrences of 0.91 to 0.11 and run the code again. You'll see what I mean.

Please compare the two figures. The matrix elements sit at different positions in the two heatmaps. The positions of the matrix elements in the heatmap shouldn't change.

import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
import seaborn as sns
import pandas as pd
from matplotlib import rcParams
from scipy.spatial.distance import pdist, squareform
import scipy.cluster.hierarchy as hcluster

methods = ["A", "B", "C"]

values = np.array([[0.00, 0.91, 0.73],
                    [0.91, 0.00, 0.24],
                    [0.73, 0.24, 0.00]])

kws = dict(cbar_kws=dict(ticks=[0, 0.50, 1.0], orientation='vertical'), figsize=(4, 4))
g = sns.clustermap(values, cmap="magma", row_cluster=True, col_cluster=True, yticklabels=True, xticklabels=True, **kws, dendrogram_ratio=(.1, .1), cbar_pos=(1.08, 0.10, 0.03, 0.78), vmin=0, vmax=1, annot=True, annot_kws={"fontsize":8, 'color':'w'}, linewidths=0, linecolor='white')
g.ax_cbar.set_ylabel("value", size=10, rotation=90)
g.ax_cbar.yaxis.set_ticks_position("right")
g.ax_cbar.tick_params(labelsize=8)
g.ax_col_dendrogram.set_visible(False)
g.fig.suptitle('Title',size=8, y=0.93) 

plt.setp(g.ax_heatmap.set_xticklabels(methods), fontsize=8)
plt.setp(g.ax_heatmap.set_yticklabels(methods), fontsize=8, rotation=0)

plt.savefig("figure.png", dpi=300, bbox_inches='tight')

[Figure: plot with 0.91]

[Figure: plot with 0.91 replaced with 0.11]

Trenton McKinney
qasim

1 Answer

  • seaborn.clustermap reorders the rows and columns of the data to match the leaf order of the dendrogram, so the positions of the matrix elements in the heatmap change whenever the clustering changes.
  • g.ax_heatmap.set_xticklabels(methods) and g.ax_heatmap.set_yticklabels(methods) overwrite the x and y tick labels with the original A, B, C order, regardless of how the rows and columns were reordered. The labels therefore no longer match the data they sit next to.
  • Converting the array to a pandas.DataFrame with column and index labels lets seaborn.clustermap map the tick labels correctly.
  • Tested in Python 3.11, pandas 1.5.1, matplotlib 3.6.2, seaborn 0.12.1.
import pandas as pd
import numpy as np
import seaborn as sns

v1 = np.array([[0.00, 0.91, 0.73],
               [0.91, 0.00, 0.24],
               [0.73, 0.24, 0.00]])

v2 = np.array([[0.00, 0.11, 0.73],
               [0.11, 0.00, 0.24],
               [0.73, 0.24, 0.00]])

data = [v1, v2]

kws = dict(cbar_kws=dict(ticks=[0, 0.50, 1.0], orientation='vertical'), figsize=(4, 4))

for d in data:
    
    # since your data are the same on the columns and the index, use the same labels
    d = pd.DataFrame(data=d, columns=["A", "B", "C"], index=["A", "B", "C"])
    
    # now plot the dataframe
    g = sns.clustermap(d, cmap="magma", row_cluster=True, col_cluster=True, yticklabels=True, xticklabels=True, **kws,
                   dendrogram_ratio=(.1, .1), cbar_pos=(1.08, 0.10, 0.03, 0.78), vmin=0, vmax=1, annot=True,
                   annot_kws={"fontsize":8, 'color':'w'}, linewidths=0, linecolor='white')
    
    print('xticklabels: ', g.ax_heatmap.get_xticklabels())
    print('yticklabels: ', g.ax_heatmap.get_yticklabels())

Output

xticklabels:  [Text(0.5, 0, 'A'), Text(1.5, 0, 'B'), Text(2.5, 0, 'C')]
yticklabels:  [Text(1, 0.5, 'A'), Text(1, 1.5, 'B'), Text(1, 2.5, 'C')]

xticklabels:  [Text(0.5, 0, 'C'), Text(1.5, 0, 'A'), Text(2.5, 0, 'B')]
yticklabels:  [Text(1, 0.5, 'C'), Text(1, 1.5, 'A'), Text(1, 2.5, 'B')]
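
The reordering in the second block of output can be reproduced without plotting. The following is only a sketch: it assumes clustermap's documented defaults (method="average", metric="euclidean", rows treated as observations) and computes the leaf order directly with SciPy. On the seaborn side, the same permutation should be available as g.dendrogram_row.reordered_ind.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

methods = ["A", "B", "C"]

# v2 from the answer above (0.91 replaced with 0.11)
v2 = np.array([[0.00, 0.11, 0.73],
               [0.11, 0.00, 0.24],
               [0.73, 0.24, 0.00]])

# Assumption: mirror clustermap's defaults, with each row as one observation.
Z = linkage(v2, method="average", metric="euclidean")

# Leaf order of the dendrogram, computed without drawing anything.
order = dendrogram(Z, no_plot=True)["leaves"]

# Labels that belong to the reordered rows and columns.
correct = [methods[i] for i in order]

# What set_xticklabels(methods) writes: the original order, position by position.
overwritten = list(methods)

print("leaf order:        ", order)
print("correct labels:    ", correct)
print("overwritten labels:", overwritten)
```

If the printed correct labels differ from the overwritten ones, the heatmap rows and columns have moved while the hand-set tick labels have not, which is why the two figures in the question look mislabeled.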

Plots

[Figure: clustermap for v1]

[Figure: clustermap for v2]
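
If the matrix is meant to hold precomputed pairwise distances (it is symmetric with a zero diagonal), note that clustermap does not read it that way by default: it computes new distances between the rows. A minimal sketch of clustering on the values themselves, assuming they really are distances and that average linkage is wanted, could look like this. The helper name distance_linkage is made up for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

methods = ["A", "B", "C"]

v1 = np.array([[0.00, 0.91, 0.73],
               [0.91, 0.00, 0.24],
               [0.73, 0.24, 0.00]])

v2 = np.array([[0.00, 0.11, 0.73],
               [0.11, 0.00, 0.24],
               [0.73, 0.24, 0.00]])


def distance_linkage(matrix):
    """Hypothetical helper: cluster a symmetric, zero-diagonal matrix
    as precomputed distances (assumption: the values are distances)."""
    # squareform turns the square matrix into the condensed 1-D form
    # that linkage expects for precomputed distances.
    condensed = squareform(matrix, checks=True)
    return linkage(condensed, method="average")


for name, m in (("v1", v1), ("v2", v2)):
    Z = distance_linkage(m)
    # The first row of Z holds the first pair merged and its distance.
    first_pair = sorted(methods[int(i)] for i in Z[0, :2])
    print(name, "first merge:", first_pair, "at distance", Z[0, 2])
```

A linkage computed this way can be handed to seaborn.clustermap through its row_linkage and col_linkage parameters. clustermap will still reorder the heatmap to follow the dendrogram, so keeping the cells in a fixed A, B, C order would mean drawing the heatmap and the dendrogram separately.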

Trenton McKinney
  • The tick locations and labels shouldn't change when a value of the matrix is changed. The three diagonal elements should stay the same (zero). The positions of the matrix elements in the heatmap shouldn't change either. – qasim Nov 14 '22 at 20:32
  • @qasim that is not how the clustermap api works – Trenton McKinney Nov 14 '22 at 20:32
  • The dendrogram should change, not the tick locations/labels or the positions of the matrix elements. Please see this: https://stackoverflow.com/questions/29022451/dendrogram-through-scipy-given-a-similarity-matrix – qasim Nov 14 '22 at 20:36
  • @qasim I'm not arguing what it should or shouldn't do. I'm showing you what the API does, and why your clustermaps appear the same when the data changes. – Trenton McKinney Nov 14 '22 at 20:39