I need to get the heights of the nodes off a dendrogram. My plan was to use dcoord/icoord, but I can't reconcile the heights I see in the resulting plot with the values in the respective 2D arrays.

According to this post, dcoord and icoord reflect the coordinates of the nodes in the tree. In the scipy.cluster.hierarchy.dendrogram documentation, the meaning of icoord/dcoord is pretty nondescript, so I'm not able to work out what's going on.

I would appreciate any insights.

import numpy as np
import pandas as pd
from scipy.cluster import hierarchy
import matplotlib.pyplot as plt

data = [[24, 16], [13, 4], [24, 11], [34, 18], [41, 6], [35, 13]]
frame = pd.DataFrame(np.array(data),
                     columns=["a", "b"],
                     index=["Atlanta", "Boston", "Chicago", "Dallas", "Denver", "Detroit"])

Z = hierarchy.linkage(frame, 'single')
plt.figure()
dn = hierarchy.dendrogram(Z, labels=frame.index)

The heights would be something like [15, 10, 9, 5], but I want Python to give me this array rather than eyeballing it from the plot.

wiscoYogi
  • Take a look at ["how to plot and annotate hierarchical clustering dendrograms in scipy/matplotlib"](https://stackoverflow.com/questions/11917779/how-to-plot-and-annotate-hierarchical-clustering-dendrograms-in-scipy-matplotlib). In particular, check my answer for code that annotates the dendrogram with the distance between pairs of nodes. – Warren Weckesser Jul 11 '21 at 07:02

1 Answer

You can pull that information from the second or third element of the lists in dn['dcoord'].

Here, I have already run your script. The y-coordinates are taken from index 1 of each list in dn['dcoord']:

In [190]: [y[1] for y in dn['dcoord']]
Out[190]: 
[5.0,
 5.0990195135927845,
 9.219544457292887,
 10.198039027185569,
 13.038404810405298]
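
To see why index 1 works (and why index 2 gives the same values), it can help to print the points of each link. The sketch below rebuilds the question's data as a plain NumPy array and passes `no_plot=True` so nothing is drawn; the variable name `heights` is my own. Each entry of `icoord`/`dcoord` holds the four corners of one ∩-shaped link, so the two middle points form the horizontal bar at the merge height. As a cross-check, the same heights should also appear in the third column of the linkage matrix `Z`.

```python
import numpy as np
from scipy.cluster import hierarchy

data = np.array([[24, 16], [13, 4], [24, 11], [34, 18], [41, 6], [35, 13]])
Z = hierarchy.linkage(data, "single")

# no_plot=True returns the same dictionary without drawing anything.
dn = hierarchy.dendrogram(Z, no_plot=True)

# Each link is four (x, y) points:
# bottom of left leg, top-left corner, top-right corner, bottom of right leg.
for xs, ys in zip(dn["icoord"], dn["dcoord"]):
    print(list(zip(xs, ys)))

# The horizontal bar sits at the merge height, so ys[1] == ys[2] for every link.
heights = sorted(ys[1] for ys in dn["dcoord"])
print(heights)

# Cross-check: column 2 of the linkage matrix stores the merge distances.
assert np.allclose(heights, np.sort(Z[:, 2]))
```

Note that the order of the links in `dn['dcoord']` follows how the tree is drawn, not necessarily the order of the rows in `Z`, which is why both sides are sorted before comparing.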

As I noted in a comment, you can see this data being used to augment a dendrogram in this answer.
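
For convenience, here is a rough sketch of that kind of annotation applied to the question's data. It is not the exact code from the linked answer; the marker style, text offset, and the `annotations` list are my own choices for illustration. The x position of each label is the midpoint of the link's horizontal bar, and the y position is the merge height.

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy.cluster import hierarchy

labels = ["Atlanta", "Boston", "Chicago", "Dallas", "Denver", "Detroit"]
data = np.array([[24, 16], [13, 4], [24, 11], [34, 18], [41, 6], [35, 13]])
Z = hierarchy.linkage(data, "single")

fig, ax = plt.subplots()
dn = hierarchy.dendrogram(Z, labels=labels, ax=ax)

annotations = []  # (x, y) position of each labelled merge
for xs, ys in zip(dn["icoord"], dn["dcoord"]):
    x = 0.5 * (xs[1] + xs[2])  # midpoint of the horizontal bar
    y = ys[1]                  # merge height
    ax.plot(x, y, "o", color="black")
    ax.annotate(f"{y:.2f}", (x, y), xytext=(0, -8),
                textcoords="offset points", va="top", ha="center")
    annotations.append((x, y))

# Call plt.show() or fig.savefig(...) here to view the result.
```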

Warren Weckesser
  • I just got it working with x, y from the example you cited, and now I'm seeing this too! Thanks for your post! I'm wondering what the physical meaning of the columns in the dcoord and icoord 2D arrays is -- why is it that `[y[1] for y in dn['dcoord']] == [y[2] for y in dn['dcoord']]`? – wiscoYogi Jul 11 '21 at 07:31