I am doing a clustering task and have a distance matrix that I would like to visualize as a 2D graph. Is there a way to do this online, or in a programming language such as R or Python?
My distance matrix is as follows,
I used the classical Multidimensional scaling functionality (in R) and obtained a 2D plot that looks like:
But what I am looking for is a graph with nodes and weighted edges running between them.

-
@Anony-Mousse: I tried quite a few of the things mentioned in the answers (such as http://vida.io and the Python code snippets), both in vain. I hope I can get the Python code to work soon. I am also doing some reading in parallel (pertaining to my work), so if you want me to update this post or accept an answer so soon, it's NOT POSSIBLE. – Annamalai N Sep 23 '13 at 05:08
-
Well, the MDS graph is a good start. Then add a Delaunay triangulation or use some other heuristic to add edges. – Has QUIT--Anony-Mousse Sep 23 '13 at 09:17
-
OK, point taken, I will update this question, when I am able to get the edges. – Annamalai N Sep 23 '13 at 09:49
4 Answers
Possibility 1
I assume that you want a two-dimensional graph in which the distances between node positions match those in your table.
In Python, you can use networkx for such applications. There are many methods for doing this, but remember that all of them are only approximations, because in general it is not possible to construct a two-dimensional representation of points that reproduces given pairwise distances exactly. They are stress-minimization (or energy-minimization) schemes that try to find a "reasonable" representation whose distances are close to those provided.
As an example, consider four points under the discrete metric, which is a valid metric. Every pair is at distance 1, but at most three points in the plane can be mutually equidistant, so no exact 2D layout exists:
     p1  p2  p3  p4
-------------------
p1    0   1   1   1
p2    1   0   1   1
p3    1   1   0   1
p4    1   1   1   0
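The claim that these four points do not fit in the plane can be checked numerically. The following is a minimal sketch, assuming NumPy is available. It uses the standard classical-MDS construction (double-centering the squared distances) and counts the positive eigenvalues, which gives the number of dimensions an exact Euclidean embedding would need. The variable names are illustrative and not part of the original answer.

```python
import numpy as np

# Pairwise distances from the table: every pair of distinct points is 1 apart.
D = np.ones((4, 4)) - np.eye(4)

n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
B = -0.5 * J @ (D ** 2) @ J           # Gram matrix of a centered configuration

eigenvalues = np.linalg.eigvalsh(B)   # returned in ascending order
# Count eigenvalues that are clearly positive (the tolerance guards against round-off).
required_dims = int(np.sum(eigenvalues > 1e-9))
print("eigenvalues:", eigenvalues)
print("dimensions needed for an exact embedding:", required_dims)
```

Three eigenvalues are positive, so an exact embedding needs three dimensions (a regular tetrahedron), and any 2D picture of this matrix necessarily distorts some distances.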
In general, drawing an actual "graph" is redundant, because the graph is fully connected (every pair of nodes is joined by an edge), so it should be sufficient to draw just the points.
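import string

import networkx as nx
import numpy as np

# Structured dtype so that each matrix entry becomes an edge attribute named 'len',
# which Graphviz's neato treats as the preferred edge length.
dt = [('len', float)]
A = np.array([(0, 0.3, 0.4, 0.7),
              (0.3, 0, 0.9, 0.2),
              (0.4, 0.9, 0, 0.1),
              (0.7, 0.2, 0.1, 0)]) * 10
A = A.view(dt)

# from_numpy_matrix was removed in networkx 3.0; from_numpy_array replaces it.
G = nx.from_numpy_array(A)
# Label the nodes A, B, C, ... instead of 0, 1, 2, ...
G = nx.relabel_nodes(G, dict(zip(range(len(G.nodes())), string.ascii_uppercase)))

# Requires pygraphviz; in current networkx releases to_agraph lives in nx.nx_agraph.
G = nx.nx_agraph.to_agraph(G)
G.node_attr.update(color="red", style="filled")
G.edge_attr.update(color="blue", width="2.0")
G.draw('distances.png', format='png', prog='neato')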
In R, you can try multidimensional scaling:
# Classical MDS
# N rows (objects) x p columns (variables)
# each row identified by a unique row name
d <- dist(mydata) # euclidean distances between the rows
fit <- cmdscale(d,eig=TRUE, k=2) # k is the number of dim
fit # view results
# plot solution
x <- fit$points[,1]
y <- fit$points[,2]
plot(x, y, xlab="Coordinate 1", ylab="Coordinate 2",
main="Metric MDS", type="n")
text(x, y, labels = row.names(mydata), cex=.7)
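For readers working in Python, the same classical MDS computation can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not a drop-in replacement for cmdscale: the function name `classical_mds` and the 3-4-5 triangle used as input are made up for this example, and the input is assumed to be a symmetric matrix of distances.

```python
import numpy as np

def classical_mds(D, k=2):
    """Embed a symmetric distance matrix D into k dimensions (classical MDS)."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(B)         # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]        # indices of the k largest
    top_vals = np.clip(eigvals[order], 0.0, None)  # drop negative eigenvalues
    return eigvecs[:, order] * np.sqrt(top_vals)

# Hypothetical input: the corners of a 3-4-5 right triangle, which fits in 2D exactly.
D = np.array([[0.0, 3.0, 4.0],
              [3.0, 0.0, 5.0],
              [4.0, 5.0, 0.0]])
coords = classical_mds(D, k=2)

# Distances measured in the 2D layout, for comparison with the input.
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
print(coords)
print(recovered)
```

Because this input is genuinely planar, the recovered distances match the input up to round-off. For a matrix that is not Euclidean in two dimensions, the layout is only an approximation.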
Possibility 2
You just want to draw a graph with labeled edges. Again, networkx can help:
import matplotlib.pyplot as plt
import networkx as nx

# Create a graph
G = nx.Graph()

# Distance matrix
D = [[0, 1], [1, 0]]

# Add one edge per unordered pair (n, m), labeled with its distance
labels = {}
for n in range(len(D)):
    for m in range(n + 1, len(D)):
        G.add_edge(n, m)
        labels[(n, m)] = str(D[n][m])

pos = nx.spring_layout(G)
nx.draw(G, pos)
nx.draw_networkx_edge_labels(G, pos, edge_labels=labels, font_size=30)
plt.show()
-
Is networkx not compatible with python 2.7? I get the following error :( File "test.py", line 16, in
G = nx.to_agraph(G) File "/usr/local/lib/python2.7/dist-packages/networkx/drawing/nx_agraph.py", line 134, in to_agraph '(not available for Python3)') – Annamalai N Sep 21 '13 at 05:50 -
It works fine with Python 2.7. Try installing pygraphviz (http://pygraphviz.github.io/), which this function uses. – lejlot Sep 21 '13 at 05:54
Multidimensional scaling (MDS) is exactly what you want. See here and here for more.

-
Thanks for the answer. This is not what I was looking for, but it looks interesting. I wanted to visualize that matrix as a graph with nodes and weighted edges running between the nodes. Also, I found an R example implementation for MDS [here](http://personality-project.org/r/mds.html). I shall update you further on this shortly. – Annamalai N Sep 23 '13 at 06:04
You did not mention whether you want a two-dimensional graph. I assume you do, since you need it for visualization. In that case, be aware that for most graphs an exact layout is simply not possible.
What can probably be done is to approximate the values from the distance matrix, so that small values give relatively short edges and large values give relatively long ones.
With that in mind, one option is Graphviz; see its neato layout program. In general, what you are interested in is force-directed drawing. See Wikipedia for further reference.
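To make the approximation idea concrete, here is a minimal sketch of one such scheme: plain gradient descent on the "stress", meaning the sum of squared differences between layout distances and target distances. It is not the algorithm neato itself uses, only an illustration of the same family. The function names, step size, and iteration count are assumptions chosen for this example, and the target matrix is the one from the networkx answer above.

```python
import numpy as np

def stress(X, D):
    """Sum over pairs of (layout distance - target distance) squared."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    iu = np.triu_indices(len(D), k=1)
    return float(np.sum((dist[iu] - D[iu]) ** 2))

def stress_layout(D, X0, steps=500, lr=0.01):
    """Move 2D points by gradient descent so their distances approach D."""
    X = np.array(X0, dtype=float)                # work on a copy
    for _ in range(steps):
        diff = X[:, None, :] - X[None, :, :]     # pairwise offsets, shape (n, n, 2)
        dist = np.linalg.norm(diff, axis=-1)
        np.fill_diagonal(dist, 1.0)              # avoid dividing by zero on the diagonal
        coeff = (dist - D) / dist
        np.fill_diagonal(coeff, 0.0)             # a point exerts no force on itself
        grad = 2.0 * np.sum(coeff[:, :, None] * diff, axis=1)
        X -= lr * grad
    return X

# Target distances (the matrix from the networkx example, scaled by 10).
D = np.array([[0.0, 0.3, 0.4, 0.7],
              [0.3, 0.0, 0.9, 0.2],
              [0.4, 0.9, 0.0, 0.1],
              [0.7, 0.2, 0.1, 0.0]]) * 10

X0 = np.random.default_rng(0).normal(size=(4, 2))   # random starting layout
X = stress_layout(D, X0)
print("stress before:", stress(X0, D))
print("stress after: ", stress(X, D))
```

The stress does not reach zero here because this matrix violates the triangle inequality (for example, 9 > 2 + 1), so the layout can only trade errors off against each other.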

You can use a d3.js force-directed graph and configure the distance between nodes. The d3.js force layout has some clustering capability that separates nodes with similar distances. Here's an example that uses the values as distances between nodes:
http://vida.io/documents/SyT7DREdQmGSpsBkK
Another way to visualize this is to use the same distance between all nodes but vary the line thickness. In that case, you'd calculate stroke-width from the values:
.style("stroke-width", function(d) { return Math.sqrt(d.value / 50); });
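The same mapping can be sketched in Python, for example to precompute edge widths before handing them to a plotting library. This is only an illustration: the helper name `stroke_width`, the `scale` parameter, and the edge values are made up, and the formula simply mirrors the square-root scaling of the d3 snippet.

```python
import math

def stroke_width(value, scale=50.0):
    """Mirror the d3 snippet: width grows with the square root of the value."""
    return math.sqrt(value / scale)

# Hypothetical edge values, made up for illustration.
edges = {("A", "B"): 50, ("A", "C"): 200, ("B", "C"): 450}
widths = {pair: stroke_width(value) for pair, value in edges.items()}
print(widths)
```

The square root keeps large values from producing overwhelmingly thick lines while still preserving their ordering.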
