MY DATA

I have a data frame, Median, that contains three qualities (Speed, Angle and Acceleration) measured in virtual 3D space. Each set of qualities belongs to an individual person, termed Class.

Speed<-c(18,21,25,19)
Angle<-c(90,45,90,120)
Acceleration<-c(4,5,9,4)
Class<-c("Nigel","Paul","Kelly","Steve")

Median = data.frame(Class,Speed,Angle,Acceleration)
mm = as.matrix(Median)

In the example above, Nigel's Speed, Angle and Acceleration qualities would be (18,90,4).

MY PROBLEM

I wish to know the Euclidean distance between each pair of individuals/classes, for example between Nigel and Paul, Nigel and Kelly, and so on. I then wish to display the results of hierarchical clustering as a dendrogram.

WHAT I HAVE (UNSUCCESSFULLY) ATTEMPTED

I first used hc = hclust(dist(mm)) and then plot(hc), but this results in a dendrogram of Speed only. It seems the function pdist() can compute distances between two matrices of observations, but I have three variables. Is this possible in R? I am new to the language and have found a similar question for MATLAB ("Calculating Euclidean distance of pairs of 3D points in matlab"), but how do I write this in R code?

Many thanks.

user2716568

1 Answer

When you convert your data.frame to a matrix, all values become character strings (because the Class column is text), and I don't think that is what you want. Moreover, you are then trying to compute distances with the Class names as one of the variables.
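To illustrate the coercion, here is a minimal sketch that rebuilds the data from the question and checks the storage type of the resulting matrix. The names char_matrix and numeric_matrix are introduced here for illustration only and are not part of the original code.

```r
# Rebuild the example data from the question.
Median <- data.frame(
  Class = c("Nigel", "Paul", "Kelly", "Steve"),
  Speed = c(18, 21, 25, 19),
  Angle = c(90, 45, 90, 120),
  Acceleration = c(4, 5, 9, 4)
)

# A matrix holds a single type, so one text column turns every cell into text.
char_matrix <- as.matrix(Median)
typeof(char_matrix)      # "character"

# Dropping the Class column first keeps the values numeric.
numeric_matrix <- as.matrix(Median[, -1])
typeof(numeric_matrix)   # "double"
```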

The best approach is to use your "Class" column as row.names, and then compute the distances and run hclust:

mm<-Median[,-1]
row.names(mm)<-Median[,1]

Then you can compute the Euclidean distances between the classes with dist(mm,method="euclidean"):

> dist(mm,method="euclidean")
          Nigel      Paul     Kelly
Paul  45.110974                    
Kelly  8.602325 45.354162          
Steve 30.016662 75.033326 31.000000
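
As a sanity check, one entry of this table can be reproduced by hand. The sketch below computes the Nigel to Paul distance directly from the Euclidean formula; the vector names nigel and paul and the variable d_nigel_paul are illustrative and not part of the original code.

```r
# Values taken from the question's data.
nigel <- c(Speed = 18, Angle = 90, Acceleration = 4)
paul  <- c(Speed = 21, Angle = 45, Acceleration = 5)

# Euclidean distance: square root of the summed squared differences,
# here sqrt(3^2 + 45^2 + 1^2) = sqrt(2035).
d_nigel_paul <- sqrt(sum((nigel - paul)^2))
d_nigel_paul   # about 45.11, matching the Paul/Nigel cell above
```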

Finally, perform your hierarchical clustering:

hac<-hclust(dist(mm,method="euclidean"))

and plot(hac,hang=-1) to display the dendrogram.
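
Putting the steps together, a self-contained sketch might look like the following. It assumes the data from the question and relies on the default hclust linkage (method = "complete"), which the answer does not set explicitly; the variable d is introduced here for readability.

```r
# Data from the question.
Median <- data.frame(
  Class = c("Nigel", "Paul", "Kelly", "Steve"),
  Speed = c(18, 21, 25, 19),
  Angle = c(90, 45, 90, 120),
  Acceleration = c(4, 5, 9, 4)
)

# Keep only the numeric columns and label the rows with the class names.
mm <- Median[, -1]
row.names(mm) <- Median[, 1]

# Pairwise Euclidean distances between the rows (people).
d <- dist(mm, method = "euclidean")

# Hierarchical clustering (default linkage: "complete") and dendrogram.
hac <- hclust(d)
plot(hac, hang = -1)
```

Because the row names carry the class labels, the dendrogram leaves are labelled Nigel, Paul, Kelly and Steve rather than 1 to 4.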

Cath
  • Thank you. When I compute the Euclidean distances between the classes, my matrix shows "0.0000" between Paul and Paul. I note this is not present in your example. How do I remove the "0.0000" on the diagonal and the repeated distances in the upper triangle? – user2716568 Nov 21 '14 at 00:13
  • To answer my own question, I edited your section of code to `dist(mm, method="euclidean", diag = FALSE, upper = FALSE)`, which gave me what I was after. – user2716568 Nov 21 '14 at 01:35