
I computed a distance matrix with the following steps:

x <- read.table(textConnection('
     t0 t1 t2
 aaa  0  1  0
 bbb  1  0  1
 ccc  1  1  1
 ddd  1  1  0
 ' ), header=TRUE)

As such, x is a data frame with column and row headers:

    t0 t1 t2
aaa  0  1  0
bbb  1  0  1
ccc  1  1  1
ddd  1  1  0

require(vegan)
d <- vegdist(x, method="jaccard")

The distance matrix d is obtained as follows:

          aaa       bbb       ccc
bbb 1.0000000                    
ccc 0.6666667 0.3333333          
ddd 0.5000000 0.6666667 0.3333333

By typing str(d), I found that it is neither an ordinary table nor in CSV format:

Class 'dist'  atomic [1:6] 1 0.667 0.5 0.333 0.667 ...
  ..- attr(*, "Size")= int 4
  ..- attr(*, "Labels")= chr [1:4] "aaa" "bbb" "ccc" "ddd"
  ..- attr(*, "Diag")= logi FALSE
  ..- attr(*, "Upper")= logi FALSE
  ..- attr(*, "method")= chr "jaccard"
  ..- attr(*, "call")= language vegdist(x = a, method = "jaccard")

I want to convert the distance matrix to three columns with new headers and save it as a CSV file, as follows:

c1  c2  distance
aaa bbb 1.000
aaa ccc 0.6666667
aaa ddd 0.5
bbb ccc 0.3333333
bbb ddd 0.6666667
ccc ddd 0.3333333
Gavin Simpson
Catherine
  • This is a much better quality Q than several of those you have posted recently. Couple of points: i) what you call a table is a data frame in R. A table in R is something else. ii) Please, when you can, go back through your Qs and accept answers where you haven't already. iii) Please do respond to comments that users post on your Qs. It is not supposed to be one-way traffic you asking, us providing answers. – Gavin Simpson Apr 28 '11 at 08:48

2 Answers


This is quite doable using base R functions. First we want all pairwise combinations of the rows to fill the columns c1 and c2 in the resulting object. The final column, distance, is obtained by simply converting the "dist" object d into a numeric vector (it already is a vector, but of a different class). The two line up because combn() lists the pairs in the same order in which a "dist" object stores its values: the lower triangle, column by column.

The first step is done using combn(rownames(x), 2) and the second step via as.numeric(d):

m <- data.frame(t(combn(rownames(x),2)), as.numeric(d))
names(m) <- c("c1", "c2", "distance")

Which gives:

> m
   c1  c2  distance
1 aaa bbb 1.0000000
2 aaa ccc 0.6666667
3 aaa ddd 0.5000000
4 bbb ccc 0.3333333
5 bbb ddd 0.6666667
6 ccc ddd 0.3333333

To save as a CSV file, write.csv(m, file = "filename.csv").
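The claim that the pairs from combn() line up with the values in d can be checked directly. The sketch below is only an illustration under one assumption: to avoid depending on vegan, it uses base R's dist(x, method = "binary") as a stand-in for vegdist(x, method = "jaccard"), which gives the same values for this 0/1 data. It also passes row.names = FALSE to write.csv() so the 1..6 row numbers stay out of the file, and writes to a temporary file rather than "filename.csv".

```r
# Rebuild the example data from the question.
x <- data.frame(t0 = c(0, 1, 1, 1),
                t1 = c(1, 0, 1, 1),
                t2 = c(0, 1, 1, 0),
                row.names = c("aaa", "bbb", "ccc", "ddd"))

# Assumption: stand-in for vegdist(x, method = "jaccard") on 0/1 data.
d <- dist(x, method = "binary")

# Pair labels in combn() order, values in "dist" storage order.
pairs <- t(combn(attr(d, "Labels"), 2))
m <- data.frame(c1 = pairs[, 1], c2 = pairs[, 2],
                distance = as.numeric(d),
                stringsAsFactors = FALSE)

# Cross-check every row against the full symmetric matrix.
full <- as.matrix(d)
stopifnot(isTRUE(all.equal(m$distance, full[cbind(m$c1, m$c2)])))

# Write the three columns only, without row numbers.
out <- tempfile(fileext = ".csv")
write.csv(m, file = out, row.names = FALSE)
```

If the stopifnot() call passes silently, each (c1, c2) pair carries the distance found at the matching cell of the full matrix.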

Gavin Simpson
  • ...and how do you convert it back to the original class(dist)? – theforestecologist Jan 26 '16 at 06:47
  • I'm curious about the other way around too. Found this guy providing a solution via reshape: http://stackoverflow.com/questions/2126108/convert-table-into-matrix-by-column-names – Marco Virgolin Jul 21 '16 at 15:05
  • @MarcoVirgolin Look at the `acast()` function in package *reshape2* for one way to go to the wide format (i.e. full distance matrix), then use `as.dist()` to convert that matrix to a dist object. – Gavin Simpson Jul 21 '16 at 15:26
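For the reverse direction asked about in the comments, here is a minimal base R sketch. It is not the reshape2 route suggested above, only an alternative that fills a square matrix by label and then calls as.dist(). The data frame m is typed in by hand from the output shown in the answer, and the names labs, full and d2 are made up for this illustration.

```r
# The three-column result from the answer, entered by hand.
m <- data.frame(c1 = c("aaa", "aaa", "aaa", "bbb", "bbb", "ccc"),
                c2 = c("bbb", "ccc", "ddd", "ccc", "ddd", "ddd"),
                distance = c(1, 2/3, 0.5, 1/3, 2/3, 1/3),
                stringsAsFactors = FALSE)

# Collect the labels and start from an all-zero square matrix.
labs <- union(m$c1, m$c2)
full <- matrix(0, nrow = length(labs), ncol = length(labs),
               dimnames = list(labs, labs))

# Fill both triangles, indexing cells by (row label, column label).
full[cbind(m$c1, m$c2)] <- m$distance
full[cbind(m$c2, m$c1)] <- m$distance

# as.dist() keeps the lower triangle and returns a "dist" object.
d2 <- as.dist(full)
```

Note that attributes such as "method" and "call" from the original vegdist() result are not stored in the three-column form, so they are not restored by this round trip.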

You can do this by combining melt() from the reshape package with upper.tri():

> library(reshape)
> m <- as.matrix(d)
> m
          aaa       bbb       ccc       ddd
aaa 0.0000000 1.0000000 0.6666667 0.5000000
bbb 1.0000000 0.0000000 0.3333333 0.6666667
ccc 0.6666667 0.3333333 0.0000000 0.3333333
ddd 0.5000000 0.6666667 0.3333333 0.0000000
> m2 <- melt(m)[melt(upper.tri(m))$value,]
> names(m2) <- c("c1", "c2", "distance")
> m2
    c1  c2  distance
5  aaa bbb 1.0000000
9  aaa ccc 0.6666667
10 bbb ccc 0.3333333
13 aaa ddd 0.5000000
14 bbb ddd 0.6666667
15 ccc ddd 0.3333333
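The same upper-triangle idea can be sketched without the reshape package by using which(..., arr.ind = TRUE) to get the row and column positions. This is only an illustration under one assumption: base R's dist(x, method = "binary") stands in for vegdist(x, method = "jaccard"), which gives the same values for this 0/1 data. The name idx is made up for the example.

```r
# Rebuild the example data from the question.
x <- data.frame(t0 = c(0, 1, 1, 1),
                t1 = c(1, 0, 1, 1),
                t2 = c(0, 1, 1, 0),
                row.names = c("aaa", "bbb", "ccc", "ddd"))

# Assumption: stand-in for vegdist(x, method = "jaccard") on 0/1 data.
m <- as.matrix(dist(x, method = "binary"))

# Row/column positions of the cells above the diagonal.
idx <- which(upper.tri(m), arr.ind = TRUE)

# Look up the labels and the values at those positions.
m2 <- data.frame(c1 = rownames(m)[idx[, "row"]],
                 c2 = colnames(m)[idx[, "col"]],
                 distance = m[idx],
                 stringsAsFactors = FALSE)
```

As in the melt() output above, the rows come out in column-major order of the upper triangle (aaa-bbb, aaa-ccc, bbb-ccc, ...), which differs from the order produced by the combn() approach; sort with order(m2$c1, m2$c2) if the order matters.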
kohske