
Given the table below:

X = 

        col1    col2    col3
row1    "A"      "A"     "1.0"
row2    "A"      "B"     "0.9"
row3    "A"      "C"     "0.4"
row4    "B"      "A"     "0.9"
row5    "B"      "B"     "1.0"
row6    "B"      "C"     "0.2"
row7    "C"      "A"     "0.4"
row8    "C"      "B"     "0.2"
row9    "C"      "C"     "1.0"

Here, col3 is a correlation measure between the pairs of entities in col1 and col2.

How can I construct a matrix whose column names come from col1, whose row names come from col2, and whose cells are filled with the values in col3?

sgibb
user2737735
Please read this on how to make a good reproducible example: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Dason Sep 01 '13 at 17:47

2 Answers

df <- read.table(textConnection('col1    col2    col3
row1    "A"      "A"     "1.0"
row2    "A"      "B"     "0.9"
row3    "A"      "C"     "0.4"
row4    "B"      "A"     "0.9"
row5    "B"      "B"     "1.0"
row6    "B"      "C"     "0.2"
row7    "C"      "A"     "0.4"
row8    "C"      "B"     "0.2"
row9    "C"      "C"     "1.0"'), header=TRUE)

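## map labels to row/column indices
## (assumes the labels are single uppercase letters, as in the example)
rows <- match(df$col1, LETTERS)
cols <- match(df$col2, LETTERS)

## create matrix
m <- matrix(0, nrow=max(rows), ncol=max(cols))

## fill matrix: rows are indexed by col1 and columns by col2 here;
## the example data are symmetric, so the orientation does not change the result
m[cbind(rows, cols)] <- df$col3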

m
#     [,1] [,2] [,3]
#[1,]  1.0  0.9  0.4
#[2,]  0.9  1.0  0.2
#[3,]  0.4  0.2  1.0
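
The matrix above has no row or column names and relies on the labels being letters. As a minimal sketch of the same indexing idea with names attached, assuming the labels can be arbitrary strings, the lookup against LETTERS can be replaced by a lookup against the sorted unique labels. The names row_labels and col_labels are introduced here for illustration only, and the data frame is rebuilt directly from the values in the question.

```r
# Same values as in the question, built directly as a data frame.
df <- data.frame(
  col1 = rep(c("A", "B", "C"), each = 3),
  col2 = rep(c("A", "B", "C"), times = 3),
  col3 = c(1.0, 0.9, 0.4, 0.9, 1.0, 0.2, 0.4, 0.2, 1.0),
  stringsAsFactors = FALSE
)

# Hypothetical helper vectors: the labels that become dimnames.
row_labels <- sort(unique(df$col2))  # rows come from col2
col_labels <- sort(unique(df$col1))  # columns come from col1

# Start with NA so that missing pairs stay visible.
m <- matrix(NA_real_,
            nrow = length(row_labels), ncol = length(col_labels),
            dimnames = list(row_labels, col_labels))

# Fill by (row, column) index pairs, one pair per input row.
m[cbind(match(df$col2, row_labels), match(df$col1, col_labels))] <- df$col3

m
```

Cells can then be read by name, for example m["B", "A"].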
sgibb

I need some data to work with, so I'll make some up.

# Make fake data
x <- c('A','B','C')
dat <- expand.grid(x, x)
dat$Var3 <- rnorm(9)

We can do this in base R. I'm not very good with the 'reshape' function, but it can be used as follows. The column names need to be cleaned up afterwards, though.

> reshape(dat, idvar = "Var1", timevar = "Var2", direction = "wide")
  Var1     Var3.A      Var3.B     Var3.C
1    A -1.2442937 -0.01132871 -0.5693153
2    B -1.6044295 -1.34907504  1.6778866
3    C  0.5393472 -1.00637345 -0.7694940
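
One possible way to do that cleanup and end up with an actual matrix is sketched below. It assumes the same fake data as above; set.seed(1) is added only to make the sketch repeatable, and wide and m are names chosen for illustration.

```r
# Fake data as above; the seed is an assumption added for repeatability.
set.seed(1)
x <- c("A", "B", "C")
dat <- expand.grid(x, x)
dat$Var3 <- rnorm(9)

wide <- reshape(dat, idvar = "Var1", timevar = "Var2", direction = "wide")

# Strip the "Var3." prefix that reshape() adds to the value columns.
names(wide) <- sub("^Var3\\.", "", names(wide))

# Drop the id column and keep its labels as row names.
m <- as.matrix(wide[, -1])
rownames(m) <- as.character(wide$Var1)

m
```

Here the rows correspond to Var1 and the columns to Var2. If the opposite orientation is needed, t(m) swaps them.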

Alternatively, you could use the dcast function from the reshape2 package. The output is a little cleaner, I think.

> library(reshape2)
> dcast(dat, Var1 ~ Var2, value.var = "Var3")
  Var1          A           B          C
1    A -1.2442937 -0.01132871 -0.5693153
2    B -1.6044295 -1.34907504  1.6778866
3    C  0.5393472 -1.00637345 -0.7694940
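
Note that dcast returns a data frame. If a matrix is what is needed, reshape2 also provides acast, which returns a matrix with dimnames. The following is a small sketch under the assumption that reshape2 is installed; the seed is added only for repeatability, and the formula puts Var2 on the rows and Var1 on the columns to match the orientation asked for in the question.

```r
library(reshape2)

# Fake data as above; the seed is an assumption added for repeatability.
set.seed(1)
x <- c("A", "B", "C")
dat <- expand.grid(x, x)
dat$Var3 <- rnorm(9)

# Left side of the formula becomes the rows, right side the columns.
m <- acast(dat, Var2 ~ Var1, value.var = "Var3")

m
```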
Dason