I want to convert an R data.frame to a matrix, with the levels of two factors as the row and column names of the matrix. Here is an MWE. It takes a lot of code to get the desired result; is there a more compact way to do this?

set.seed(12345)
A <- c("A1", "A2")
B <- c("B1", "B2", "B3")
Y <- runif(n=6, min=100, max=1000)
df <- data.frame(expand.grid(A=A, B=B), Y)
df

#    A  B        Y
# 1 A1 B1 748.8135
# 2 A2 B1 888.1959
# 3 A1 B2 784.8841
# 4 A2 B2 897.5121
# 5 A1 B3 510.8329
# 6 A2 B3 249.7346

library(tidyr)
df1 <- spread(data = df, key = A, value = Y, fill = NA, convert = FALSE, drop = TRUE)
df1

#    B       A1       A2
# 1 B1 748.8135 888.1959
# 2 B2 784.8841 897.5121
# 3 B3 510.8329 249.7346


m1 <- as.matrix(df1[,-1])
rownames(m1) <- df1[ ,1]
m1

#          A1       A2
# B1 748.8135 888.1959
# B2 784.8841 897.5121
# B3 510.8329 249.7346
halfer
MYaseen208
  • Allocate an appropriate (`dim` and `dimnames`) "matrix" and use something like `mymat[as.matrix(df[c("B", "A")])] = df[["Y"]]`. Also, `?xtabs` – alexis_laz Dec 20 '15 at 07:23
  • Thanks @alexis_laz for your comment. Could you turn your comment into a complete answer? Thanks – MYaseen208 Dec 20 '15 at 07:36
  • I guess [this](http://stackoverflow.com/questions/9617348/reshape-three-column-data-frame-to-matrix) is the complete post for more alternatives. – alexis_laz Dec 20 '15 at 07:43
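The base R approach suggested in the first comment could be sketched roughly as follows. This is only an illustration, assuming the `df` from the question; the names `mymat` and `xt` are placeholders, and `xtabs()` gives the same cell values here only because each B/A combination occurs exactly once.

```r
# Rebuild the example data from the question
set.seed(12345)
df <- data.frame(expand.grid(A = c("A1", "A2"), B = c("B1", "B2", "B3")),
                 Y = runif(n = 6, min = 100, max = 1000))

# Pre-allocate an NA-filled matrix with the factor levels as dimnames
mymat <- matrix(NA_real_,
                nrow = nlevels(df$B), ncol = nlevels(df$A),
                dimnames = list(levels(df$B), levels(df$A)))

# A two-column character matrix indexes cells by (row name, column name)
mymat[as.matrix(df[c("B", "A")])] <- df[["Y"]]
mymat

# xtabs() sums Y within each B/A cell; with one value per cell
# the sum is just the value itself. The result is a table, not a plain matrix.
xt <- xtabs(Y ~ B + A, data = df)
xt
```

Combinations that are missing from the data stay `NA` in `mymat`, whereas `xtabs()` reports them as 0.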

1 Answer

This can be done with the `acast` function from the reshape2 package.

df4 <- reshape2::acast(df, B ~ A, value.var="Y")
df4

#          A1       A2
# B1 748.8135 888.1959
# B2 784.8841 897.5121
# B3 510.8329 249.7346
MYaseen208
  • Good use of `acast`, plus 1 – akrun Dec 20 '15 at 08:24
  • Thanks @akrun for your appreciation. With my actual data I ran into a problem: the resulting object contains some `NA`s, and I now want to select only those columns of `df4` that don't contain any `NA`s. Any thoughts? – MYaseen208 Dec 20 '15 at 08:29
  • You may try `df4[, !colSums(is.na(df4))]` to keep only the columns with no NAs – akrun Dec 20 '15 at 08:31
  • Thanks again @akrun for your help. I got the desired result through `df4[, complete.cases(t(df4))]`. – MYaseen208 Dec 20 '15 at 08:35
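For reference, the two column-filtering ideas from these comments can be compared on a small matrix. This is a minimal sketch: `m` is a made-up stand-in for the asker's real data, and `keep1` and `keep2` are illustrative names.

```r
# Hypothetical matrix standing in for the real data, with one NA in column A2
m <- matrix(c(1, 2, 3, NA, 5, 6, 7, 8, 9), nrow = 3,
            dimnames = list(c("B1", "B2", "B3"), c("A1", "A2", "A3")))

# Keep columns whose NA count is zero; the comma selects columns, not elements,
# and drop = FALSE keeps the result a matrix even if one column remains
keep1 <- m[, colSums(is.na(m)) == 0, drop = FALSE]

# complete.cases() works row-wise, so transpose first to test columns
keep2 <- m[, complete.cases(t(m)), drop = FALSE]

identical(keep1, keep2)
```

Both forms keep columns `A1` and `A3` here.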