I'm trying to store an entire matrix/array into a single cell of a data frame, but can't quite remember how to do it.

Now before you say it can't be done, I'm sure I remember someone asking a question on SO where it was done, although that wasn't the point of the question so I can't find it again.

For example, you can store vectors or matrices in a single cell of a matrix like so:

myMat <- array(list(), dim=c(2, 2))
myMat[[1, 1]] <- 1:5
myMat[[1, 2]] <- 6:10

#     [,1]      [,2]     
#[1,] Integer,5 Integer,5
#[2,] NULL      NULL

The trick was in using the double brackets [[]].
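To make the difference between the two bracket forms concrete, here is a minimal sketch. It builds the list-matrix with `matrix(vector("list", 4), 2, 2)`, which is my own substitution for the `array(list(), ...)` call above, and the names `single` and `double` are only illustrative:

```r
# A 2x2 matrix whose cells are list elements (all NULL to begin with).
myMat <- matrix(vector("list", 4), nrow = 2, ncol = 2)
myMat[[1, 1]] <- 1:5

# Single brackets return a length-1 list wrapping the cell's contents.
single <- myMat[1, 1]

# Double brackets return the contents themselves.
double <- myMat[[1, 1]]

is.list(single)  # the wrapper is still a list
is.list(double)  # the contents are a plain integer vector
```

So `[` keeps the list wrapper and `[[` strips it, which is the behaviour I want from the data frame as well.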

Now I just can't work out how to do it for a data frame (or if you can):

# attempt to make a dataframe like above (except if I use list() it gets
# interpreted to mean the `m` column doesn't exist)
myDF <- data.frame(i=1:5, m=NA)
myDF[[1, 'm']] <- 1:5
# Error in `[[<-.data.frame`(`*tmp*`, 1, "m", value = 1:5) : 
#  more elements supplied than there are to replace

# this seems to work but I have to do myDF$m[[1]][[1]] to get the 1:5,
# whereas I just want to do myDF$m[[1]].
myDF[[1, 'm']] <- list(1:5)

I think I'm almost there. With that last attempt I can do myDF[[1, 'm']] to retrieve list(1:5) and hence myDF[[1, 'm']][[1]] to get 1:5, but I'd prefer to just do myDF[[1, 'm']] and get 1:5.

mathematical.coffee
  • Something like: `dat<-data.frame(cars, m=I(matrix(rnorm(10*nrow(cars)), nrow(cars)))); dat[["m"]]`? – sebastian-c Nov 22 '12 at 02:03
  • @sebastian-c no, I'm wanting the matrix in `dat[[i, 'm']]` for each `i` being a row, rather than `dat[['m']]` being the matrix. – mathematical.coffee Nov 22 '12 at 03:08
  • While it is possible, I'd advise against it - a lot of the internal data frame code assumes columns are atomic vectors and breaks when you input a list. Every time I've put a list inside a data frame I've ended up regretting it. – hadley Nov 22 '12 at 13:37

2 Answers

I think I worked it out. It is important to initialise the data frame such that the column is ready to accept matrices.

To do this, give the column a list data type. Note the I(), which protects the list() so that data.frame() keeps it as a single column.

myDF <- data.frame(i=integer(), m=I(list()))

Then you can add rows as usual:

myDF[1, 'i'] <- 1

and then add the matrix using [[]] notation:

myDF[[1, 'm']] <- matrix(rnorm(9), 3, 3)

Access with [[]] notation:

> myDF$m[[1]]
          [,1]       [,2]       [,3]
[1,] 0.3307403 -0.2031316  1.5995385
[2,] 0.4588922  0.1631086 -0.2754463
[3,] 0.0568791  1.0358552 -0.1623794

To initialise with a non-zero number of rows, use vector('list', 5) to create an empty list of length 5 (every cell starts as NULL, so no memory is wasted on placeholder values), again wrapped in I() to protect it:

myDF <- data.frame(i=1:5, m=I(vector('list', 5)))
myDF$m[[1]] <- matrix(rnorm(9), 3, 3)
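As a quick check of the pre-allocated version, here is a small sketch that fills each cell in a loop. The row count `n` and the use of `diag(k)` as a stand-in matrix are assumptions for illustration only:

```r
n <- 3  # hypothetical number of rows

# Pre-allocate a list column of length n, protected with I().
myDF <- data.frame(i = seq_len(n), m = I(vector("list", n)))

# Fill each cell with a matrix of a different size (k x k identity).
for (k in seq_len(n)) {
  myDF$m[[k]] <- diag(k)
}

# Each cell now comes back directly, without an extra [[1]].
myDF$m[[3]]
myDF[[2, "m"]]
```

Because each cell is just a list element, the matrices do not have to share dimensions.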
mathematical.coffee

I think the trick may be to insert it in as a list:

set.seed(123)
dat <- data.frame(women, m=I(replicate(nrow(women), matrix(rnorm(4), 2, 2), 
                simplify=FALSE)))


str(dat)
'data.frame':   15 obs. of  3 variables:
 $ height: num  58 59 60 61 62 63 64 65 66 67 ...
 $ weight: num  115 117 120 123 126 129 132 135 139 142 ...
 $ m     :List of 15
  ..$ : num [1:2, 1:2] -0.5605 -0.2302 1.5587 0.0705
  ..$ : num [1:2, 1:2] 0.129 1.715 0.461 -1.265
  ...
  ..$ : num [1:2, 1:2] -1.549 0.585 0.124 0.216
  ..- attr(*, "class")= chr "AsIs"

dat[[1, "m"]]
           [,1]       [,2]
[1,] -0.5604756 1.55870831
[2,] -0.2301775 0.07050839

dat[[2, "m"]]
          [,1]       [,2]
[1,] 0.1292877  0.4609162
[2,] 1.7150650 -1.2650612

EDIT: So the question really is about initialising and then assigning. Given that, you should be able to define a data.frame like the one in your question like so:

dat <- data.frame(i=1:5, m=I(vector(mode="list", length=5)))

You can then assign to it like so:

dat[[2, "m"]] <- matrix(rnorm(9), 3, 3)
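Putting the two steps together, a rough sketch of initialising first and assigning per cell later might look like this. The helper vector `filled` is a hypothetical name, and I use fixed matrices instead of `rnorm()` so the result is easy to inspect:

```r
# Initialise with an empty list column; every cell starts out as NULL.
dat <- data.frame(i = 1:5, m = I(vector(mode = "list", length = 5)))

# Assign matrices of different sizes to individual cells.
dat[[2, "m"]] <- matrix(1:9, 3, 3)
dat[[4, "m"]] <- matrix(1:4, 2, 2)

# Cells that were never assigned are still NULL, so they are easy to find.
filled <- !vapply(dat$m, is.null, logical(1))
filled

# Retrieval needs only one level of [[ ]].
dat[[2, "m"]]
```

Only the number of rows has to be known up front; the matrices themselves can be filled in whenever they become available.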
sebastian-c
  • But how do I assign on a *per-cell* basis, as this requires me to know every value upon creation, and assigning `dat[[1, 'm']] <- list(matrix...)` still requires a `dat$m[[1]][[1]]` to access rather than `dat$m[[1]]`? – mathematical.coffee Nov 22 '12 at 03:28
  • `dat[[i, "m"]] <- matrix(rnorm(4), 2, 2)` works as does `dat[[i, "m"]] <- matrix(rnorm(9), 3, 3)`. – sebastian-c Nov 22 '12 at 03:31
  • But if I create dat as `dat <- data.frame(i=1:5, m=NA)`, assignment in this way doesn't work. I think it must have to be initialised appropriately (and I don't know the values of `m` to put in in advance) – mathematical.coffee Nov 22 '12 at 03:32
  • Yeah, worked it out. it is important that the data frame be *initialised* with `list()` as the per-cell data type, and then assignment works. Best off to initialise it as 0 rows too since I don't know in advance the matrices for that column. – mathematical.coffee Nov 22 '12 at 03:36
  • @mathematical.coffee I was about to suggest `data.frame(i=1:5, m=I(vector(mode="list", length=5))); dat[[2, "m"]] <- matrix(rnorm(9), 3, 3)`. Given that it's a data frame, you should know the number of rows. EDIT: Having seen your solution, you essentially initialise them as being of both length 0. – sebastian-c Nov 22 '12 at 03:38
  • thanks! your use of `I()` and realising that I could do `<- matrix` to your example but not mine is what made me realise it should be initialised appropriately. If you want to edit your answer to include your above comment I'm happy to accept! – mathematical.coffee Nov 22 '12 at 03:41