Vectorizing a matrix

Question

I have a large 2D matrix that is 1000 x 1000. I want to reshape this so that it is one column (or row). For example, if the matrix was:

I want to turn it into:

1 2 3 4 5 6 7 8 9

I do not need to preserve the column headers, just the order of the data. How do I do this using reshape2 (which is the package that I presumed was the easiest to use)?

Just to clarify, I mentioned reshape as I thought it was the best way of doing this. I can see that there are simpler methods which I am perfectly happy with.

Whenever you vectorize a matrix, keep in mind that it always goes columns first. When you need to preserve the row order, then do `c(t(some.matrix))`. — Joris Meys, Dec 31 '10 at 15:52
Changed the title to reflect the question asked. BTW, I wonder where that reshape-fetish is coming from. I see so many questions asking for a reshape answer to a problem for which reshape never was built in the first place. — Joris Meys, Dec 31 '10 at 15:55
@Joris perhaps "If all you have is a hammer, everything looks like a nail."? — Joshua Ulrich, Dec 31 '10 at 16:03
@Joris - ignorance really. I just assumed what I wanted to do was not a standard operation. I use ggplot2 where reshape2 is sometimes mentioned as they are both made by Hadley Wickham. — djq, Dec 31 '10 at 17:57
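
As a small illustration of the column-first point made in the comments above, here is a minimal sketch. The 2 x 3 matrix and the names `by_col` and `by_row` are made up for the example; they do not come from the question.

```r
# Small example matrix; R fills it column by column.
m <- matrix(1:6, nrow = 2)
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

by_col <- c(m)     # column-major order: 1 2 3 4 5 6
by_row <- c(t(m))  # transpose first to get row order: 1 3 5 2 4 6
```

Transposing costs an extra copy of the matrix, so `c(t(m))` is only worth it when row order really matters.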

IRTFM · Accepted Answer · 2010-12-31T15:20:14.343

54

I think it will be difficult to find a more compact method than:

c(m)
[1] 1 2 3 4 5 6 7 8 9

However, if you want to retain a matrix structure, then this reworking of the dim attribute would be effective:

dim(m) <- c(dim(m)[1]*dim(m)[2], 1)
m
      [,1]
 [1,]    1
 [2,]    2
 [3,]    3
 [4,]    4
 [5,]    5
 [6,]    6
 [7,]    7
 [8,]    8
 [9,]    9

There are more compact ways of getting the product of the dimensions, but the above method emphasizes that the dim attribute is a two-element vector for matrices. Other ways of getting the "9" in that example include:

> prod(dim(m))
[1] 9
> length(m)
[1] 9
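
A minimal sketch of the same idea on a non-square matrix and on a 3-d array, assuming you want to keep the original object unchanged. The names `col_m`, `a` and `col_a` are hypothetical, and the assignment is done on a copy because `dim<-` modifies the object it is applied to.

```r
m <- matrix(1:6, nrow = 2)          # 2 x 3 matrix
col_m <- m                          # work on a copy so m keeps its shape
dim(col_m) <- c(length(m), 1)       # now a 6 x 1 matrix, data order unchanged

a <- array(1:24, dim = c(2, 3, 4))  # 3-d array
col_a <- a
dim(col_a) <- c(prod(dim(a)), 1)    # now a 24 x 1 matrix
```

For a plain matrix or array, `length(m)` and `prod(dim(m))` give the same number, so either can be used for the new row count.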

edited Dec 31 '10 at 15:20

answered Dec 31 '10 at 15:11

IRTFM

you can just do `cbind(c(m))` to make it a one-column matrix – Prasad Chalasani Dec 31 '10 at 15:53
@hadley OK, what about prod(dim(m))? – IRTFM Dec 31 '10 at 19:19
`dim(m) <- c(prod(dim(m)), 1)` is a bit nicer, and scales to any number of dimensions – hadley Jan 04 '11 at 14:18
That was what I intended a reader to do. The code `prod(dim(m))` was offered as a replacement for the clunkier: `dim(m)[1]*dim(m)[2]` as a way of getting to 9. It was always intended to go into `dim(m)<-c(prod(dim(m)), 1)` and I guess that was why I couldn't figure out your comment. – IRTFM Jan 04 '11 at 15:15
For anyone with a `data.frame`, `unlist(df)` works. – kdauria Oct 21 '15 at 18:48
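
The two shortcuts from the comments above can be sketched as follows. The data frame `df` and the names `one_col`, `v` and `v_plain` are invented for the illustration.

```r
m <- matrix(1:9, ncol = 3)
one_col <- cbind(c(m))              # one-column (9 x 1) matrix

df <- data.frame(a = 1:3, b = 4:6)  # hypothetical data frame
v <- unlist(df)                     # vector of all values, column by column
v_plain <- unname(v)                # drop the names that unlist() attaches
```

Note that `unlist(df)` only gives a clean result when all columns share a type; mixed columns are coerced to a common type.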

score 13 · Answer 2 · answered Dec 31 '10 at 14:57

A possible solution, but without using reshape2:

> m <- matrix(c(1:9), ncol = 3)
> m
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
> as.vector(m)
[1] 1 2 3 4 5 6 7 8 9

EDi

as.vector(m) is about half the speed of c(m) - not that timing is likely to matter that much here. – Spacedman Dec 31 '10 at 19:17

score 11 · Answer 3 · answered Dec 31 '10 at 16:01

Come on R guys, let's give the OP a reshape2 solution:

> library(reshape2)
> m <- matrix(c(1:9), ncol = 3)
> melt(m)$value
[1] 1 2 3 4 5 6 7 8 9

I just can't be bothered to test how much slower it is than c(m). The result is the same, though:

> identical(c(m),melt(m)$value)
[1] TRUE

[EDIT: oh heck who am I kidding:]

> system.time(for(i in 1:1000){z=melt(m)$value})
   user  system elapsed 
  1.653   0.004   1.662 
> system.time(for(i in 1:1000){z=c(m)})
   user  system elapsed 
  0.004   0.000   0.004

Spacedman

The reshape solution is several orders of magnitude slower when tested on a 1000 x 1000 matrix... as you can see via your edit. ;-) – Joshua Ulrich Dec 31 '10 at 16:10
+1 for the timings. funny reshape-hack though, I wouldn't have thought of it. For obvious reasons ;-) – Joris Meys Dec 31 '10 at 17:10
Just for amusement: reshape2::melt is about 25% faster than reshape::melt (approx. 7.7 vs 10.3 seconds for 10000 reps) although still about 400 times slower than c(m) ... – Ben Bolker Jan 01 '11 at 14:53

score 4 · Answer 4 · answered Jan 21 '14 at 10:43

as.vector(m) should be a little more efficient than c(m):

> library(rbenchmark)
> m <- diag(5000)
> benchmark(
+   vect = as.vector(m), 
+   conc = c(m), 
+   replications=100
+ )
  test replications elapsed relative user.self sys.self user.child sys.child
2 conc          100  12.699    1.177     6.952    5.754          0         0
1 vect          100  10.785    1.000     4.858    5.933          0         0
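
The timings reported in this thread disagree on which of the two is faster, so it is probably worth measuring on your own machine. Below is a minimal sketch using only base R; the replication count of 20 and the names `v1`, `v2`, `t_c` and `t_as` are arbitrary choices for the example.

```r
m <- diag(1000)                 # 1000 x 1000, the size mentioned in the question

v1 <- c(m)
v2 <- as.vector(m)
same <- identical(v1, v2)       # both calls return the same numeric vector

# Timings depend on machine and R version, so no expected numbers are shown.
t_c  <- system.time(for (i in 1:20) c(m))
t_as <- system.time(for (i in 1:20) as.vector(m))
```

Compare `t_c["elapsed"]` with `t_as["elapsed"]`; for a single conversion of a matrix this size the difference is unlikely to matter.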

score 0 · Answer 5 · answered Oct 16 '14 at 20:26

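One more simple way to do it is with the function "sapply", looping over the columns (the same could be done with a 'for' loop as well). Indexing by ncol(m) rather than NROW(m) keeps it correct for non-square matrices:

 m <- matrix(c(1:9), ncol = 3)
 (m1 <- as.numeric(sapply(seq_len(ncol(m)), function(i) m[, i])))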
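
The 'for' loop variant mentioned above might look like the sketch below. It loops over columns and uses a non-square matrix on purpose; the names `out`, `nr` and `idx` are made up for the example.

```r
m <- matrix(1:6, nrow = 2)                # non-square on purpose
out <- vector(typeof(m), length(m))       # preallocate a result of the same type
nr <- nrow(m)

for (j in seq_len(ncol(m))) {
  idx <- (j - 1) * nr + seq_len(nr)       # positions of column j in the result
  out[idx] <- m[, j]
}
```

This gives the same vector as `c(m)`, just with more code and more work, so it is mainly useful for seeing what the column-first order means.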

user36478
