I have a group of vectors and matrices that can be classified into 5 subgroups. To manage them more easily and neatly, can I put several vectors or several matrices into one data frame? If that is allowed, will memory consumption be larger than if I leave them as individual vectors and matrices?

Thank you for your advice!

Gavin Simpson
Joyce
  • Also, are all objects the same length? For instance, you can do: `x = data.frame(A = 1:2, B = matrix(1:10, nrow=2, ncol=5))`, but you will have trouble doing `x = data.frame(A = 1:2, B = matrix(1:10, nrow=5, ncol=2))`; however, `x = list(A = 1:2, B = matrix(1:10, nrow=5, ncol=2))` would not be a problem. – A5C1D2H2I1M1N2O1R2T1 Jul 25 '12 at 06:23
  • Sorry for the triple-comment post, but a minor correction: `x = data.frame(A = 1:2); x$B = matrix(1:10, nrow=2, ncol=5)` works in terms of retaining the matrix structure in a `data.frame`. – A5C1D2H2I1M1N2O1R2T1 Jul 25 '12 at 06:38
  • Thank you for the suggestion; it looks like a list will fit what I need. However, will the memory consumed be the same before and after packing those objects into a list (assuming, of course, that I remove the original objects after packing them)? – Joyce Jul 25 '12 at 06:41
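The points made in the comments above can be sketched as follows. This is a minimal illustration in base R, not part of the original thread; the object names (`x`, `res`, `df`) are chosen only for the example.

```r
# A vector of length 2 and a 5 x 2 matrix: their row counts do not match
A <- 1:2
M <- matrix(1:10, nrow = 5, ncol = 2)

# A list places no constraint on the shapes of its elements
x <- list(A = A, M = M)
dim(x$M)   # 5 2: the matrix keeps its dimensions inside the list

# data.frame() needs compatible row counts, so this call raises an error;
# tryCatch() turns the error into a marker string so the script keeps running
res <- tryCatch(data.frame(A = A, M = M), error = function(e) "error")

# Assigning with $<- stores a matrix with matching rows as one matrix column
df <- data.frame(A = 1:2)
df$B <- matrix(1:10, nrow = 2, ncol = 5)
dim(df$B)  # 2 5
ncol(df)   # 2: the matrix counts as a single column
```

If the objects in a subgroup have different lengths or shapes, the list is the container that accepts them without any reshaping.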

1 Answer

Building on what we were discussing in the comments above, here is an example that you should be able to reproduce. Be sure to save all of your work first, because this example deletes the objects in your current workspace.

```r
## SAVE ANY WORK YOU NEED TO BEFORE DOING THIS!
##
## Start with a clean workspace
##
rm(list=ls())
ls()
set.seed(1)
## Make up some data
A = rnorm(10000)
B = sample(letters, 10000, replace=TRUE)
C = matrix(50000, nrow=10000, ncol=5)
## The same data as a data.frame
temp.df = data.frame(A = A)
temp.df$B = B
temp.df$C = C
## The same data as a list
temp.list = list(A, B, C)
##
## How big is each object?
##
sort( sapply(ls(), function(x) { object.size(get(x)) }) )
#     A         B         C temp.list   temp.df 
# 80040     81288    400200    561600    562304 
sum(sort( sapply(ls(), function(x) { object.size(get(x)) }) )[1:3])
# [1] 561528
```

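You can see that the difference in size is marginal: compared with the 561528 bytes of the three separate objects, the list is 72 bytes larger and the data.frame is 776 bytes larger. This holds whether you collect your objects in a list (recommended) or in a data.frame (not recommended for practical purposes, even though a data.frame is itself a list with a class of data.frame).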
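As a rough sketch of how the five subgroups from the question might be organised, the snippet below groups objects into a named list and measures the container overhead. The subgroup names (`subgroup1`, `subgroup2`) and the data are made up for illustration, and the exact byte counts may vary between R versions.

```r
set.seed(1)
A <- rnorm(10000)
C <- matrix(rnorm(50000), nrow = 10000, ncol = 5)

# Hypothetical subgroup names; a named list keeps related objects together
groups <- list(
  subgroup1 = list(A = A, C = C),
  subgroup2 = list(letters = letters)
)

# Elements are retrieved by name
str(groups$subgroup1, max.level = 1)

# A data.frame is itself a list that carries the class "data.frame"
df <- data.frame(A = A)
is.list(df)
class(df)

# Size of one subgroup's container versus the sum of its parts
parts <- as.numeric(object.size(A)) + as.numeric(object.size(C))
whole <- as.numeric(object.size(groups$subgroup1))
overhead <- whole - parts  # bytes added by the list and its names
overhead / parts           # a small fraction of the data size
```

Using names rather than positions means each object can be reached as `groups$subgroup1$A`, which tends to be easier to maintain than many loose objects in the workspace.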

See also: here and here.

A5C1D2H2I1M1N2O1R2T1