2

Is there a simple way to get a size of the record (in terms of the disk space it takes) with activerecord (my db is mysql)?

I have found several (1, 2) answers offering a way of doing it with sql, I wonder if there is something built in into Activerecord, like MyRecord.first.bytesize.

Size of the table would also do, I need an average row size.

The purpose of this is estimating disk space requirements for a database.


UPDATE

I have also found ObjectSpace.memsize_of

require 'objspace'
ObjectSpace.memsize_of(MyRecord.first)

Is the size of the activerecord object equal to the size of the record in the database?


This seems to give the combined size of all the fields. I am casting to String all the fields that do not respond to size, such as timestamps.

record.attributes.keys.map{ |attr_name| record.send(attr_name)}.inject{|sum, attr| sum + (attr.respond_to?(:size) ? attr.size : attr.to_s.size)  }

And here is the comparison of results of these methods for the same record:

  • combined size of all attributes (see above): 222
  • Marshal::dum(record).size: 2678
  • ObjectSpace.memsize_of(record): 128
Community
  • 1
  • 1
bosskovic
  • 2,034
  • 1
  • 14
  • 29

1 Answers1

2

An approximation of size can be found by using:

Marshal::dump(obj).size

Since you said, its for database sizing - The dump has name of classes and instance variables - it may be bit of overhead compared to absolute record size.

You may want to look at this answer for getting MySQL DB size - Database size calculation? and How to Get True Size of MySQL Database?

Based on number of entities currently stored in DB, and size thus determined using above methods - you can extrapolate for sizing.

Community
  • 1
  • 1
Wand Maker
  • 18,476
  • 8
  • 53
  • 87
  • 1
    what is the difference between `Marshal::dump(obj).size` and `ObjectSpace.memsize_of(obj)` ? The former returns 10+ time bigger size. – bosskovic Dec 14 '15 at 13:17
  • I am not very sure. Lets see if anybody else can throw some light. – Wand Maker Dec 14 '15 at 13:24
  • 1
    `ObjectSpace.memsize_of` will be an accurate representation of the size of the Ruby object in memory, and more accurate than `Marshal`. Neither will be the same as the size of the database row. – Jordan Running Dec 14 '15 at 15:37
  • Marshal.dump will include all the instance variables, eg those related to change tracking, associations, various caches etc. Memsize_of counts ruby memory footprint (eg an empty array is 40 bytes on a 64 bit platform) and doesn't include a lot of things – Frederick Cheung Dec 14 '15 at 15:47
  • I thought so, that would explain the difference in size (when compared to other two approaches I listed). – bosskovic Dec 14 '15 at 15:51