In Ruby, most objects require memory to store their class and instance variables. Once this memory is allocated, Ruby represents each object by this memory location. When the object is assigned to a variable or passed to a method, it is the location of this memory that is passed, not the data stored there. Singleton methods make use of this. When you define a singleton method, Ruby silently replaces the object's class with a new singleton class. Because each object stores its class, Ruby can easily replace an object's class with a new class that implements the singleton methods (and inherits from the original class).
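As a rough illustration of that mechanism, the sketch below attaches a singleton method to one plain object and then inspects its singleton class. The names `make_greeter` and `greet` are made up for this example, and `singleton_class` assumes Ruby 1.9.2 or later.

```ruby
# Hypothetical helper: build a plain, heap-allocated object
# and attach a singleton method to it.
def make_greeter
  obj = Object.new
  def obj.greet
    "hello"
  end
  obj
end

greeter = make_greeter
plain   = Object.new

puts greeter.greet                                 # the singleton method is callable
puts plain.respond_to?(:greet)                     # other Objects are unaffected
puts greeter.singleton_class.superclass == Object  # the singleton class inherits from the original class
puts greeter.class == Object                       # `class` still reports the original class
```

Only `greeter` gains the method, because only its class pointer was redirected to a new singleton class.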
This is no longer true for objects that are immediate values: true, false, nil, all symbols, and integers that are small enough to fit within a Fixnum. Ruby does not allocate memory for instances of these objects, and it does not internally represent them as a location in memory. Instead, it infers the instance of the object from its internal representation. What this means is twofold:
The class of each object is no longer stored in memory at a particular location, and is instead implicitly determined by the type of immediate object. This is why Fixnums cannot have singleton methods.
Immediate objects with the same state (e.g., two Fixnums of integer 2378) are actually the same instance. This is because the instance is determined by this state.
To get a better sense of this, consider the following operations on a Fixnum:
>> x = 3 + 7
=> 10
>> x.object_id == 10.object_id
=> true
>> x.object_id == (15-5).object_id
=> true
Now, consider them using strings:
>> x = "a" + "b"
=> "ab"
>> x.object_id == "ab".object_id
=> false
>> x.object_id == "Xab"[1...3].object_id
=> false
>> x == "ab"
=> true
>> x == "Xab"[1...3]
=> true
The reason the object ids of the Fixnums are equal is that they're immediate objects with the same internal representation. The strings, on the other hand, exist in allocated memory. The object id of each string is derived from the location of its object state in memory, so two distinct strings have different ids even when their contents are equal.
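The first consequence listed earlier, that Fixnums cannot have singleton methods, can be checked in a similar way. The sketch below uses a hypothetical helper, `try_singleton`, that attempts to define a singleton method and reports whether Ruby allowed it; it relies on MRI raising a TypeError for integers and symbols.

```ruby
# Hypothetical helper: try to attach a singleton method to obj
# and report whether Ruby accepted the definition.
def try_singleton(obj)
  def obj.marker
    :ok
  end
  :defined
rescue TypeError
  :refused
end

p try_singleton(Object.new)  # ordinary object with its own class pointer
p try_singleton(2378)        # small integer (immediate value)
p try_singleton(:name)       # symbol (immediate value)
```

Note that nil, true, and false behave differently here: each is the only instance of its class, so MRI can attach the method to NilClass, TrueClass, or FalseClass instead of refusing.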
Some low-level information
To understand this, you have to understand how Ruby (at least 1.8 and 1.9) treats Fixnums internally. In Ruby, all objects are represented in C code by variables of type VALUE. Ruby imposes the following requirements for VALUE:
The type VALUE is the smallest integer of sufficient size to hold a pointer. This means, in C, that sizeof(VALUE) == sizeof(void*).
Any non-immediate object must be aligned on a 4-byte boundary. This means that any object allocated by Ruby will have address 4*i for some integer i. This also means that all pointers have zero values in their two least significant bits.
The first requirement allows Ruby to store both pointers to objects and immediate values in a variable of type VALUE. The second requirement allows Ruby to detect Fixnum and Symbol objects based on the two least significant bits.
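Why alignment frees up the low bits can be seen with plain integer arithmetic. The sketch below treats made-up addresses as ordinary Ruby integers; `low_bits` is a hypothetical helper, not part of Ruby's C API.

```ruby
# Hypothetical helper: the two least significant bits of a word.
def low_bits(word)
  word & 0b11
end

# Any 4-byte-aligned address has the form 4*i, so its low bits are 00.
addresses = (0...8).map { |i| 4 * i }
puts addresses.all? { |addr| low_bits(addr).zero? }

# The remaining patterns (01, 10, 11) never occur in an aligned pointer,
# so a word carrying one of them can be treated as an immediate value.
tagged = [0x1000 | 0b01, 0x1000 | 0b10, 0x1000 | 0b11]
puts tagged.map { |word| low_bits(word) }.inspect
```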
To make this more concrete, consider the internal binary representation of a Ruby object z, which we'll call Rz, in a 32-bit architecture:
MSB                                  LSB
 3           2            1
1098 7654 3210 9876 5432 1098 7654 32 10
XXXX XXXX XXXX XXXX XXXX XXXX XXXX AB CD
Ruby then interprets Rz, the representation of z, as follows:
If D==1, then z is a Fixnum. The integer value of this Fixnum is stored in the upper 31 bits of the representation, and the signed integer is recovered by performing an arithmetic right shift by one bit.
Three special representations are tested (all with D==0):
- if Rz==0, then z is false
- if Rz==2, then z is true
- if Rz==4, then z is nil
If ABCD == 1110, then z is a Symbol. The symbol is converted into a unique ID by shifting out the eight least-significant bits (i.e., Rz>>8 in C). On 32-bit architectures, this allows 2^24 different IDs (over 16 million). On 64-bit architectures, this allows 2^56 different IDs.
Otherwise, Rz represents an address in memory for an instance of a Ruby object, and the type of z is determined by the class information at that location.
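The decoding rules above can be sketched in Ruby by treating a word as a plain signed integer. This is only an illustration of the layout described in the text (Ruby 1.8 and 1.9); `SYMBOL_TAG`, `encode_fixnum`, `decode_fixnum`, and `classify` are hypothetical names, not functions from the interpreter.

```ruby
SYMBOL_TAG = 0b1110  # low four bits (ABCD) that mark a Symbol

# Store a small integer in the upper bits and set D to 1.
def encode_fixnum(n)
  (n << 1) | 1
end

# Ruby's >> on integers is an arithmetic shift, so the sign survives.
def decode_fixnum(rz)
  rz >> 1
end

# Interpret a word rz, testing the rules in the order given above.
def classify(rz)
  return [:fixnum, decode_fixnum(rz)] if (rz & 1) == 1
  return [:false] if rz == 0
  return [:true]  if rz == 2
  return [:nil]   if rz == 4
  return [:symbol, rz >> 8] if (rz & 0b1111) == SYMBOL_TAG
  [:pointer, rz]
end

p classify(encode_fixnum(10))        # a positive Fixnum
p classify(encode_fixnum(-3))        # a negative Fixnum
p classify(4)                        # one of the special representations
p classify((5 << 8) | SYMBOL_TAG)    # a Symbol whose ID is 5
p classify(0x1000)                   # an aligned address, treated as a pointer
```

The order of the tests matters: 0 and 4 would otherwise look like aligned pointers, so the special representations are checked before the pointer case.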