Ruby object to_s what is the encoding of the object id?

Question

In Ruby, the to_s on an object includes an encoding of the object's id.

[2] pry(main)> shape = Shape.new(4,4)
=> #<Shape:0x00007fac5eb6afc8 @num_sides=4, @side_length=4>

In the documentation it says

Returns a string representing obj. The default to_s prints the object’s class and an encoding of the object id. https://apidock.com/ruby/Object/to_s

In the example above, the encoding of the object id is 0x00007fac5eb6afc8.

In How does object_id assignment work? they explain

In MRI the object_id of an object is the same as the VALUE that represents the object on the C level.

So I compared to the object_id and it is not the same as the encoding of the object id.

[2] pry(main)> shape = Shape.new(4,4)
=> #<Shape:0x00007fac5eb6afc8 @num_sides=4, @side_length=4>
[3] pry(main)> shape.object_id
=> 70189150066660

What exactly is the encoding of the object id? It does not appear to be the object_id.

For non-integers, the printed value is twice the `object_id`, see https://stackoverflow.com/q/3430280/477037 — Stefan, Aug 17 '18 at 14:57

ForeverZer0 · Accepted Answer · 2018-08-17T17:46:43.297

Think of the object_id, or __id__, as the "pointer" for the object. It is not technically a pointer, but it does contain a unique value that can be used to retrieve the internal C VALUE.

There are patterns to the value it has for some data types, as you can see from its hexadecimal representation in to_s. I will not go into all the details, as there are already numerous answers on SO explaining them, some already linked from the comments, but integers (up to FIXNUM_MAX) have predictable values, and special constants like true, false, and nil will always have the same object_id in every run.

To put it simply, it is nothing more than a number, shown as a hexadecimal (base 16) value, not any actual "encoding" or cipher.

Going to expand upon this a bit more in light of your latest edits to the question. As you posted, the hexadecimal number you see in to_s is the value of the internal C VALUE of the object. VALUE is a C data type (unsigned, pointer size number) that every Ruby object is represented as in C code. As @Stefan pointed out in a comment, for non-integer types (I speak only for MRI version), it is twice the value of the object_id. Not that you probably care, but you can shift the bits of an integer to predict the value for those.
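To illustrate the bit shifting mentioned above, here is a minimal sketch, assuming MRI, where a small integer is tagged by shifting it left one bit and setting the lowest bit. The helper name predicted_small_integer_id is hypothetical and only for illustration. Other implementations are free to behave differently.

```ruby
# Hypothetical helper (not part of Ruby): predicts the object_id that MRI
# gives a small integer, assuming the "shift left one bit, set lowest bit" tagging.
def predicted_small_integer_id(n)
  (n << 1) | 1
end

# Print the prediction next to what the running interpreter actually reports.
[0, 1, 2, 42, -1].each do |n|
  puts "#{n}: predicted #{predicted_small_integer_id(n)}, actual #{n.object_id}"
end
```

On MRI the two columns should agree for integers in the fixnum range. On any other implementation, treat a mismatch as expected rather than as a bug.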

Therefore, using your example.

A value of 0x00007fac5eb6afc8 is simple hexadecimal notation for a number. It uses a base 16 counting system as opposed to the base 10 decimal system we are more used to in everyday life. It is simply a different way of looking at the same number.

So, using that logic.

a = 0x00007fac5eb6afc8
#=> 140378300133320 # Decimal representation

a /= 2 # Remember, for non-integers the object_id is half of this value
#=> 70189150066660   # Your object_id
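The same arithmetic can be run in the other direction, from the object_id back to the string that to_s printed. This is only a sketch using the literal numbers from the question, not a live object, because the "twice the object_id" relationship is an MRI detail of the version in the question and may not hold on other versions or implementations.

```ruby
# Numbers copied from the question; no live object is inspected here.
object_id_from_question = 70189150066660

# Doubling the id gives the number that the default to_s printed.
printed_value = object_id_from_question * 2

# Format as "0x" followed by 16 zero-padded hexadecimal digits.
hex_string = format("0x%016x", printed_value)

puts hex_string
#=> 0x00007fac5eb6afc8
```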

I wonder why the Ruby `object_id` divides the C `VALUE` by two. — BobRodes, Oct 08 '19 at 06:28

score 0 · Answer 2 · answered Aug 17 '18 at 16:01

The best answer you can get is: You don't know, and you shouldn't need to.

Ruby guarantees exactly three things about object IDs:

An object has the same ID during its lifetime.
No two objects have the same ID at the same time.
IDs are integers.
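A small sketch that checks these three guarantees on two objects that are both alive at the same time. It relies only on the behaviour listed above, not on any particular ID value.

```ruby
a = Object.new
b = Object.new

# 1. An object keeps the same ID for as long as it lives.
same_over_time = a.object_id == a.object_id

# 2. Two objects that exist at the same time never share an ID.
distinct_objects = a.object_id != b.object_id

# 3. IDs are integers.
is_integer = a.object_id.is_a?(Integer)

puts same_over_time, distinct_objects, is_integer
```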

In particular, this means that you cannot rely on a specific object having a specific ID (for example, nil having ID 8). It also means that IDs can be re-used. You should think of it as nothing but an opaque identifier.

And, as you quoted, the default Object#to_s uses "some" encoding of the ID.

And that is all you know, and all you should ever rely on. In particular, you should never try to parse IDs or Object#to_s.

So, the ID part of Object#to_s is "some unspecified encoding" of the ID, which itself is "some opaque identifier".

Everything else is deliberately left unspecified, so that different implementations can make different choices that make sense for their specific needs. For example, it would be stupid to tie object IDs to memory addresses, because implementations like JRuby, Opal, IronRuby, MagLev, and Topaz run on platforms where the concept of "memory address" doesn't even exist! And Rubinius uses a moving garbage collector, where objects can move around in memory and thus their addresses change.
