1

I read that a PyObject has a type, a value and a reference count for garbage collection. But the following shows that consecutive integer objects sit 32 bytes apart, which on a 64-bit OS suggests there is one more field. What would that be?

>>> hex(id(3))
'0x1595ae90130'
>>> hex(id(4))
'0x1595ae90150'
>>> hex(id(5))
'0x1595ae90170'  

You'll observe that the IDs are 32 bytes apart.

Leon Chang

2 Answers

5

The function id returns an address; it doesn't tell you the size of the object. The 32-byte difference you're seeing here, 0x1595ae90150 - 0x1595ae90130, is the spacing between two objects, not the size of either one. To get the size of an object you can use getsizeof in the sys module:

import sys
x = 5
print(sys.getsizeof(x))
# 28
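
To connect the two numbers, here is a minimal sketch that rounds the reported size up to an allocation boundary. The 16-byte alignment is an assumption (it matches what the comments on this answer suggest for typical 64-bit CPython builds), `round_up` is a hypothetical helper, and treating `id` as an address is a CPython implementation detail:

```python
import sys

def round_up(n, alignment=16):
    """Round n up to the next multiple of alignment (assumed 16 bytes here)."""
    return (n + alignment - 1) // alignment * alignment

x = 5
size = sys.getsizeof(x)       # bytes the int object itself reports
spacing = id(4) - id(3)       # CPython only: distance between two small ints

print("getsizeof:", size)
print("rounded up to alignment:", round_up(size))
print("id spacing:", spacing)
```

If the assumption holds, a 28-byte object occupies a 32-byte slot, which would explain the spacing seen in the question without any extra field.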
0x263A
  • 2
    They're right that it effectively "takes 32 bytes", though (due to the memory alignment). – Kelly Bundy Jul 29 '22 at 00:05
  • 1
    I would also note that, as int in Python has no max value, the return of getsizeof may be larger. getsizeof(0) == 24. As you add decimal digits, the size increases by 4 bytes every 9 digits, with one exception: going from 19 to 29 digits was the one 10-digit jump I found before a size increase, checking up to 256 digits. – nigh_anxiety Jul 29 '22 at 00:50
  • @nigh_anxiety It's 4 bytes for every digit (using base 2^30). – Kelly Bundy Jul 29 '22 at 01:08
  • @KellyBundy if you think there's something to add to this answer I wouldn't mind an edit. My knowledge of memory allocation in Python falls in the realm of "know more than most, less than some" – 0x263A Jul 29 '22 at 01:12
  • 1
    Well, as far as I know, typical Python aligns objects so that their address is always a multiple of 16, but I don't know whether/where that's documented, so I wouldn't feel comfortable putting it in an answer, let alone someone else's. – Kelly Bundy Jul 29 '22 at 01:31
0

From this post and code,

#define PyLong_SHIFT    30  /* when PYLONG_BITS_IN_DIGIT == 30 */

The value of an integer is equal to SUM(for i=0 through abs(ob_size)-1) ob_digit[i] * 2**(PyLong_SHIFT*i) where

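struct _longobject {
    PyObject_VAR_HEAD
    uint32_t ob_digit[1];           /* one 4-byte digit per 30 bits of value */
};
/* Simplified expansion; 24 bytes on a 64-bit build (three 8-byte fields) */
#define PyObject_VAR_HEAD           \
    Py_ssize_t ob_refcnt;           \
    struct _typeobject *ob_type;    \
    Py_ssize_t ob_size;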
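
The digit-sum formula above can be checked in pure Python. This is only a sketch of the arithmetic, not CPython's actual code: `to_digits` and `from_digits` are hypothetical helpers, and the 30-bit digit width is assumed to match the build (it can be read from `sys.int_info.bits_per_digit`).

```python
import sys

SHIFT = 30  # assumed digit width, as in PyLong_SHIFT above

def to_digits(n, shift=SHIFT):
    """Split abs(n) into base-2**shift digits, least significant first."""
    n = abs(n)
    mask = (1 << shift) - 1
    digits = []
    while n:
        digits.append(n & mask)
        n >>= shift
    return digits

def from_digits(digits, shift=SHIFT):
    """Rebuild the value: sum of ob_digit[i] * 2**(shift * i)."""
    return sum(d << (shift * i) for i, d in enumerate(digits))

value = 12345678901234567890
digits = to_digits(value)
print("bits per digit on this build:", sys.int_info.bits_per_digit)
print("digits:", digits)
print("round trip ok:", from_digits(digits) == value)
```

Under this scheme 0 needs no digits at all, 2**30 - 1 fits in one digit, and 2**30 needs two.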

Below is Python code demonstrating that an integer has a variable size and that each 4 bytes of digit storage holds 30 bits of the value.

from decimal import Decimal

print("Check PyObject <int> fields and sizes 2**30, 60, 90 ==>>")
shift = 30                # PyLong_SHIFT: value bits stored per 4-byte digit
ob_size = 50              # how many digit counts to test
refCntObTypeSz = 8 * 3    # PyObject_VAR_HEAD: three 8-byte fields

def getSize(i, value):
    """Print the size of value and the fraction of its digit storage in use."""
    size = value.__sizeof__()
    # Capacity if all 32 bits of every 4-byte digit were usable (32-bit based)
    maxValue = Decimal(2**((size - refCntObTypeSz)*8))
    ratio = value / maxValue
    print(f'i = {i:2d} : allocated size {size:3d} bit_length {value.bit_length():4d} value {hex(value)} =>\tactual/allocated = {ratio:0.3e}')
    return size

zero = 0
prev = zero.__sizeof__()  # 24
for i in range(0, ob_size):
    # Largest value that still fits in i digits (30-bit based)
    value = (1 << (shift * i)) - 1
    nxt = getSize(i, value)
    if nxt != prev:
        print(f"A gap of prev size {prev} != next size {nxt} bytes jump")
    # One more than that needs i + 1 digits, so the size should grow by 4 bytes
    value += 1
    nxt = getSize(i, value)
    if nxt != prev + 4:
        print(f"An increment of != 4 bytes: prev size {prev} and next size {nxt}")
    prev = nxt

output:

Check PyObject <int> fields and sizes 2**30, 60, 90 ==>>
i =  0 : allocated size  24 bit_length    0 value 0x0 =>    actual/allocated = 0.000e+3
i =  0 : allocated size  28 bit_length    1 value 0x1 =>    actual/allocated = 2.328e-10
i =  1 : allocated size  28 bit_length   30 value 0x3fffffff => actual/allocated = 2.500e-1
i =  1 : allocated size  32 bit_length   31 value 0x40000000 => actual/allocated = 5.821e-11
i =  2 : allocated size  32 bit_length   60 value 0xfffffffffffffff =>  actual/allocated = 6.250e-2
i =  2 : allocated size  36 bit_length   61 value 0x1000000000000000 => actual/allocated = 1.455e-11
i =  3 : allocated size  36 bit_length   90 value 0x3ffffffffffffffffffffff =>  actual/allocated = 1.562e-2
i =  3 : allocated size  40 bit_length   91 value 0x40000000000000000000000 =>  actual/allocated = 3.638e-12
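
The sizes in this output follow a simple rule, sketched below with the ratio computed against the 30-bit capacity of each digit rather than 32 bits. The 24-byte header and 4-byte digits are assumptions taken from the output above; other CPython versions or builds may report different sizes. `digits_needed`, `predicted_size` and `fill_ratio` are hypothetical helpers.

```python
HEADER_BYTES = 24     # assumed: three 8-byte header fields, as in the output above
DIGIT_BYTES = 4       # assumed: one uint32_t per digit
BITS_PER_DIGIT = 30   # assumed: PyLong_SHIFT

def digits_needed(n):
    """Number of 30-bit digits needed for abs(n); 0 needs none."""
    return -(-abs(n).bit_length() // BITS_PER_DIGIT)  # ceiling division

def predicted_size(n):
    """Header plus 4 bytes per digit."""
    return HEADER_BYTES + DIGIT_BYTES * digits_needed(n)

def fill_ratio(n):
    """Fraction of the 30-bit-per-digit capacity that abs(n) uses."""
    count = digits_needed(n)
    if count == 0:
        return 0.0
    return abs(n) / 2 ** (BITS_PER_DIGIT * count)

for n in (0, 1, 2**30 - 1, 2**30, 2**60 - 1, 2**60):
    print(f"bit_length {n.bit_length():3d}  predicted size {predicted_size(n):3d}  "
          f"fill {fill_ratio(n):.3e}")
```

With a 30-bit base, the largest value for a given digit count fills almost all of the usable capacity, while the 32-bit base used above caps the ratio at 2**-2 per digit.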
Leon Chang
  • Doesn't make much sense to test with a wrong base. – Kelly Bundy Jul 30 '22 at 03:21
  • For the final question, you can use `dir(x)` to show all the properties of `x`. That said, follow-up questions should really be posted in a new question, not included in your answer. – John Kugelman Jul 30 '22 at 05:53
  • From the dir(x) list, I cannot find any attributes related to Type, Reference count or value field. I found `x.__sizeof__()`, `x.bit_length()` worked. That's about it. Any idea? – Leon Chang Jul 31 '22 at 04:01