I'm trying to understand how memory is managed in the GMP project. For reference, the GMP documentation gives the following example for how their mpz_t type should be declared, initialised and cleared:

{
  mpz_t integ;
  mpz_init (integ);
  …
  mpz_add (integ, …);
  …
  mpz_sub (integ, …);

  /* Unless the program is about to exit, do ... */
  mpz_clear (integ);
}

Now, mpz_t is defined as follows:

typedef struct
{
  int _mp_alloc;    /* Number of *limbs* allocated and pointed
                   to by the _mp_d field.  */
  int _mp_size;     /* abs(_mp_size) is the number of limbs the
                   last field points to.  If _mp_size is
                   negative this is a negative number.  */
  mp_limb_t *_mp_d; /* Pointer to the limbs.  */
} __mpz_struct;

typedef __mpz_struct *mpz_ptr;
typedef __mpz_struct mpz_t[1];

and mpz_init is defined as:

void
mpz_init (mpz_ptr x) __GMP_NOTHROW
{
  static const mp_limb_t dummy_limb=0xc1a0;
  ALLOC (x) = 0;
  PTR (x) = (mp_ptr) &dummy_limb;
  SIZ (x) = 0;
}

Initially, I was confused as to why there are no mallocs in mpz_init. But after staring at the typedef for mpz_t for a while, my understanding is that the space for the __mpz_struct itself is being stack-allocated rather than heap-allocated (that is, mpz_t is an array of size 1, so declaring one reserves a single struct on the stack, and the array name decays to a pointer to that struct). Having seen this question, the suggestion is that using an array of size 1 is an ISO C90 idiom, and the top answer of this question seems to imply that we shouldn't go around doing this. Moreover, to me, this seems like a "too-good-to-be-true" way of getting out of manual memory allocation.

Questions

  1. Why would you use this idiom? What benefits does it hold over mallocing? What disadvantages does it have?
  2. Is it a historical thing? In C11/C23, would you do something else instead?
steeps
  • I suspect the reason is to avoid casual copying by assignment. If you have `mpz_t x;` and `mpz_t y;` then `x = y;` would result in a compiler error. As described in the manual under "Variable Conventions": "_For both behavior and efficiency reasons, it is discouraged to make copies of the GMP object itself (either directly or via aggregate objects containing such GMP objects). If copies are done, all of them must be used read-only; using a copy as the output of some function will invalidate all the other copies._" – Ian Abbott Feb 12 '21 at 17:59
  • Slightly older versions of GMP had a small malloc in mpz_init. But this is too early to do a useful malloc, since we don't have any idea what size will be needed. mpz_init_set, which copies an integer, has a more useful malloc. – Marc Glisse Feb 12 '21 at 18:32
  • I don't think recent versions of the C standard had any changes relevant to this design with an array. C++ interfaces look very different though. – Marc Glisse Feb 12 '21 at 18:35

1 Answer

Defining:

typedef __mpz_struct *mpz_ptr;
typedef __mpz_struct mpz_t[1];

makes both usable as a pointer (an array decays to a pointer to its first element when passed to a function), so both types follow the same calling pattern, with no & needed to take the address of the struct. Passing either one to a function then looks like f(v), whether v was declared as a pointer or as a struct.
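To make the decay concrete, here is a minimal sketch that mimics the two typedefs without depending on GMP. The names num_struct, num_ptr, num_t, num_init, num_set, num_add and num_demo are hypothetical stand-ins invented for this illustration, and the struct layout is deliberately simplified; it is not how GMP stores its limbs.

```c
#include <assert.h>

/* Hypothetical stand-in for __mpz_struct; NOT GMP's real layout. */
typedef struct
{
  int size;     /* 0 if the value is zero, 1 otherwise */
  long value;   /* the number itself, kept inline for simplicity */
} num_struct;

typedef num_struct *num_ptr;  /* pointer form, used for parameters */
typedef num_struct num_t[1];  /* array-of-one form, used for variables */

/* Functions take the pointer form, as mpz_init (mpz_ptr x) does. */
void num_init (num_ptr x) { x->size = 0; x->value = 0; }

void num_set (num_ptr x, long v) { x->value = v; x->size = (v != 0); }

void num_add (num_ptr r, const num_struct *a, const num_struct *b)
{
  num_set (r, a->value + b->value);
}

/* Declares three num_t variables and uses them like pointers. */
long num_demo (void)
{
  num_t a, b, sum;      /* three structs with automatic storage, no malloc */

  num_init (a);         /* a decays to &a[0], so no & is written */
  num_init (b);
  num_init (sum);
  num_set (a, 5);
  num_set (b, 7);
  num_add (sum, a, b);  /* same call shape as if these were num_ptr */

  /* sum = a;  would not compile: arrays are not assignable in C */

  /* The array of one element occupies exactly one struct. */
  assert (sizeof (num_t) == sizeof (num_struct));

  return sum->value;    /* -> works because sum decays to a pointer */
}
```

Calling num_demo() from a main function exercises all three variables. The commented-out assignment also shows the point raised in the comments: because num_t is an array type, accidental copies via = are rejected by the compiler.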

Jean-Baptiste Yunès