
I can see lots of copied lore claiming that functions registered in the .init_array section receive the command-line arguments argc and argv, like main(), but I cannot find any published documentation that confirms this.

Yes, for clarity, the function itself is not "declared in" the .init_array, but a pointer to the function is declared there, "registering" the function, and it is called by some iterator during start-up. Question remains: show me some documentation for the argument list passed in by that iterator.
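To make the registration concrete, here is a minimal sketch, assuming GCC or Clang on an ELF target with glibc. The names `capture_args`, `capture_args_entry` and the `saved_*` variables are hypothetical, and the three-argument signature is exactly the convention I am asking about, so a different libc may pass nothing at all:

```c
#include <stddef.h>

/* Copies of whatever the start-up iterator passed in (hypothetical names). */
static int saved_argc = 0;
static char **saved_argv = NULL;
static char **saved_envp = NULL;

/* Assumption: the iterator calls each entry as f(argc, argv, envp).
 * This is the glibc behaviour under discussion, not a documented guarantee. */
static void capture_args(int argc, char **argv, char **envp)
{
    saved_argc = argc;
    saved_argv = argv;
    saved_envp = envp;
}

/* The function is not "in" .init_array; this pointer to it is.
 * `used` stops the compiler from discarding the otherwise unreferenced object. */
__attribute__((section(".init_array"), used))
static void (*capture_args_entry)(int, char **, char **) = capture_args;
```

Linked into a program or shared object, `capture_args` should run before main(), after which `saved_argv` points at the same array main() receives (again, under the glibc assumption).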

My intent is to change these arguments from a dynamic library in a subtle but generally safe way, so I want to find the "real deal" in memory - not from /proc/self/.

For more information, follow the link below.

Some Stack Overflow lore: "Accessing main arguments outside of main on Linux"

Even my favoured Oracle documentation ( docs.oracle.com/cd/E23824_01/html/819-0690/chapter3-8.html ) only mentions that the functions get called, with no promise of what arguments they receive. The same goes for the ELF and GCC documentation, as far as I can see.

In the land of C/C++ UB paranoia, I ideally need some certainty that this is documented behaviour before I go ahead with it. Does such documentation exist? Can it be implied in some way?


Summary of comments/answers so far:

At least for GNU libc, a relevant change occurred with this patch: BZ #974. https://sourceware.org/pipermail/libc-alpha/2005-July/019240.html (It is mentioned in glibc's ChangeLog.old/ChangeLog.16 entry 2005-04-13 H.J. Lu.) – Ian Abbott

To me, this demonstrates that the glibc maintainers were aware of the requirement to pass argc/argv/envp (that it is not accidental) and extended it to registrations in the main executable. It also tells us that it was already working for dynamic libraries prior to that date.

It is an interesting question whether this binds other libc implementers to follow the same pattern.

Nimantha
Gem Taylor
  • Looking at the (specific) code responsible for calling these functions, I'd say they should be `void foo(void)`: http://static.grumpycoder.net/pixel/uC-sdk-doc/initfini_8c_source.html – Eugene Sh. Sep 24 '21 at 14:45
  • Yes, that is always safe as a function signature :-) (at least in C, not so in pascal) – Gem Taylor Sep 24 '21 at 14:46
  • At least for GNU libc, a relevant change occurred with this patch: [BZ #974](https://sourceware.org/pipermail/libc-alpha/2005-July/019240.html). (It is mentioned in glibc's ChangeLog.old/ChangeLog.16 entry 2005-04-13 H.J. Lu.) – Ian Abbott Sep 24 '21 at 17:08
  • Thanks @IanAbbott That is the sort of strong evidence I want/need. – Gem Taylor Sep 24 '21 at 17:15
  • From my perspective I want to get at the arguments from a dynamic library, so the fix in 2005 to the main exe is not a worry. – Gem Taylor Sep 24 '21 at 17:19
  • @GemTaylor Oh, then for "elf/dl-init.c" the passing of arguments and environment variables to functions specified by DT_INIT, DT_INIT_ARRAY (and also DT_PREINIT_ARRAY, but that is ignored for shared objects) was implemented on 2000-03-30 by Ulrich Drepper (see ChangeLog.old/ChangeLog.11 line 12226 ff.). – Ian Abbott Sep 24 '21 at 18:03
  • @IanAbbott It is a pity about the DT_PREINIT_ARRAY not being available for shared objects. In the end, registering in DT_INIT_ARRAY meant the call came to our code too late to be very useful. – Gem Taylor Sep 27 '21 at 11:35
  • https://maskray.me/blog/2021-11-07-init-ctors-init-array – Arto Bendiken Jul 28 '22 at 10:38
  • Thanks @ArtoBendiken - that looks like an interesting summary. – Gem Taylor Jul 28 '22 at 16:22

1 Answer

I've found this interesting article about the start-up procedure of Linux programs, by Patrick Horgan. I cannot guarantee the correctness of this source, though.

At least, it explains the code behind the .init_array section:

void __libc_csu_init (int argc, char **argv, char **envp) {
  _init ();

  const size_t size = __init_array_end - __init_array_start;
  for (size_t i = 0; i < size; i++) {
      (*__init_array_start [i]) (argc, argv, envp);
  }
}

It appears that __libc_csu_init() function first calculates the number of elements inside .init_array section, and then calls every function pointer with arguments argc, argv and envp. This function (__libc_csu_init()) is called before main().
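To illustrate the calling convention in isolation, here is a small portable model of that loop. It is only a sketch: `init_fn`, `run_init_array`, `init_a`, `init_b` and the counters are hypothetical names, and an ordinary array stands in for the linker-provided `__init_array_start`/`__init_array_end` bounds.

```c
#include <assert.h>
#include <stddef.h>

/* Signature used for .init_array entries in the glibc loop quoted above. */
typedef void (*init_fn)(int argc, char **argv, char **envp);

/* Hypothetical bookkeeping so the effect of each call can be observed. */
static int calls = 0;
static int last_argc = 0;
static char **last_argv = NULL;

/* Hypothetical initializer that records the arguments it was given. */
static void init_a(int argc, char **argv, char **envp)
{
    (void)envp;
    calls++;
    last_argc = argc;
    last_argv = argv;
}

/* Hypothetical initializer that ignores its arguments entirely. */
static void init_b(int argc, char **argv, char **envp)
{
    (void)argc; (void)argv; (void)envp;
    calls++;
}

/* Model of the iterator: call every entry in [start, end) in order,
 * handing each one the same argc, argv and envp. */
static void run_init_array(init_fn *start, init_fn *end,
                           int argc, char **argv, char **envp)
{
    assert(start <= end);
    const size_t count = (size_t)(end - start);
    for (size_t i = 0; i < count; i++)
        start[i](argc, argv, envp);
}
```

Calling `run_init_array(table, table + 2, argc, argv, envp)` with `table` holding `init_a` and `init_b` invokes both in order with identical arguments, which is what the real loop does for each registered pointer.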

NOTE: the .init_array section appears to be specific to the ELF binary format.


Update

It appears that the implementation of __libc_csu_init() (and, more generally, how .init_array functions are called) is platform-dependent and libc-dependent.

However, GLIBC on Linux does call the functions with the desired arguments, as you can see from its source code.

Also, according to the GLIBC changelog, this behavior was introduced for the main executable in 2005.

Luca Polito
  • Thanks Luca. That is a useful and readable tutorial. And at least it explains the difference between .init_array and .preinit_array – Gem Taylor Sep 24 '21 at 16:51
  • I was trying to work out how ancient it was... the asm examples are 32 bit, so that dates it a bit. Page has (C) 2011 so not that old! – Gem Taylor Sep 24 '21 at 17:02
  • @GemTaylor It seems to date from 2005-04-13 for GNU libc. Not sure about other libc implementations. – Ian Abbott Sep 24 '21 at 17:13
  • I've updated my answer with some additional info. @IanAbbott That's correct. – Luca Polito Sep 24 '21 at 17:15