I've been fiddling with clone calls, and I noticed three different outcomes depending on how much stack I allocate for the child. The following demo allocates a stack of n bytes, where n is passed as an argument, then attempts to clone.
foo.c:
#define _GNU_SOURCE
#include <stdlib.h>
#include <unistd.h>
#include <sched.h>
#include <errno.h>

int child(void *arg)
{
    (void)arg;
    write(STDOUT_FILENO, "carpe momentum\n", 15);
    return 0;
}

int main(int argc, char **argv)
{
    long stacksize;
    pid_t pid;
    void *stack;

    if (argc < 2)
        return 1;
    errno = 0;
    stacksize = strtol(argv[1], NULL, 0);
    if (errno != 0)
        return 1;
    stack = malloc(stacksize);
    if (stack == NULL)
        return 1;
    pid = clone(child, stack + stacksize, 0, NULL);
    if (pid == -1)
        return 1;
    write(STDOUT_FILENO, "success\n", 8);
    return 0;
}
Here are my observations:
$ cc -o foo foo.c
$ ./foo 0
Segmentation fault
$ ./foo 23
Segmentation fault
$ ./foo 24
success
$ ./foo 583
success
$ ./foo 584
success
carpe momentum
$ ./foo 1048576 #1024 * 1024, amount suggested by man-page example
success
carpe momentum
Every size I sampled from 0 to 23 segfaulted. For every size I sampled from 24 to 583, the parent succeeded but the child printed nothing. Any reasonable size of 584 or more lets both succeed.
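To find out how the silent child actually ends, one option is to give it an exit signal and wait for it. The following is a minimal sketch, not part of foo.c: `run_on_stack` and `report` are hypothetical helper names, and the only intended changes from the demo are passing `SIGCHLD` in the flags argument (so that a plain `waitpid` can reap the child) and decoding the wait status. The stack still comes from `malloc`, as in the demo.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int child(void *arg)
{
    (void)arg;
    write(STDOUT_FILENO, "carpe momentum\n", 15);
    return 0;
}

/* Hypothetical helper (not in foo.c): run child() on a malloc'd stack of
 * stacksize bytes and store its wait status in *status.
 * Returns 0 on success, -1 if malloc, clone, or waitpid failed. */
static int run_on_stack(size_t stacksize, int *status)
{
    char *stack;
    pid_t pid;
    int rc;

    assert(status != NULL);

    stack = malloc(stacksize);
    if (stack == NULL)
        return -1;

    /* The stack grows down on x86, so pass the top of the block.
     * SIGCHLD as the exit signal lets an ordinary waitpid() reap the child. */
    pid = clone(child, stack + stacksize, SIGCHLD, NULL);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    rc = (waitpid(pid, status, 0) == pid) ? 0 : -1;
    free(stack);
    return rc;
}

/* Hypothetical helper: print whether the child exited or was killed. */
static void report(int status)
{
    if (WIFEXITED(status))
        printf("child exited with status %d\n", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        printf("child killed by signal %d\n", WTERMSIG(status));
}
```

Calling `run_on_stack` with the size from `argv[1]` and then `report` on the resulting status should distinguish a child that exited normally from one that was killed by a signal. I make no claim here about what it prints for the small sizes.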
Disassembly suggests that child only uses 16 bytes of stack space, plus at least 16 more to call write. But that's already more than the 24 bytes needed to stop segfaulting.
$ objdump -d foo
# ...
080484cb <child>:
 80484cb:   55                      push   %ebp
 80484cc:   89 e5                   mov    %esp,%ebp
 80484ce:   83 ec 08                sub    $0x8,%esp
 80484d1:   83 ec 04                sub    $0x4,%esp
 80484d4:   6a 0f                   push   $0xf
 80484d6:   68 50 86 04 08          push   $0x8048650
 80484db:   6a 01                   push   $0x1
 80484dd:   e8 be fe ff ff          call   80483a0 <write@plt>
 80484e2:   83 c4 10                add    $0x10,%esp
 80484e5:   b8 00 00 00 00          mov    $0x0,%eax
 80484ea:   c9                      leave
 80484eb:   c3                      ret
# ...
This prompts several overlapping questions.
- Why doesn't clone segfault between 24 and 583 bytes of stack?
- How does child fail silently with too little stack?
- What is all that stack space used for?
- What is the significance of 24 and 584 bytes? How do they vary on different systems and implementations?
- Can I calculate a minimum stack requirement? Should I?
I am on an i686 Debian system:
$ uname -a
Linux REDACTED 3.16.0-4-686-pae #1 SMP Debian 3.16.7-ckt25-2+deb8u3 (2016-07-02) i686 GNU/Linux