
I am working on some legacy C code. The original code was written in the mid-90s, targeting Solaris and Sun's C compiler of that era. The current version compiles under GCC 4 (albeit with many warnings), and it seems to work, but I'm trying to tidy it up -- I want to squeeze out as many latent bugs as possible as I determine what may be necessary to adapt it to 64-bit platforms, and to compilers other than the one it was built for.

One of my main activities in this regard has been to ensure that all functions have full prototypes (which many did not have), and in that context I discovered some code that calls a function (previously un-prototyped) with fewer arguments than the function definition declares. The function implementation does use the value of the missing argument.

Example:

impl.c:

int foo(int one, int two) {
  if (two) {
      return one;
  } else {
      return one + 1;
  }
}

client1.c:

extern foo();
int bar() {
  /* only one argument(!): */
  return foo(42);
}

client2.c:

extern int foo();
int (*foop)() = foo;
int baz() {
  /* calls the same function as does bar(), but with two arguments: */
  return (*foop)(17, 23);
}

Questions: is the result of a function call with missing arguments defined? If so, what value will the function receive for the unspecified argument? Otherwise, would the Sun C compiler of ca. 1996 (for Solaris, not VMS) have exhibited a predictable implementation-specific behavior that I can emulate by adding a particular argument value to the affected calls?

John Bollinger
  • I don't suppose you have access to a Sun box with a 1996-ish Solaris installation, do you? It would be great to blackbox it and see what's going on. – paddy Jul 11 '13 at 22:01
  • If the missing arguments have no adverse effects on the result of the function, why not fill them in with blanks? 0 or `NULL` etc. – Kninnug Jul 11 '13 at 22:02
  • Any chance you've overlooked a preprocessor macro? – Robert Harvey Jul 11 '13 at 22:02
  • @kninnug: That's not really a viable option without understanding why this is happening. – Robert Harvey Jul 11 '13 at 22:03
  • Possible duplicate of http://stackoverflow.com/questions/5929711/c-function-with-no-parameters-behavior – Jimbo Jul 11 '13 at 22:14
  • @paddy: no, I do not have hardware or compiler/OS with which to test behavior in the original target environment. – John Bollinger Jul 12 '13 at 13:24
  • @kninnug: the function uses the value of the sometimes-not-passed argument. I am trying to determine whether in the original target environment it works reliably by leveraging implementation-specific behavior (which I can then emulate) or whether it fails sometimes (in which case I have to figure out what the desired behavior was). – John Bollinger Jul 12 '13 at 13:28
  • @Robert Harvey: good question, but as far as I can tell, no, I have not overlooked a macro. – John Bollinger Jul 12 '13 at 13:32
  • @Jimbo: the question you refer to touches on the same topics, but the case is the opposite: parameters passed in a function call when the function definition doesn't declare any. I can only wish that was my problem. – John Bollinger Jul 12 '13 at 13:36

4 Answers


EDIT: I found a Stack Overflow thread, "C function with no parameters behavior", which gives a very succinct, specific, and accurate answer. PMG's comment at the end of that answer talks about UB. Below are my original thoughts, which I think are along the same lines and explain why the behaviour is UB.

Questions: is the result of a function call with missing arguments defined?

I would say no. The reason is that I think the function will operate as if it had the second parameter but, as explained below, that second parameter could just be junk.

If so, what value will the function receive for the unspecified argument?

I think the values received are undefined. This is why you could have UB.

There are two general ways of parameter passing that I'm aware of... (Wikipedia has a good page on calling conventions)

  1. Pass by register. I.e., the ABI (Application Binary Interface) for the platform will say that registers x & y, for example, are for passing in parameters, and any beyond that get passed via the stack...
  2. Everything gets passed via stack...

Thus, when you give one module a declaration of the function with an "...unspecified (but not variable) number of parameters..." (the extern declaration), the caller places only as many arguments as you give it (in this case 1) in the registers or stack locations that the real function will look in to get its parameter values. The location for the second parameter, which the caller never fills in, therefore contains essentially random junk.

EDIT: Based on the other Stack Overflow thread I found, I have amended the above: where I originally said the extern declared a function with no parameters, it now says it declared a function with an "unspecified (but not variable) number of parameters".

When the program jumps to the function, that function assumes the parameter passing mechanism has been correctly obeyed, so it looks in either the registers or the stack and uses whatever values it finds, assuming them to be correct.
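The mechanism described above can be sketched with a toy model. This is not how any real ABI is implemented: the global array `arg_regs` and the helper names `callee_foo`, `call_with_one`, and `call_with_two` are made up for illustration, and the array merely stands in for the argument registers (or stack slots). The point is only that a slot the caller never writes keeps whatever an earlier call left in it.

```c
/* Toy model only: two "argument registers" modeled as a global array.
   All names here are hypothetical; a real ABI does this in hardware
   registers or on the stack. */
static int arg_regs[2];

/* Stands in for foo(one, two): it trusts that both slots were filled. */
static int callee_foo(void) {
    int one = arg_regs[0];
    int two = arg_regs[1];
    return two ? one : one + 1;
}

/* A correct call site: both slots are written before the jump. */
int call_with_two(int a, int b) {
    arg_regs[0] = a;
    arg_regs[1] = b;
    return callee_foo();
}

/* The buggy call site: only the first slot is written, so the callee
   sees whatever value the second slot held beforehand. */
int call_with_one(int a) {
    arg_regs[0] = a;
    return callee_foo();
}
```

In this model the result of `call_with_one(42)` depends on which call ran before it, which is the practical meaning of "random junk" here. In real C the mismatched call is undefined behaviour, so even this much predictability should not be assumed.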

Otherwise, would the Sun C compiler of ca. 1996 (for Solaris, not VMS) have exhibited a predictable implementation-specific behavior

You'd have to check your compiler documentation, but I doubt it. The extern declaration would be trusted completely, so I doubt the registers or stack, depending on the parameter passing mechanism, would get correctly initialised.

Community
Jimbo
  • Thanks. It looks like the SPARC ABI specifies pass by register for up to six parameters, so that's what the code was originally dealing with. I haven't yet turned up compiler documentation, but I'm inclined to agree that the compiler is unlikely to modify registers that don't correspond (it thinks) to function arguments. – John Bollinger Jul 12 '13 at 14:08
  • I know! I'll just call random() to get a value for the second argument! :) – John Bollinger Jul 12 '13 at 14:10

If the number or the types of arguments (after default argument promotions) do not match the ones used in the actual function definition, the behavior is undefined.

What will happen in practice depends on the implementation. The values of missing parameters will not be meaningfully defined (assuming the attempt to access missing arguments will not segfault), i.e. they will hold unpredictable and possibly unstable values.

Whether the program will survive such incorrect calls will also depend on the calling convention. A "classic" C calling convention, in which the caller is responsible for placing the parameters onto the stack and removing them from there, will be less crash-prone in the presence of such errors. The same can be said about calls that use CPU registers to pass arguments. Meanwhile, a calling convention in which the function itself is responsible for cleaning the stack will crash almost immediately.
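As a minimal sketch of how full prototypes remove this class of error, here is the question's example with a prototype visible to every caller. With the prototype in scope, a call such as `foo(42)` is rejected at compile time for having too few arguments, so each call site has to supply a value explicitly. The `0` passed in `bar` below is an assumption chosen purely for illustration; the right value has to come from understanding what the original code intended.

```c
/* A full prototype, as it would appear in a header shared by the
   definition and every caller. */
int foo(int one, int two);

int foo(int one, int two) {
    if (two) {
        return one;
    }
    return one + 1;
}

/* With the prototype in scope, foo(42) no longer compiles, so the
   caller must pick the second argument. The 0 here is an ASSUMPTION
   for illustration, not the value the legacy code needs. */
int bar(void) {
    return foo(42, 0);
}

/* A function pointer with a full parameter list is checked the same way. */
static int (*foop)(int, int) = foo;

int baz(void) {
    return (*foop)(17, 23);
}
```

Declaring `bar` and `baz` with `(void)` rather than `()` gives them proper prototypes too, so calls to them are checked as well.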

AnT stands with Russia
  • Thanks, that's pretty much what I thought. It looks like the SPARC ABI calls for the first six arguments to be passed in registers, but I don't find anything about touching registers from that group that do not correspond to arguments. Probably, then, the called function gets random junk if the caller doesn't specify all the arguments. – John Bollinger Jul 12 '13 at 14:02

It is very unlikely that the bar function ever gave consistent results in the past. The only thing I can imagine is that it was always called on fresh stack space and the stack space was cleared upon startup of the process, in which case the second parameter would be 0. Or the difference between returning one and one+1 didn't matter much in the bigger scope of the application.

If it really is like you depict in your example, then you are looking at a big fat bug. In the distant past there was a coding style where vararg functions were implemented by specifying more parameters than passed, but just as with modern varargs you should not access any parameters not actually passed.
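For comparison, a modern variadic function follows the same rule: it may read only the arguments the caller actually passed. The sketch below is a made-up example (`sum_ints` is not from the code in question) in which the first parameter tells the function how many `int` arguments follow, so it never reads past what was supplied.

```c
#include <stdarg.h>

/* Hypothetical example: 'count' says how many int arguments follow.
   The function reads exactly that many and no more. */
int sum_ints(int count, ...) {
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++) {
        total += va_arg(ap, int);
    }
    va_end(ap);
    return total;
}
```

A call such as `sum_ints(3, 1, 2, 3)` is fine, whereas `sum_ints(3, 1)` would read two arguments that were never passed, which is the same kind of bug as calling `foo(42)`.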

Bryan Olivier
  • It really is very much like I depict. The value of the sometimes-not-passed parameter is tested in the control expression of a conditional in the called function (but it is otherwise unused). I agree that it looks like a bug, but I wanted to (1) confirm that, and (2) get some guidance on what might be the correct fix. – John Bollinger Jul 12 '13 at 13:54
  • @JohnBollinger (1) I think we all agree it is a bug. (2) I'm afraid you'll have to fix the bug by understanding what the value of the missing parameter should be. – Bryan Olivier Jul 12 '13 at 14:21

I assume that this code was compiled and run on the Sun SPARC architecture. According to this ancient SPARC web page: "registers %o0-%o5 are used for the first six parameters passed to a procedure."

In your example with a function expecting two parameters, with the second parameter not specified at the call site, it is likely that register %o1 always happened to have a sensible value when the call was made.

If you have access to the original executable and can disassemble the code around the incorrect call site, you might be able to deduce what value %o1 had when the call was made. Or you might try running the original executable on a SPARC emulator, like QEMU. In any case this won't be a trivial task!

markgz
  • Thanks. I also found this: http://www.sparc.com/standards/FpsABI3rd.pdf, which contains a more detailed (and more technical) description of the ABI. It's consistent with what you said. – John Bollinger Jul 12 '13 at 13:46