While reviewing a codebase, I came upon a particular piece of code that triggered a warning regarding an "out of bounds access". After looking at the code, I could not see a way for the reported access to happen - and tried to minimize the code to create a reproducible example. I then checked this example with two commercial static analysers that I have access to - and also with the open-source Frama-C.
All 3 of them see the same "out of bounds" access.
I don't. Let's have a look:
3 extern int checker(int id);
4 extern int checker2(int id);
5
6 int compute(int *q)
7 {
8 int res = 0, status;
9
10 status = checker2(12);
11 if (!status) {
12 status = 1;
13 *q = 2;
14 for(int i=0; i<2 && 0!=status; i++) {
15 if (checker(i)) {
16 res = i;
17 status=checker2(i);
18 }
19 }
20 }
21 if (!status)
22 *q = res;
23 return status;
24 }
25
26 int someFunc(int id)
27 {
28 int p;
29 extern int data[2];
30
31 int status = checker2(132);
32 status |= compute(&p);
33 if (status == 0) {
34 return data[p];
35 } else
36 return -1;
37 }
Please don't try to judge the quality of the code, or why it does things the way it does. This is a hacked, cropped and mutated version of the original, with the sole intent being to reach a small example that demonstrates the issue.
All analysers I have access to report the same thing - that the indexing in the caller at line 34, doing the return data[p]
may read via the invalid index "2". Here's the output from Frama-C - but note that two commercial static analysers provide exactly the same assessment:
$ frama-c -val -main someFunc -rte why.c |& grep warning
...
why.c:34:[value] warning: accessing out of bounds index. assert p < 2;
Let's step the code in reverse, to see how this out of bounds access at line 34 can happen:
- To end up in line 34, the returned
status
from both calls tochecker2
andcompute
should be 0. - For
compute
to return 0 (at line 32 in the caller, line 23 in the callee), it means that we have performed the assignment at line 22 - since it is guarded at line 21 with a check forstatus
being 0. So we wrote in the passed-in pointerq
, whatever was stored in variableres
. This pointer points to the variable used to perform the indexing - the supposed out-of-bounds index. - So, to experience an out of bounds access into the
data
, which is dimensioned to contain exactly two elements, we must have written a value that is neither 0 nor 1 intores
. - We write into
res
via thefor
loop at 14; which will conditionally assign intores
; if it does assign, the value it will write will be one of the two valid indexes 0 or 1 - because those are the values that thefor
loop allows to go through (it is bound withi<2
). - Due to the initialization of
status
at line 12, if we do reach line 12, we will for sure enter the loop at least once. And if we do write intores
, we will write a nice valid index. - What if we don't write into it, though? The "default" setup at line 13 has written a "2" into our target - which is probably what scares the analysers. Can that "2" indeed escape out into the caller?
- Well, it doesn't seem so... if the
status
checks - at either line 11 or at line 21 fail, we will return with a non-zerostatus
; so whatever value we wrote (or didn't, and left uninitialised) into the passed-inq
is irrelevant; the caller will not read that value, due to the check at line 33.
So either I am missing something and there is indeed a scenario that leads to an out of bounds access with index 2 at line 34 (how?) or this is an example of the limits of mainstream formal verification.
Help?