In order to analyze the possible sources of some value, it's a good idea to turn all variables into immutables by introducing a new symbol whenever the original was changed and using the new symbol for all following occurences (the original symbol won't be used after the point where it was re-assigned in the original code).
Consider the following code:
// control flow block 1
int i = 1;
if (some_condition()) {
// control flow block 2
i = 2;
}
// control flow block 3
int j = i;
With the control flow graph
[1]
| \ <- if (some_condition())
| [2]
| / <- join of control flow after the if block ends
[3]
You could write a list of all symbols that are alive (have a value that is used anywhere later in the control flow graph) at the entry and exit point of a block in the control flow graph:
[1] entry: nothing; exit: i
[2] entry: nothing; exit: i
[3] entry: i; exit: i, j (I assume i, j are re-used after the end of this example)
Notice that [2] entry
is empty, since i
is never read and always written within block [2]
. The problem with this representation is, that i
is in the exit list of all blocks but it has different possible values for each block.
So, lets introduce the immutable symbols in pseudo-code:
// control flow block 1
i = 1;
if (some_condition()) {
// control flow block 2
i_1 = 2;
}
// control flow block 3
// join-logic of predecessor [1] and [2]
i_2 = one_of(i, i_1);
j = i_2;
Now every variable is coupled exactly to its first (and only) assignment. Meaning, a dependency graph can be constructed by analyzing the symbols that are involved in an assignment
i -> i_2
i_1 -> i_2
i_2 -> j
Now in case there is any constraint on the allowed value of j
, a static checker could require that all predecessors of j
(namely i_2
, in turn originating from i
and i_1
), satisfy this requirement.
In case of function calls, the dependency graph would contain an edge from every calling argument to the corresponding parameter in the function definition.
Applying this to your example is straight forward if we only focus on the array variable and ignore changes to the array content (I'm not quite sure to what extent a static checker would track the content of individual array items in order to find danger down the road):
Example 1:
void f1(char *s)
{
s[20] = 0;
}
void f2()
{
char a[10];
if (x + y == 2) {
f1(a);
}
}
Transforms to
f1(s)
{
s[20] = 0;
}
f2()
{
a = char[10];
if (x + y == 2) {
call f1(a);
}
}
With dependency graph including the passed arguments via function call
a -> s
So it's immediately clear that a
has to be considered for the static analysis of the safety of s[20]
.
Example 2:
void f1(char *s)
{
char str_arry[30];
s= str_arry;
s[20] = 0;
}
Transforms to
f1(s)
{
// control flow block 1
str_arry = char[30];
s_1 = str_arry;
s_1[20] = 0;
}
With dependency graph
str_arry -> s_1
So it's immediately clear that the only value to be considered for the static analysis of the safety of s_1[20]
is str_arry
.