In a project, I need to access the machine code instructions of a function for debugging purposes.
My first approach (which I have since replaced with a different one) was to convert a pointer to the function into a data pointer using a union, like this:
#include <stdint.h>

void myFunction(void); /* defined in the library */

void exampleCode(void)
{
    volatile union {
        uint8_t ** pptr;
        void (* pFunc)(void);
    } u;
    uint8_t ** copyPptr;
    uint8_t * resultPtr;
#if 1
    u.pptr = (uint8_t **)0x10000000;
#endif
    u.pFunc = &myFunction;
    copyPptr = u.pptr;
    /* The problem is here: */
    resultPtr = copyPptr[109];
    *resultPtr = 0xA5;
}
If I remove the line u.pptr = (uint8_t **)0x10000000, the compiler assumes that copyPptr, assigned a few lines below, has an undefined value (according to the C standard, converting function pointers to data pointers is not allowed).
For this reason, it "optimizes" the last two lines away, which is of course not what I want!
However, the volatile keyword means that the C compiler must not optimize away the read of u.pptr in the line copyPptr = u.pptr.
From my understanding, this also means that the C compiler must assume that this read may yield a "valid" result, so that copyPptr contains some "valid" pointer.
(At least this is how I understand the answers to this question.)
Therefore, the compiler should not optimize away the last two lines.
Is this a compiler bug (GCC for ARM) or is this behavior allowed according to the C standard?
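For comparison, here is a minimal sketch of a conversion that does not go through the union: the bytes of the function pointer are copied into a data pointer with memcpy. The helper name functionAsWords and the empty myFunction stand-in are hypothetical, and the sketch assumes that function pointers and data pointers have the same size and representation, which the C standard does not guarantee.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the library function from the question. */
void myFunction(void) {}

/* Copy the object representation of a function pointer into a data
   pointer. Assumption: both pointer types have the same size and
   representation on the target (not guaranteed by the C standard). */
uint8_t **functionAsWords(void (*pFunc)(void))
{
    uint8_t **words = NULL;
    _Static_assert(sizeof words == sizeof pFunc, "pointer sizes differ");
    memcpy(&words, &pFunc, sizeof words);
    return words;
}
```

Calling functionAsWords(&myFunction) would then play the role of copyPptr. If the function is compiled as Thumb code, bit 0 of its address is set and would have to be cleared before the result is used as a word pointer.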
EDIT
As I wrote above, I already have another solution for my problem.
I asked this question because I'm trying to understand why the union solution did not work, so that I can avoid similar problems in the future.
In the comments, you asked me for my exact use case:
For debugging reasons, I need to access a static variable in a library that is only available as object code.
The disassembly looks like this:
.text
.global myFunction
myFunction:
...
.L1:
.word .theVariableOfInterest
.data
# unfortunately not .global!
.theVariableOfInterest:
.word 0
What I'm doing is this:
I set copyPptr to the address of myFunction. If K is (.L1-myFunction)/4, then copyPptr+K points to .L1 and copyPptr[K] points to .theVariableOfInterest.
So I can access the variable using *(copyPptr[K]).
From the disassembly of the object file, I can see that K is 109.
All this works fine as long as copyPptr contains the correct pointer.
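The index arithmetic described above can be sketched in portable C by replacing the real code bytes with an ordinary array. Everything here except K = 109 and the value 0xA5 is a hypothetical stand-in: fakeCode plays the role of the words starting at myFunction, and hiddenVariable plays the role of .theVariableOfInterest. The division by 4 in the definition of K assumes 4-byte words, as on a 32-bit ARM target.

```c
#include <stdint.h>

/* K = (.L1 - myFunction) / 4, read from the disassembly in the question. */
#define K 109

/* Hypothetical stand-ins: hiddenVariable plays the role of
   .theVariableOfInterest, fakeCode the words starting at myFunction. */
static uint8_t hiddenVariable;
static uint8_t *fakeCode[K + 1];

/* Word k after the base holds the address of the variable. */
static uint8_t *addressFromLiteralPool(uint8_t **base, unsigned k)
{
    return base[k];
}

/* Plant the address at index K, fetch it back and write through it. */
int literalPoolDemo(void)
{
    fakeCode[K] = &hiddenVariable;
    uint8_t *resultPtr = addressFromLiteralPool(fakeCode, K);
    *resultPtr = 0xA5;
    return hiddenVariable;
}
```

In the real use case, base would be the address of myFunction converted to a data pointer, which is exactly the step the question is about.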
EDIT 2
The problem is much simpler to reproduce with gcc -O3:
char a, c;
char * b = &a;

void testCode(void)
{
    char * volatile x;
    c = 'A';
    *x = 'X';
}
It's true that uninitialized variables cause undefined behavior.
However, if I understand the description of volatile in the draft "ISO/IEC 9899:TC3" of the C99 standard correctly, a volatile variable would not be "uninitialized" if I stopped the program in the debugger (using a breakpoint) at the line c = 'A' and copied the value of b to x in the debugger.
On the other hand, I have seen a claim elsewhere that using volatile for local variables already causes undefined behavior in current C++ (not C) versions ...
Now I'm wondering what is true for modern C (not C++) versions.
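For contrast, here is a variant of the reduced example in which x is given a value in the source itself, so no uninitialized read is involved. The function name testCodeInitialized is hypothetical; the globals are the same as above. This is only a sketch of the well-defined case, not a statement about what the standard requires for the uninitialized one.

```c
char a, c;
char * b = &a;

void testCodeInitialized(void)
{
    /* x is volatile as before, but now holds a defined value. */
    char * volatile x = b;
    c = 'A';
    /* x is read once (a volatile access), then 'X' is stored through it. */
    *x = 'X';
}
```

After a call to testCodeInitialized(), a holds 'X' and c holds 'A'.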