Suppose I have a short array v of, say, 8 int64_t. I have an algorithm that needs to access different elements of that array at indices that are not compile-time constants, e.g. something like v[(i + j)/2] += ..., in which i and j are variables not subject to any kind of constant propagation.
Ordinarily I’d keep the array in memory, calculate the array index, load the element at that position from memory, update it, and then store the result back.
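For reference, the ordinary in-memory approach might look roughly like the sketch below. The helper name update_in_memory, the constant V_LEN, and the delta parameter are made up for illustration; they stand in for whatever the real algorithm adds to the element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define V_LEN 8  /* assumed array length, as in the question */

/* Hypothetical helper: adds delta to v[(i + j) / 2], with v living in memory. */
static void update_in_memory(int64_t v[V_LEN], size_t i, size_t j, int64_t delta)
{
    size_t k = (i + j) / 2;   /* index known only at run time */
    assert(k < V_LEN);        /* the caller must keep the index in range */
    v[k] += delta;            /* load, add, store at the computed address */
}
```

With v in memory, the run-time index simply becomes part of the address calculation, which is exactly what a register file does not offer.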
But suppose that, for valid reasons which I won’t go into, I want to keep the full array in registers -- the array is size-limited and fits the register bank.
If I were just reading from, and not writing to, the array, I could use (in ARMv8 NEON) the TBL instruction to perform table lookups. But what about the case of writing?
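To make the read-only case concrete, here is a portable scalar model of what I have in mind, not actual NEON code. It assumes the 8 elements occupy the 64 bytes of a four-register TBL table and that the byte indices 8*k .. 8*k+7 are built at run time; the names tbl_model and read_via_tbl are invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define V_LEN 8  /* assumed array length, as in the question */

/* Scalar model of a byte-wise table lookup over a 64-byte table:
   an in-range index selects a table byte, an out-of-range index yields 0. */
static void tbl_model(const uint8_t table[64], const uint8_t idx[8], uint8_t out[8])
{
    for (int b = 0; b < 8; b++)
        out[b] = (idx[b] < 64) ? table[idx[b]] : 0;
}

/* Hypothetical helper: reads element k of v using only byte lookups. */
static int64_t read_via_tbl(const int64_t v[V_LEN], unsigned k)
{
    uint8_t table[64], idx[8], out[8];
    int64_t result;

    assert(k < V_LEN);
    memcpy(table, v, sizeof table);       /* stands in for the four table registers */
    for (int b = 0; b < 8; b++)
        idx[b] = (uint8_t)(8 * k + b);    /* byte indices of element k */
    tbl_model(table, idx, out);
    memcpy(&result, out, sizeof result);  /* reassemble the 64-bit element */
    return result;
}
```

The model only covers selection of an element by a run-time index; it has no counterpart for scattering a value back into the table, which is the part I am missing.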
All I can think of is self-modifying code: encoding the array index directly into the instructions and then executing them. I know this carries a performance penalty the first time the code runs, but it might be workable if the same code were then executed over and over again.
Other than that, any ideas? Is it even possible? I reviewed the parts of the ARMv8 Architecture Reference Manual that cover the instruction set and its encoding, and so far I’m inclined to say no, but maybe someone knows an obscure instruction or addressing mode that would help here.