This might seem an odd question, but I think it's relevant to certain x86 binary analysis hurdles. Here is the idea I'm pondering: suppose I have a binary that asks a remote server for a jump destination, but the server won't reveal the destination until December and I want to analyze the program's behavior in November. Could I start jumping to random addresses within the program and reliably guess some correct instruction offsets? Could I potentially "break into" the program's control flow structure and "bypass" the issue of that secret jump destination?
A security researcher, Chris Domas, demonstrated a program he calls sandsifter, which fuzzes x86 instructions: https://github.com/xoreaxeaxeax/sandsifter
My thinking was: if invalid instructions are common, so that jumping into the middle of an instruction would usually run into an invalid instruction soon afterwards, could we use that to find the correct offsets?
That's probably far too nebulous to be a valid question, so I decided to simplify it: roughly, do we have any estimate of what portion of all byte sequences are valid instructions? Another way of asking might be: if I generated 1 million random byte sequences, what percentage of them would have a valid execution path? Even that question is complicated by the length of the sequence, since the longer it is, the more likely execution is to run into an invalid instruction eventually.
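To make the length issue concrete, here is a rough sketch of the simplest model I can think of. It assumes each decoded instruction is invalid with some fixed probability `p_invalid`, independently of the others. Both the independence and the example value of `p_invalid` are my own assumptions for illustration, not measured figures, and real x86 decoding almost certainly isn't this tidy. The function names are hypothetical too.

```python
def survival_probability(p_invalid: float, n_instructions: int) -> float:
    """Chance that n consecutive decoded instructions are all valid,
    assuming each one is invalid independently with probability p_invalid."""
    if not 0.0 <= p_invalid <= 1.0:
        raise ValueError("p_invalid must be between 0 and 1")
    if n_instructions < 0:
        raise ValueError("n_instructions must be non-negative")
    return (1.0 - p_invalid) ** n_instructions


def expected_valid_run(p_invalid: float) -> float:
    """Expected number of valid instructions decoded before the first
    invalid one, under the same independence assumption."""
    if not 0.0 < p_invalid <= 1.0:
        raise ValueError("p_invalid must be in (0, 1]")
    return (1.0 - p_invalid) / p_invalid


if __name__ == "__main__":
    p = 0.05  # hypothetical per-instruction invalid rate, NOT a measurement
    for n in (1, 10, 100, 1000):
        print(f"n={n:5d}  P(all valid)={survival_probability(p, n):.6f}")
    print("expected valid run length:", expected_valid_run(p))
```

If something like this model holds, the "percentage with a valid execution path" shrinks geometrically with length, so the number I'm really after is the per-instruction invalid rate rather than a single percentage for whole sequences.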