Harvard architecture computers have separate code and data memories. Does this make them immune to code injection attacks (as data cannot be executed as code)?
6 Answers
They are somewhat more resistant than a Von Neumann architecture, but not entirely immune. Every architecture has a conversion point where data starts being treated as code. In a Von Neumann machine it happens immediately inside the CPU; in a Harvard machine it happens before the memory is reserved and declared for the module (or sometimes even earlier, when a file is being prepared by the build system). This means that on a Harvard architecture a successful code injection attack needs to be more complicated and indirect, but it is not necessarily impossible.
For example: if an attacker can place a file containing malicious code in the machine's storage (e.g. the file system), then trigger, say, a buffer overrun whose corrupted return address redirects into existing (valid, non-malicious) code that loads this malicious file as code, and the architecture allows that file to start executing (e.g. via a self-initialization routine), that would be a successful code injection.
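To make the "conversion point" concrete, here is a minimal, deliberately vulnerable C sketch of the kind of bug that usually starts such an attack (the function name, buffer size, and input handling are illustrative, not taken from any particular program):

```c
#include <stdio.h>
#include <string.h>

/* Deliberately vulnerable: the 64-byte buffer can be overrun by a longer
 * input line, overwriting the saved return address on the stack. Where the
 * attacker can point that return address is exactly where Harvard vs.
 * von Neumann (or NX) matters: at injected bytes, or only at code that
 * already exists (e.g. a routine that loads a file as code). */
void parse_request(const char *input) {
    char buf[64];
    strcpy(buf, input);              /* no bounds check: classic overflow */
    printf("parsed: %s\n", buf);
}

int main(void) {
    char line[512];
    if (fgets(line, sizeof line, stdin))
        parse_request(line);         /* attacker-controlled input */
    return 0;
}
```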

- Here's a thought that could help mitigate insecurity at the conversion point: code can only be loaded into code memory by a separate system (preferably hardware) which validates code signatures. Then the potential insecurity becomes: what avenue(s) might be available for modifying the manner in which code signatures are checked? For instance, if private key(s) were compromised, the code signature validation system would need to be updated with new key(s). This update mechanism would be a potential security weakness. Ideally it would be something external and physical (like a smart card). – Kevin H. Patterson Jun 13 '20 at 02:34
- Also note that code which functions as an interpreter or JIT compiler would still be subject to code injection attacks within the interpreter / JIT VM, even on a Harvard architecture. Then the problem becomes more of sandbox security. – Kevin H. Patterson Jun 13 '20 at 02:36
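As a rough illustration of the signature-check idea in the first comment above, here is a hedged C sketch assuming OpenSSL's SHA-256 for the digest; `load_into_code_memory` and the expected digest are hypothetical placeholders, and a real loader would verify a public-key signature rather than compare against a baked-in hash:

```c
#include <stdio.h>
#include <string.h>
#include <openssl/sha.h>   /* build with -lcrypto */

/* Hypothetical loader-side gate: nothing is copied into code memory unless
 * its digest matches a value provisioned out of band. The expected digest
 * below is a placeholder; a real system would verify a public-key signature
 * rather than compare a bare hash. */
static const unsigned char expected_digest[SHA256_DIGEST_LENGTH] = { 0 /* placeholder */ };

int load_into_code_memory(const unsigned char *image, size_t len) {
    unsigned char digest[SHA256_DIGEST_LENGTH];
    SHA256(image, len, digest);
    if (memcmp(digest, expected_digest, sizeof digest) != 0) {
        fprintf(stderr, "refusing to load: image not recognized\n");
        return -1;                   /* the data never becomes code */
    }
    /* The platform-specific step would go here, e.g. copying into
     * instruction memory or flashing a code bank. */
    return 0;
}

int main(int argc, char **argv) {
    /* Toy driver: treat the first argument (if any) as the image bytes. */
    const unsigned char *img = (const unsigned char *)(argc > 1 ? argv[1] : "");
    return load_into_code_memory(img, strlen((const char *)img)) ? 1 : 0;
}
```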
It partly depends on what you count as a "code injection attack".
Take a SQL injection attack, for example. The SQL query itself would never need to be in an executable part of memory, because it's converted into native code (or interpreted, or whatever terminology you wish to use) by the database engine. However, that SQL could still be broadly regarded as "code".
If you only include an attacker inserting native code to be executed directly by the processor (e.g. via a buffer overrun), and if the process is prevented from copying data into a "code area", then it provides protection against this sort of attack, yes. (I'm reluctant to claim 100% protection even if I can't think of any attack vectors; it sounds foolproof, but security's a tricky business.)
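To illustrate the distinction, here is a small C sketch using SQLite (the `users` table and column names are made up for the example): the concatenated query never needs to sit in executable memory, yet the attacker's input still ends up being interpreted as SQL "code", whereas a bound parameter stays data:

```c
#include <stdio.h>
#include <sqlite3.h>       /* build with -lsqlite3 */

/* Vulnerable: the input is spliced into the SQL text, so a value like
 *   anything' OR '1'='1
 * changes the meaning of the statement, even though no machine code moves. */
void lookup_unsafe(sqlite3 *db, const char *name) {
    char sql[256];
    snprintf(sql, sizeof sql, "SELECT id FROM users WHERE name = '%s';", name);
    sqlite3_exec(db, sql, NULL, NULL, NULL);
}

/* Safer: the input is bound as a parameter and stays data to the SQL engine. */
void lookup_safe(sqlite3 *db, const char *name) {
    sqlite3_stmt *stmt = NULL;
    if (sqlite3_prepare_v2(db, "SELECT id FROM users WHERE name = ?;",
                           -1, &stmt, NULL) == SQLITE_OK) {
        sqlite3_bind_text(stmt, 1, name, -1, SQLITE_TRANSIENT);
        while (sqlite3_step(stmt) == SQLITE_ROW)
            printf("id = %d\n", sqlite3_column_int(stmt, 0));
    }
    sqlite3_finalize(stmt);
}

int main(void) {
    sqlite3 *db;
    if (sqlite3_open(":memory:", &db) != SQLITE_OK) return 1;
    sqlite3_exec(db, "CREATE TABLE users(id INTEGER, name TEXT);"
                     "INSERT INTO users VALUES (1, 'alice');", NULL, NULL, NULL);
    lookup_unsafe(db, "anything' OR '1'='1");   /* matches every row */
    lookup_safe(db, "alice");
    sqlite3_close(db);
    return 0;
}
```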

Apparently, some researchers were able to accomplish a permanent code injection attack on a Harvard architecture device. So maybe it's not as secure as people thought.

- This is where it pays to be careful: the first sentence is "Harvard architecture CPU design is common in the embedded world." If you study the actual design of the Harvard Mark I, you realize quickly that it wasn't the CPU design that was important, it was the strict physical separation between instructions and data--from their respective stores until execution, which doesn't exist in any current architecture that I'm aware of. That paper addresses "Modified Harvard Architecture." – avgvstvs Nov 30 '13 at 02:21
- @avgvstvs: some DSPs or microcontrollers really are Harvard, with separate busses for instruction memory and data. The idea is that they run code from ROM, and have some scratch RAM. Even [ARM9 apparently had separate external busses](https://en.wikipedia.org/wiki/ARM9) (which could be connected in a way that creates a modified Harvard, rather than Harvard). If going between data and instruction memory requires writing to disk and re-reading into the other memory, I think most people would call that Harvard, not modified Harvard. (Although modified Harvard includes more than just split L1) – Peter Cordes Nov 19 '16 at 21:00
- Again, I go back to the fact that the Mark I wasn't simply separated at the bus, it was complete separation back to separate stores for data and instructions. Maybe most people *would* consider a shared ROM still "Harvard," but I don't. The reason that the attack in question was able to succeed is because the ROM was shared. – avgvstvs Nov 21 '16 at 14:17
Most Harvard architecture machines still use a common shared memory space for both data and instructions outside of the core. So, it would still be possible to inject code and get it executed as instructions. In fact, most processors today are internally Harvard architecture, even if they look Von Neumann externally.

- Modern mainstream machines (like x86, ARM/AArch64, etc.) are "modified Harvard" (e.g. [split L1d and L1i caches](https://stackoverflow.com/q/55752699)) to speed up a CPU that, as you say, behaves as von Neumann. https://en.wikipedia.org/wiki/Modified_Harvard_architecture. It's a stretch to say "most processors are Harvard", unless you're including microcontrollers with separate ROM and RAM, which actually are proper Harvard, needing separate instructions to read program memory. x86 even has a coherent I-cache, so the Harvardness is completely invisible. (But most RISCs like AArch64 don't.) – Peter Cordes Dec 11 '20 at 14:18
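For a genuinely Harvard microcontroller, that separation shows up directly in the code. A minimal AVR sketch, assuming avr-gcc and avr-libc (the `greeting` constant is just an example): data placed in flash with `PROGMEM` can only be read back with an explicit program-memory load, and running code has no ordinary store instruction that writes flash:

```c
/* AVR (built with avr-gcc / avr-libc): flash and RAM are separate address
 * spaces, so a constant placed in program memory with PROGMEM cannot be read
 * through an ordinary pointer dereference -- it has to be fetched with an
 * explicit LPM instruction, which avr-libc wraps as pgm_read_byte(). */
#include <avr/pgmspace.h>

static const char greeting[] PROGMEM = "hello";

char first_byte_of_greeting(void) {
    /* return greeting[0];  -- would read the *data* address space, not flash */
    return pgm_read_byte(&greeting[0]);   /* explicit program-memory load */
}
```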
My university had an MS defense recently that discussed this very thing. Unfortunately, I wasn't able to attend. I'm sure if you contacted Mr. Watts he'd be willing to discuss it.

x86 has a segmentation architecture that can do this, and it has been used by some projects to try to stop data from being executed as code (an effort now largely superseded by the NX bit), yet it never came close to stemming the flow of new exploits. Consider the amazing number of remote file inclusions still exploitable in the wild.

- Hardware NX (with full W^X mapping choices) does stop *code injection* attacks. That's why exploits against modern software typically have to be ROP attacks that take over control flow to get sequences of code (gadgets) *already in the binary* executed in a specific order. Even a pure Harvard machine that can only execute code from truly read-only ROM could be vulnerable to that. But it wouldn't be a code injection attack. – Peter Cordes Dec 11 '20 at 14:14
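To tie the NX/W^X point to something concrete, here is a minimal POSIX sketch (assuming Linux on x86-64; the six-byte stub is just `mov eax, 42; ret`): the injected bytes are inert data until an explicit `mprotect()` call converts the page to executable, which is exactly the policy-controlled conversion point discussed above:

```c
/* Under W^X the page holding the bytes is first writable-but-not-executable,
 * and only an explicit mprotect() turns that data into code. With NX and no
 * such call, jumping to injected bytes simply faults. */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void) {
    /* x86-64 machine code for: mov eax, 42 ; ret */
    static const unsigned char stub[] = { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };

    unsigned char *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) return 1;

    memcpy(page, stub, sizeof stub);             /* still just data here */

    /* The deliberate data->code transition; an OS enforcing a strict W^X
     * policy could refuse this, and nothing on the page runs without it. */
    if (mprotect(page, 4096, PROT_READ | PROT_EXEC) != 0) return 1;

    int (*fn)(void) = (int (*)(void))page;
    printf("stub returned %d\n", fn());          /* prints 42 */
    return 0;
}
```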