120

Can you think of any legitimate (smart) uses for runtime code modification (a program modifying its own code at runtime)?

Modern operating systems seem to frown upon programs that do this since this technique has been used by viruses to avoid detection.

All I can think of is some kind of runtime optimization that would remove or add code based on something known at runtime that cannot be known at compile time.

phuclv
  • 37,963
  • 15
  • 156
  • 475
deo
  • 1,009
  • 1
  • 8
  • 15
  • 8
    On modern architectures, it interferes badly with caching and the instruction pipeline: self-modifying code would end up not modifying the cache, so you would need barriers, and this would likely make your code slow. And you cannot modify code which is already in the instruction pipeline. So any optimization based on self-modifying code has to be performed way before the code is run to have a performance impact superior to, say, a runtime check. – Alexandre C. Apr 04 '11 at 07:41
  • 7
    @Alexandre: it's common for self-modifying code to make modifications very rarely (e.g. once or twice) despite being executed an arbitrary number of times, so the one-off cost can be insignificant. – Tony Delroy Apr 04 '11 at 07:43
  • @Tony: yes, otherwise this would defeat the optimization purpose. My point is that you cannot self modify code *arbitrarily close* to the code being run, so this limits what one can do. – Alexandre C. Apr 04 '11 at 07:48
  • 7
    Not sure why this is tagged C or C++, since neither has any mechanism for this. – MSalters Apr 04 '11 at 08:12
  • @MSalters At least gcc lets you put labels in C or C++ code and then take their address (the address of that line of code) with the non-standard operator &&labelName. You can then write a program that uses the address obtained this way to overwrite the code. Maybe this is not the best way to do anything feasible, but it's an example that it can be done using C/C++. – deo Apr 04 '11 at 08:22
  • 1
    @deo: You don't need gcc to do this. Most versions of C allow function pointers; all you have to believe is that the function pointer actually points to the compiled function code, and you can patch things. I think if you do that, your program is illegal, though. – Ira Baxter Apr 04 '11 at 09:13
  • 1
    @Alexandre: for GCC, there's void __builtin___clear_cache (char *begin, char *end) to invalidate the instruction cache: this is intended to provide guaranteed deterministic behaviour when "arbitrarily close" (http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#Other-Builtins). – Tony Delroy Apr 04 '11 at 09:40
  • 4
    @Alexandre: Microsoft Office is known to do exactly that. As a consequence (?) all x86 processors have excellent support for self modifying code. On other processors costly synchronization is necessary which makes the whole thing less attractive. – Mackie Messer Apr 04 '11 at 11:16
  • @Ira Baxter: Function/data pointer casts used to be ill-formed, diagnostic required. It's since been changed to something like "conditionally supported", to support `dlsym` and `GetProcAddress` without warnings. – MSalters Apr 04 '11 at 13:41
  • I wonder, don't self / auto updating and upgrading modify its own code at runtime? – cregox Apr 05 '11 at 00:33
  • @Cawas: not the same thing at all. – NotMe Apr 05 '11 at 22:38
  • @Chris taking a risk in asking the newbish question: what's the difference? – cregox Apr 06 '11 at 03:56
  • 3
    @Cawas: Usually auto updating software will download new assemblies and/or executables and overwrite the existing ones. Then it will restart the software. This is what firefox, adobe, etc do. Self modifying typically means that during runtime code is rewritten in memory by the application due to some parameters and not necessarily persisted back to disk. For example, it might optimize out whole code paths if it can intelligently detect those paths would not be exercised during this particular run in order to speed execution. – NotMe Apr 06 '11 at 15:57
  • Cool, thanks @Chris. Makes perfect sense! :) – cregox Apr 06 '11 at 18:20
  • @MackieMesser how do you know that MS Office uses self modifying code? – phuclv Nov 09 '18 at 08:19

16 Answers

118

There are many valid cases for code modification. Generating code at run time is useful in several situations (a minimal sketch follows this list):

  • Some virtual machines use JIT compilation to improve performance.
  • Generating specialized functions on the fly has long been common in computer graphics. See e.g. Rob Pike, Bart Locanthi and John Reiser, Hardware Software Tradeoffs for Bitmap Graphics on the Blit (1984), or this posting (2006) by Chris Lattner on Apple's use of LLVM for runtime code specialization in their OpenGL stack.
  • In some cases software resorts to a technique known as trampolining, which involves the dynamic creation of code on the stack (or in another place). Examples are GCC's nested functions and the signal mechanism of some Unices.
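
As a minimal, self-contained sketch of the "generate code at run time" idea (not code from any of the projects above; it assumes x86-64 Linux and a memory policy that still allows writable+executable mappings):

```c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void) {
    /* x86-64 machine code for "int f(void) { return 42; }":
     *   B8 2A 00 00 00   mov eax, 42
     *   C3               ret
     */
    unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };

    /* Ask the OS for a page we may both write and execute
     * (systems with a strict W^X policy will refuse this). */
    void *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return 1;

    memcpy(page, code, sizeof code);

    int (*generated)(void) = (int (*)(void))page;
    printf("%d\n", generated());   /* prints 42 */

    munmap(page, 4096);
    return 0;
}
```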

Sometimes code is translated into code at runtime (this is called dynamic binary translation):

  • Emulators like Apple's Rosetta use this technique to speed up emulation. Another example is Transmeta's code morphing software.
  • Sophisticated debuggers and profilers like Valgrind or Pin use it to instrument your code while it is being executed.
  • Before extensions were made to the x86 instruction set, virtualization software like VMWare could not directly run privileged x86 code inside virtual machines. Instead it had to translate any problematic instructions on the fly into more appropriate custom code.

Code modification can be used to work around limitations of the instruction set:

  • There was a time (long ago, I know), when computers had no instructions to return from a subroutine or to indirectly address memory. Self modifying code was the only way to implement subroutines, pointers and arrays.

More cases of code modification:

  • Many debuggers replace instructions to implement breakpoints (see the sketch after this list).
  • Some dynamic linkers modify code at runtime. This article provides some background on the runtime relocation of Windows DLLs, which is effectively a form of code modification.
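
For the breakpoint case, a rough sketch of what a debugger does on Linux/x86 with ptrace (the helper name is invented and error handling is omitted): overwrite the first byte of the target instruction with INT3 (0xCC) and keep the original byte so it can be restored when the breakpoint hits.

```c
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Hypothetical helper: plant a software breakpoint in a traced child process. */
long set_breakpoint(pid_t child, uintptr_t addr) {
    long original = ptrace(PTRACE_PEEKTEXT, child, (void *)addr, NULL);
    long patched  = (original & ~0xFFL) | 0xCC;          /* INT3 over the first byte */
    ptrace(PTRACE_POKETEXT, child, (void *)addr, (void *)patched);
    return original;   /* caller writes this word back to remove the breakpoint */
}
```
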
Mackie Messer
  • 7,118
  • 3
  • 35
  • 40
  • 10
    This list seems to intermix examples of code which modifies itself, and code which modifies other code, like linkers. – AShelly Apr 04 '11 at 15:06
  • 6
    @AShelly: Well, if you consider the dynamic linker/loader to be a part of the code, then it does modify itself. They live in the same address space, so I think that is a valid point of view. – Mackie Messer Apr 04 '11 at 15:23
  • 1
    Ok, the list now distinguishes between programs and system software. I hope this makes sense. In the end any classification is debatable. It all comes down to what exactly you include into the definition of program (or code). – Mackie Messer Apr 05 '11 at 00:25
35

This has been done in computer graphics, specifically in software renderers, for optimization purposes. At runtime the state of many parameters is examined and an optimized version of the rasterizer code is generated (potentially eliminating a lot of conditionals), which allows graphics primitives, e.g. triangles, to be rendered much faster.
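
A toy sketch of the technique (all names are invented; it assumes x86-64 Linux and a writable+executable mapping): prebuilt code fragments are stitched together once, based on the current render state, so the generated per-pixel routine contains no per-pixel branches.

```c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Prebuilt x86-64 fragments (System V ABI: pixel in edi, result in eax). */
static const unsigned char FRAG_LOAD[] = { 0x89, 0xF8 };                   /* mov eax, edi */
static const unsigned char FRAG_TINT[] = { 0x05, 0x10, 0x10, 0x10, 0x00 }; /* add eax, 0x101010 */
static const unsigned char FRAG_RET[]  = { 0xC3 };                         /* ret */

typedef unsigned (*pixel_fn)(unsigned);

/* Concatenate only the fragments the current state needs. */
static pixel_fn build_pixel_fn(int tint_enabled) {
    unsigned char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return NULL;
    size_t n = 0;
    memcpy(p + n, FRAG_LOAD, sizeof FRAG_LOAD); n += sizeof FRAG_LOAD;
    if (tint_enabled) {                          /* decided once here, not per pixel */
        memcpy(p + n, FRAG_TINT, sizeof FRAG_TINT); n += sizeof FRAG_TINT;
    }
    memcpy(p + n, FRAG_RET, sizeof FRAG_RET);
    return (pixel_fn)p;
}

int main(void) {
    pixel_fn shade = build_pixel_fn(1);
    if (shade) printf("%06x\n", shade(0x202020));   /* prints 303030 */
    return 0;
}
```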

trenki
  • 7,133
  • 7
  • 49
  • 61
  • 5
    An interesting read is Michael Abrash's 3-Part Pixomatic articles on DDJ: http://drdobbs.com/architecture-and-design/184405765, http://drdobbs.com/184405807, http://drdobbs.com/184405848. The second link (Part2) talks about the Pixomatic code welder for the pixel pipeline. – typo.pl Apr 04 '11 at 07:55
  • 1
    A very nice article on the topic. From 1984, but still a good read: Rob Pike and Bart Locanthi and John Reiser. [Hardware Software Tradeoffs for Bitmap Graphics on the Blit](http://research.google.com/people/r/index.html). – Mackie Messer Apr 04 '11 at 08:52
  • 5
    Charles Petzold explains one example of this kind in a book titled "Beautiful Code" : http://www.amazon.com/Beautiful-Code-Leading-Programmers-Practice/dp/0596510047 – Nawaz Apr 04 '11 at 10:13
  • 3
    This answer talks about *generating* code, but the question is asking about *modifying* code... – Timwi Apr 04 '11 at 12:44
  • 3
    @Timwi - it did modify code. Rather than handling a big chain of ifs it parsed the shape once and rewrote the renderer so it was set up for the correct type of shape without having to check every time. Interestingly this is now common with OpenCL code - since it's compiled on the fly you can rewrite it for the specific case at runtime – Martin Beckett Apr 04 '11 at 17:48
23

One valid reason is that the instruction set lacks some necessary instruction, which you could build yourself. Example: on x86 there is no way to raise an interrupt whose number is in a register (e.g. raise the interrupt whose number is in ax). Only constant interrupt numbers encoded into the opcode are allowed. With self-modifying code one can emulate this behaviour.

flolo
  • 15,148
  • 4
  • 32
  • 57
  • Fair enough. Is there any use of this technique ? It seems dangerous. – Alexandre C. Apr 04 '11 at 07:47
  • 4
    @Alexandre C.: If I remember right, many runtime libraries (C, Pascal, ...) back in DOS times had a function to perform interrupt calls. Such a function gets the interrupt number as a parameter, so it had to handle an arbitrary number (of course, if the number was constant you could have generated the right code, but that was not guaranteed). And all the libraries implemented it with self-modifying code. – flolo Apr 04 '11 at 11:20
  • You can use a switch case to do it without code modification. The downside is that the output code will be larger – phuclv Nov 14 '13 at 02:36
17

There are many cases:

  • Viruses commonly used self-modifying code to "deobfuscate" their code prior to execution, but that technique can also be useful in frustrating reverse engineering, cracking and unwanted hackery
  • In some cases, there can be a particular point during runtime (e.g. immediately after reading the config file) when it is known that - for the rest of the lifetime of the process - a particular branch will always or never be taken: rather than needlessly checking some variable to determine which way to branch, the branch instruction itself could be modified accordingly (see the sketch at the end of this answer)
    • e.g. It may become known that only one of the possible derived types will be handled, such that virtual dispatch can be replaced with a specific call
    • Having detected which hardware is available, calls to the matching code may be hardcoded
  • Unnecessary code can be replaced with no-op instructions or a jump over it, or have the next bit of code shifted directly into place (easier if using position-independent opcodes)
  • Code written to facilitate its own debugging might inject a trap/signal/interrupt instruction expected by the debugger at a strategic location.
  • Some predicate expressions based on user input might be compiled into native code by a library
  • Inlining some simple operations that aren't visible until runtime (e.g. from dynamically loaded library)...
  • Conditionally adding self-instrumentation/profiling steps
  • Cracks may be implemented as libraries that modify the code that loads them (not "self" modifying exactly, but needs the same techniques and permissions).
  • ...

Some OSs' security models mean self-modifying code can't run without root/admin privileges, making it impractical for general-purpose use.

From Wikipedia:

Application software running under an operating system with strict W^X security cannot execute instructions in pages it is allowed to write to—only the operating system itself is allowed to both write instructions to memory and later execute those instructions.

On such OSes, even programs like the Java VM need root/admin privileges to execute their JIT code. (See http://en.wikipedia.org/wiki/W%5EX for more details)
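
Purely as an illustration of the branch-patching idea above combined with a W^X-friendly workflow (all names are invented; x86-64 Linux assumed): the code is generated into a page that is never writable and executable at the same time, and the one-time patch temporarily flips the protection.

```c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Generated stub (x86-64, System V ABI):
 *   89 F8            mov eax, edi       ; result = x
 *   EB 05            jmp +5             ; skip the optional step (patched to NOPs later)
 *   05 64 00 00 00   add eax, 100       ; the optional step
 *   C3               ret
 */
static const unsigned char stub_bytes[] = {
    0x89, 0xF8, 0xEB, 0x05, 0x05, 0x64, 0x00, 0x00, 0x00, 0xC3
};

typedef int (*int_fn)(int);
static unsigned char *stub;

static int make_stub(void) {
    stub = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (stub == MAP_FAILED) return -1;
    memcpy(stub, stub_bytes, sizeof stub_bytes);
    return mprotect(stub, 4096, PROT_READ | PROT_EXEC);  /* never W and X at once */
}

/* Called once, e.g. after reading the config file: replace the branch with a
 * 2-byte NOP so the optional step runs on every later call without any check. */
static void enable_optional_step(void) {
    mprotect(stub, 4096, PROT_READ | PROT_WRITE);
    stub[2] = 0x66; stub[3] = 0x90;
    mprotect(stub, 4096, PROT_READ | PROT_EXEC);
}

int main(void) {
    if (make_stub() != 0) return 1;
    int_fn f = (int_fn)stub;
    printf("%d\n", f(1));        /* 1: the branch skips the add */
    enable_optional_step();
    printf("%d\n", f(1));        /* 101: the branch has been patched away */
    return 0;
}
```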

Tony Delroy
  • 102,968
  • 15
  • 177
  • 252
  • 2
    You don't need root privileges for self modifying code. Neither does the Java VM. – Mackie Messer Apr 04 '11 at 09:41
  • I didn't know some OS were so strict. But it certainly makes sense in some applications. I do wonder however if executing Java with root privileges does actually increase security... – Mackie Messer Apr 04 '11 at 10:15
  • @Mackie: I think it must decrease it, but maybe it can set some memory permissions then change the effective uid back to some user account...? – Tony Delroy Apr 04 '11 at 10:16
  • Yes, I would expect them to have a fine grained mechanism to grant permissions to accompany the strict security model. – Mackie Messer Apr 04 '11 at 10:41
17

Some compilers used to use it for static variable initialization, avoiding the cost of a conditional for subsequent accesses. In other words they implement "execute this code only once" by overwriting that code with no-ops the first time it's executed.

JoeG
  • 12,994
  • 1
  • 38
  • 63
  • 1
    Very nice, especially if it's avoiding mutex locks/unlocks. – Tony Delroy Apr 04 '11 at 08:14
  • 2
    Really? How does this work for ROM-based code, or for code executed in the write-protected code segment? – Ira Baxter Apr 04 '11 at 09:15
  • 1
    @Ira Baxter: any compiler that emits relocatable code knows that the code segment is writeable, at least during startup. So the statement "some compilers used it" is still possible. – MSalters Apr 04 '11 at 13:44
16

The Synthesis OS basically partially evaluated your program with respect to API calls, and replaced OS code with the results. The main benefit is that lots of error checking went away (because if your program isn't going to ask the OS to do something stupid, it doesn't need to check).

Yes, that's an example of runtime optimization.

Ira Baxter
  • 93,541
  • 22
  • 172
  • 341
  • I fail to see the point. If say a system call is going to be forbidden by the OS, you will likely get an error back that you'll have to check in the code, won't you ? It seems to me that modifying the executable instead of returning an error code is kind of overengineering. – Alexandre C. Apr 04 '11 at 07:46
  • @Alexandre C. : you may be able to eliminate null pointer checks that way. Often it's trivially obvious for the caller that an argument is valid. – MSalters Apr 04 '11 at 07:52
  • @Alexandre: You can read the research at the link. I think they got fairly impressive speedups, and that would be the point :-} – Ira Baxter Apr 04 '11 at 09:17
  • 2
    For relatively trivial and non I/O-bound syscalls, the savings are significant. For example, if you're writing a daemon for Unix, there's a bunch of boilerplate syscalls you do to disconnect stdio, set up various signal handlers, etc. If you know that the parameters of a call are constants, and that the results will always be the same (closing stdin, for example), a lot of the code you execute in the general case is unnecessary. – Mark Bessey Apr 04 '11 at 21:07
  • 1
    If you read the thesis, chapter 8 contains some really impressive numbers about non-trivial real time I/O for data acquisition. Remembering that this is a mid 1980s thesis, and the machine he was running on was 10? Mhz 68000, he was able in software to *capture* CD quality audio data (44,000 samples a second) with plain old software. He claimed that Sun workstations (classic Unix) could only hit about 1/5 of that rate. I'm an old assembly language coder from those days, and this is pretty spectacular. – Ira Baxter Apr 05 '11 at 17:52
9

Many years ago I spent a morning trying to debug some self-modifying code. One instruction changed the target address of the following instruction, i.e., I was computing a branch address. It was written in assembly language and worked perfectly when I stepped through the program one instruction at a time. But when I ran the program it failed. Eventually, I realized that the machine was fetching two instructions from memory and (as the instructions were laid out in memory) the instruction I was modifying had already been fetched, so the machine was executing the unmodified (incorrect) version of the instruction. Of course, when I was debugging, it only executed one instruction at a time.

My point: self-modifying code can be extremely nasty to test/debug and often has hidden assumptions about the behavior of the machine (be it hardware or virtual). Moreover, the system could never share the modified code pages among the various threads/processes executing on today's multi-core machines. That defeats many of the benefits of virtual memory, etc. It would also invalidate branch optimizations done at the hardware level.
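
On today's hardware the corresponding pitfall is the instruction cache: after storing new instruction bytes you generally have to flush/synchronize before executing them. A small sketch (the helper name is invented, and the target is assumed to be writable) using GCC's __builtin___clear_cache, which is effectively a no-op on x86 but required on ARM and many other architectures:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: patch instruction bytes and make sure the CPU will
 * actually fetch the new ones. Assumes 'target' is currently writable
 * (e.g. made so beforehand with mprotect). */
void patch_code(void *target, const void *new_bytes, size_t len) {
    memcpy(target, new_bytes, len);
    __builtin___clear_cache((char *)target, (char *)target + len);
}
```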

(Note - I do not include JIT in the category of self-modifying code. JIT translates from one representation of the code to an alternate representation; it does not modify the code.)

All in all, it's just a bad idea - really neat, really obscure, but really bad.

Of course, if all you have is an 8080 and ~512 bytes of memory, you might have to resort to such practices.

Jay
  • 435
  • 3
  • 5
  • 1
    I don't know, good and bad don't seem to be the right categories to think about this. Of course you should really know what you are doing and also why you are doing it. But the programmer who wrote that code probably didn't want you to see what the program was doing. Of course it's nasty if you have to debug code like that. But that code was very likely meant to be that way. – Mackie Messer Apr 04 '11 at 22:16
  • Modern x86 CPUs have stronger SMC detection than required on paper: [Observing stale instruction fetching on x86 with self-modifying code](https://stackoverflow.com/a/18388700). And on most non-x86 CPUs (like ARM), the instruction cache isn't coherent with the data caches so manual flush/sync is required before newly-stored bytes can be reliably executed as instructions. https://community.arm.com/processors/b/blog/posts/caches-and-self-modifying-code. **Either way, SMC performance is *terrible* on modern CPUs, unless you modify once and run many times.** – Peter Cordes Nov 05 '18 at 02:19
7

From the point of view of an operating system kernel, every just-in-time compiler and runtime linker performs self-modification of program text. A prominent example would be Google's V8 ECMAScript engine.

datenwolf
  • 159,371
  • 13
  • 185
  • 298
5

Another reason for self-modifying code (actually "self-generating" code) is to implement a just-in-time compilation mechanism for performance. E.g. a program that reads an algebraic expression and evaluates it over a range of input parameters may convert the expression into machine code before starting the calculation.

Giuseppe Guerrini
  • 4,274
  • 17
  • 32
5

You know the old chestnut that there is no logical difference between hardware and software...one can also say that there is no logical difference between code and data.

What is self-modifying code? Code that puts values into the execution stream so that they can be interpreted not as data but as commands. Sure, there is the theoretical viewpoint in functional languages that there really is no difference. I'm saying one can do this in a straightforward manner in imperative languages and compilers/interpreters without the presumption of equal status.

What I'm referring to, in the practical sense, is that data can alter program execution paths (in some sense this is extremely obvious). I am thinking of something like a compiler-compiler that creates a table (an array of data) that one traverses while parsing, moving from state to state (and also modifying other variables), just like how a program moves from command to command, modifying variables in the process.

So even in the usual instance of where a compiler creates code space and refers to a fully separate data space (the heap), one can still modify the data to explicitly change the execution path.

Mitch
  • 151
  • 8
  • 13
  • 4
    No logical difference, true. Haven't seen too many self-modifying integrated circuits, though. – Ira Baxter Apr 04 '11 at 21:39
  • @Mitch, IMO changing the exec path has nothing to do with (self-)modification of code. Besides, you confuse data with info. I can't answer your comment [to my reply in LSE](http://english.stackexchange.com/questions/7476/good-movies-for-learning-english/12430#12430) because I'm banned from there, since February, for 3 years (1,000 days), for expressing in meta-LSE my pov that Americans and Brits do not own English. – Gennady Vanin Геннадий Ванин Dec 14 '11 at 04:03
4

I have implemented a program using evolution to create the best algorithm. It used self-modifying code to modify the DNA blueprint.

David
  • 4,786
  • 11
  • 52
  • 80
2

One use case is the EICAR test file which is a legitimate DOS executable COM file for testing antivirus programs.

X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*

It has to use self-modifying code because the executable file must contain only printable/typeable ASCII characters in the range [21h-60h, 7Bh-7Dh], which limits the set of encodable instructions significantly.

The details are explained here


It's also used for floating-point operation dispatching in DOS

Some compilers will emit CD xx, with xx ranging from 0x34 to 0x3B, in place of x87 floating-point instructions. Since CD is the opcode of the int instruction, this invokes interrupts 34h-3Bh, which emulate the instruction in software if the x87 coprocessor is not available. Otherwise the interrupt handler replaces those 2 bytes with 9B Dx so that later executions are handled directly by the x87 without emulation.

What is the protocol for x87 floating point emulation in MS-DOS?


Another usage is to optimize code during runtime

For example, on an architecture without variable bit shifts (or where they're very slow), they can be emulated with only constant shifts when the shift count is known far enough in advance: change the immediate field containing the shift count in the instruction before control reaches it and before the cache line holding it is loaded for execution.
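
A hypothetical sketch of that idea on x86-64 Linux (all names invented): the shift count is baked into the immediate field of a freshly generated shl instruction instead of being read from a register.

```c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

typedef unsigned (*shift_fn)(unsigned);

/* Build "unsigned f(unsigned x) { return x << count; }" at run time:
 *   89 F8          mov eax, edi
 *   C1 E0 <imm8>   shl eax, count
 *   C3             ret
 */
static shift_fn make_shifter(unsigned char count) {
    unsigned char code[] = { 0x89, 0xF8, 0xC1, 0xE0, count, 0xC3 };
    void *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) return NULL;
    memcpy(page, code, sizeof code);
    return (shift_fn)page;
}

int main(void) {
    shift_fn shl3 = make_shifter(3);     /* shift count known only at run time */
    if (shl3) printf("%u\n", shl3(5));   /* prints 40 */
    return 0;
}
```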

It can also be used to change function calls to the most optimized version when there are multiple versions for different (micro-)architectures. For example, you have the same function written in scalar, SSE2, AVX, AVX-512... variants, and depending on the current CPU you choose the best one. This can be done easily using function pointers that are set at startup by a dispatcher, but then you have one more level of indirection, which is bad for the CPU. Some compilers support function multiversioning, which automatically compiles the function into different versions; at load time the linker then fixes the function addresses to the desired ones. But what if you don't have compiler and linker support, and you don't want the indirection either? Just modify the call instructions yourself at startup instead of changing function pointers. Now the calls are all static and can be predicted correctly by the CPU.

phuclv
  • 37,963
  • 15
  • 156
  • 475
1

I run statistical analyses against a continually updated database. My statistical model is written and re-written each time the code is executed to accommodate new data that become available.

David LeBauer
  • 31,011
  • 31
  • 115
  • 189
0

The scenario in which this can be used is a learning program. In response to user input the program learns a new algorithm:

  1. it looks through the existing code base for a similar algorithm
  2. if no similar algorithm is in the code base, the program just adds a new algorithm
  3. if a similar algorithm exists, the program (perhaps with some help from the user) modifies the existing algorithm to be able to serve both the old purpose and the new purpose

There is a question how to do that in Java: What are the possibilities for self-modification of Java code?

phuclv
  • 37,963
  • 15
  • 156
  • 475
Serge Rogatch
  • 13,865
  • 7
  • 86
  • 158
0

The Linux Kernel has Loadable Kernel Modules which do just that.

Emacs also has this ability and I use it all the time.

Anything that supports a dynamic plugin architecture is essentially modifying its code at runtime.

dietbuddha
  • 8,556
  • 1
  • 30
  • 34
  • 5
    hardly. having a dynamically loadable library which isn't always resident has very little to do with self-modifying code. – Dov Apr 04 '11 at 19:40
-1

The best version of this may be Lisp macros. Unlike C macros, which are just a preprocessor, Lisp gives you access to the entire programming language at all times. This is about the most powerful feature in Lisp and does not exist in any other language.

I am by no means an expert, but get one of the Lisp guys talking about it! There is a reason they say that Lisp is the most powerful language around, and the smart folks know that they are probably right.

Zachary K
  • 3,205
  • 1
  • 29
  • 36
  • 2
    Does that actually create self modifying code or is it just a more powerful preprocessor (one that will generate functions)? – Brendan Long Apr 04 '11 at 23:16
  • @Brendan: indeed, but it *is* the right way to do preprocessing. There is no runtime code modification here. – Alexandre C. Apr 05 '11 at 10:34