I'm developing a loader/linker for ELF object files. I use mmap() to map the code section into the process; the idea is to load the code and then patch the relocations in it. But I ran into unstable behaviour when using mmap() with PROT_EXEC|PROT_READ|PROT_WRITE.

The test program is a single instruction, bx lr, whose ARM encoding is the four bytes 1e ff 2f e1. If I load it from a file containing just those four bytes and execute it, everything is fine.

        int fd = open("bx.cod", 0, 0);
        char *p = mmap(0, len, PROT_EXEC, MAP_SHARED, fd, 0);
        close(fd);
        proce = (System_RBProc)p;
        (*proce)();

But if I allocate memory and write the same code (1e ff 2f e1) into it, I sometimes get Illegal instruction:

        char *p = mmap(0, len, PROT_EXEC|PROT_READ|PROT_WRITE,
            MAP_PRIVATE|MAP_ANONYMOUS, 0, 0);
        memcpy(p, "\x1e\xff\x2f\xe1", 4);
        rc = mprotect(p, len, PROT_EXEC);
        proce = (System_RBProc)p;
        (*proce)();

By instability I mean that in the second case Illegal instruction occurs only occasionally. But ...

Rachid K.
  • Are you checking `mmap` and `mprotect` for failure? – Nate Eldredge Dec 21 '20 at 06:51
  • I'm not an ARM expert, but aren't there various cache flushing things you have to do for this sort of "self modifying code"? https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/caches-and-self-modifying-code suggests `__clear_cache()`. If the previous contents of that memory are already in instruction cache, your attempt to overwrite it won't necessarily propagate from data to instruction cache. – Nate Eldredge Dec 21 '20 at 06:59

2 Answers

3

Thanks to Nate Eldredge: this was a cache coherence problem, certainly not a software bug. The solution is to set the PROT_EXEC flag only AFTER the code has been modified. In that case only the final version of the code is copied to the instruction cache.

char *p = mmap(0, len, PROT_READ|PROT_WRITE,
    MAP_PRIVATE|MAP_ANONYMOUS, 0, 0);
memcpy(p, "\x1e\xff\x2f\xe1", 4);
rc = mprotect(p, len, PROT_EXEC);
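
For reference, here is a minimal sketch that combines the two suggestions from the comments with the write-then-protect order used above: it checks mmap() and mprotect() for failure and explicitly synchronizes the instruction cache before the memory becomes executable. The helper name load_code and its map_len out-parameter are hypothetical, not from the original code. It assumes GCC or Clang (for the __builtin___clear_cache builtin), passes -1 as the fd for the anonymous mapping, and keeps PROT_READ so the loaded bytes can still be inspected.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not part of the original post): copy `len` bytes of
 * machine code into a fresh anonymous mapping and return it as read+execute.
 * Returns NULL on failure. Release with munmap(ptr, *map_len). */
static void *load_code(const void *code, size_t len, size_t *map_len)
{
    assert(code != NULL && len > 0 && map_len != NULL);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return NULL;

    /* Round the mapping size up to a whole number of pages. */
    size_t size = (len + (size_t)page - 1) / (size_t)page * (size_t)page;

    /* Step 1: map the memory writable, but not yet executable. */
    char *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* Step 2: write (and, in a real loader, relocate) the code. */
    memcpy(p, code, len);

    /* Step 3: GCC/Clang builtin that makes the freshly written bytes
     * visible to instruction fetch on targets with split I/D caches. */
    __builtin___clear_cache(p, p + len);

    /* Step 4: only now make the region executable, and drop write access. */
    if (mprotect(p, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(p, size);
        return NULL;
    }

    *map_len = size;
    return p;
}
```

On a 32-bit ARM target the returned pointer could then be cast to System_RBProc and called, as in the question, after passing the four bx lr bytes to load_code. Executing those bytes on any other architecture would of course fault, so the sketch itself stops at producing the mapping.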

-1

The following solution demonstrates stable behaviour:

char *p;
posix_memalign(&p, pagesize, len);
memcpy(p, "\x1e\xff\x2f\xe1", 4);
rc = mprotect(p, len, PROT_EXEC);
proce = (System_RBProc)p;
(*proce)();

I can only assume a software bug in the Raspberry Pi 4 mmap() implementation; the posix_memalign() routine seems to be free of such problems.

  • A bug seems really unlikely, `mmap` is a very heavily used and tested kernel function. I strongly suspect the problem is in your code, and the fact that it appears to work with `posix_memalign` is coincidence. – Nate Eldredge Dec 21 '20 at 06:55
  • Yes, it really does seem unlikely. In my experience, 100% of such assumptions turn out to be bugs in my own new code. But I have no other explanation for now. – Dmitri Dagaev Dec 21 '20 at 07:38