I'm developing an online judge system for programming contests. Since C/C++ inline assembly is not allowed in certain programming contests, I would like to add the same restriction to my system.

I would like to let GCC produce an error when compiling a C/C++ program containing inline assembly, so that any program containing inline assembly will be rejected. Is there a way to achieve that?

Note: disabling inline assembly is just for obeying the rules, not for security concerns.

wtz
  • `#define __asm "DO NOT USE INLINE ASSEMBLY"`? – NathanOliver Mar 16 '19 at 01:25
  • That's only one of three keywords: `asm`, `__asm__` and `__asm`. And if this is trying to stop motivated humans from using asm, not accident prevention, someone could use CPP macros to paste together `__as` and `m__` to get `__asm__` to appear in the preprocessor output. I don't think you can defend against specific tokens with CPP. – Peter Cordes Dec 26 '20 at 07:05
  • Also of course `#undef __asm__` would be the simplest way to defeat a `#define` or `-D` option. As klutt's answer shows, future readers must not depend on this for security, only the OP's goal of preventing usage without intentional workarounds to keep honest users honest. If you let people compile C and run it, assume that it can execute arbitrary machine code; security at the system-call level is where you should aim your efforts. – Peter Cordes Feb 09 '23 at 06:27

1 Answer

Is there a way to disable inline assembler in GCC?

Yes, there are a couple of methods. None of them is useful for security; they are only guard-rails that can be worked around intentionally, but they will stop people from accidentally using asm in places where they didn't realize they shouldn't.

Turn off the asm keyword in the compiler (C only)

To do it in the compilation phase, use the option -fno-asm. However, keep in mind that this only disables the asm keyword, and only for C, not C++. It does not affect __asm__ or __asm in either language.

Documentation:

-fno-asm

Do not recognize "asm", "inline" or "typeof" as a keyword, so that code can use these words as identifiers. You can use the keywords "__asm__", "__inline__" and "__typeof__" instead. -ansi implies -fno-asm.

In C++, this switch only affects the "typeof" keyword, since "asm" and "inline" are standard keywords. You may want to use the -fno-gnu-keywords flag instead, which has the same effect. In C99 mode (-std=c99 or -std=gnu99), this switch only affects the "asm" and "typeof" keywords, since "inline" is a standard keyword in ISO C99.

Define the keyword as a macro

You can use the parameters -Dasm=error -D__asm__=error -D__asm=error

Note that -D is a generic mechanism: it defines a macro, much like a #define in the source file. The documentation says:

-D name=definition

The contents of definition are tokenized and processed as if they appeared during translation phase three in a #define directive. In particular, the definition will be truncated by embedded newline characters.

...

So this simply replaces every occurrence of asm, __asm, or __asm__ with error. The replacement happens in the preprocessor phase. You don't have to use error; just pick anything that will not compile.

Use a macro that fires during compilation

To get the error in the compilation phase while still using a macro, as suggested in the comments by zwol, you can use -D'asm(...)=_Static_assert(0,"inline assembly not allowed")'. This also avoids the problem that arises if an identifier called error exists.

Note: This method requires -std=c11 or higher.
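One limitation of the function-like form is worth seeing in code: a function-like macro is only expanded when its name is directly followed by an opening parenthesis, so a qualifier such as volatile between the keyword and the parenthesis lets the statement through. The sketch below is an assumption-laden illustration, not part of the original answer: it uses an in-source #define in place of the -D option, targets the __asm__ spelling so that it also compiles in strict ISO modes, and the function name macro_is_bypassed is hypothetical. The same reasoning applies to the asm spelling.

```c
/* In-source stand-in for -D'__asm__(...)=_Static_assert(0,"...")'.
   Assumption: GCC or Clang, which accept GNU-style inline assembly. */
#define __asm__(...) _Static_assert(0, "inline assembly not allowed")

/* Hypothetical demo function: returns 1 if the guarded statement compiled. */
int macro_is_bypassed(void) {
    /* __asm__ is followed by __volatile__, not by '(', so the
       function-like macro is not invoked and this empty asm
       statement is compiled as real inline assembly. */
    __asm__ __volatile__("" ::: "memory");
    return 1;
}

/* Remove the macro again so code after this point is unaffected. */
#undef __asm__
```

Writing `__asm__("")` directly inside the function, while the macro is still defined, would trigger the static assertion as intended, so the guard still catches the plain form that people are most likely to type by accident.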

Using grep before using gcc

Yet another way that may be the solution to your problem is to just do a grep in the root of the source tree before compiling:

grep -nr "asm"

This will also catch __asm__, but it may give false positives, for instance if you have a string literal, identifier or comment containing the substring "asm". But in your case you could solve this problem by also forbidding any occurrence of that string anywhere in the source code. Just change the rules.
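If changing the rules is not acceptable, the identifier false positives can be reduced by matching whole identifiers instead of substrings. The following is a minimal sketch of that idea, assuming the judge can run a small pre-check over the submitted source text; the function name contains_asm_keyword is hypothetical. It does not parse C, so it still flags the keywords inside comments and string literals, and like grep it cannot see tokens assembled by the preprocessor.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns nonzero if c can appear inside a C identifier. */
static int is_ident_char(unsigned char c) {
    return isalnum(c) || c == '_';
}

/* Hypothetical helper: returns 1 if src contains asm, __asm or __asm__
   as a complete identifier-like token, and 0 otherwise. */
int contains_asm_keyword(const char *src) {
    static const char *const banned[] = { "asm", "__asm", "__asm__" };
    const char *p = src;

    while (*p != '\0') {
        if (!is_ident_char((unsigned char)*p)) {
            p++;                    /* skip punctuation and whitespace */
            continue;
        }
        /* Collect a maximal run of identifier characters. */
        const char *start = p;
        while (is_ident_char((unsigned char)*p)) {
            p++;
        }
        size_t len = (size_t)(p - start);

        /* Compare the whole run against each banned spelling. */
        for (size_t i = 0; i < sizeof banned / sizeof banned[0]; i++) {
            if (strlen(banned[i]) == len &&
                memcmp(start, banned[i], len) == 0) {
                return 1;
            }
        }
    }
    return 0;
}
```

With this check, an identifier such as plasma or my_asm_helper is accepted, while asm("nop") and __asm__ volatile("") are rejected. A similar effect is available from grep itself through its word-match option (-w).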

Possible unexpected problems

Note that disabling assembly can cause other problems. For instance, I could not use stdio.h with this option. It is common for system headers to contain inline assembly code.

A way to cheat the above methods

Aside from the trivial #undef __asm__, it is possible to execute strings as machine code. See this answer for an example: https://stackoverflow.com/a/18477070/6699433

A piece of the code from the link above:

/* our machine code */
char code[] = {0x55,0x48,0x89,0xe5,0x89,0x7d,0xfc,0x48,
0x89,0x75,0xf0,0xb8,0x2a,0x00,0x00,0x00,0xc9,0xc3,0x00};

/* copy code to executable buffer */    
void *buf = mmap (0,sizeof(code),PROT_READ|PROT_WRITE|PROT_EXEC,
            MAP_PRIVATE|MAP_ANON,-1,0);
memcpy (buf, code, sizeof(code));

/* run code */
int i = ((int (*) (void))buf)();

The code above is only intended to give a quick idea of how to circumvent the rules OP has stated. It is not intended to be a good example of how to actually do it in practice. Furthermore, the code is not mine. It is just a short code quote from the link I supplied. If you have ideas about how to improve it, then please comment on 4pie0's original post instead.

Peter Cordes
klutt
  • You don't need `mmap` / `memcpy`; just use `const char code[] = { ... };`. Static storage class with `const` lets it go in `.rodata`, which is linked as part of the Text segment of the executable, so it's executable. – Peter Cordes Mar 16 '19 at 05:11
  • @PeterCordes Maybe so, but it is just to serve as an example on how you can cheat these methods. Not to teach the optimal (if such exists) way to execute binary data. – klutt Mar 16 '19 at 05:13
  • Sigh... your example with machine code in a string reminds me of old days when I did put machine code in the first BASIC line after a REM statement - yes in many languages you can embed machine code. It's evil. ;-) – reichhart Mar 16 '19 at 09:25
  • @Broman: It's worth pointing out that you don't need to `mmap` or `mprotect` to make executable pages. You could still block your way with a wrapper macro for `mmap` that filtered out `PROT_EXEC` from the flags. But the fact that read-only data is already in executable pages blows the doors wide open, allowing no reliable way to detect casting data to a function pointer if people hide it behind a `void*` variable or a `static inline` function. – Peter Cordes Mar 16 '19 at 17:59
  • @PeterCordes I think OP is grateful if I just provide a proof of concept, rather than a tutorial. :) – klutt Mar 16 '19 at 19:04