
I want to build a web-based service that lets the user input some C code, which the server will then compile and run before returning the results. I know, I know, security nightmare. So maybe I could go with chroot or LXC or something like that. There are good posts on Stack Overflow about those. Another option is to use programming contest software.

What I am doing isn't for general programming purposes though. Users will be able to add code to a few stub functions and that is it. They don't need to be able to use pointers or arrays or strings. They shouldn't be able to open/close/read/write files or sockets or shared memory. They can't even create their own functions. They should only be able to do the following:

  • // style comments
  • /* */ style comments
  • declare variables of type int, double, float, int64_t, int32_t, uint64_t, uint32_t
  • for, while, do
  • +, -, *, /, % arithmetic operators ( * as dereference is NOT allowed )
  • ( )
  • +, - unary operators
  • ++, -- operators
  • math functions like sin, cos, abs, fabs, etc.
  • a bunch of API functions that will exist
  • switch, case, break
  • { }
  • if, else, ==, !=
  • =, +=, -=, *=, /=, etc.

Is there a tool I can use to check a given chunk of C code to make sure it contains only those elements?

If I can't find an existing solution, I can use ANTLR or something similar to build a checker myself.
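As a rough illustration of what such a checker could look like, here is a minimal token-level whitelist sketch in Python. It is not an existing tool and not a real C parser. The symbol and keyword sets are transcribed from the list above, with three assumptions added: `%=` is taken to be part of the "etc", the punctuation `;`, `,` and `:` is allowed because statements need it, and `api_get_input` is a hypothetical stand-in for the unnamed API functions. It only filters tokens, so the compiler still has to validate the actual syntax.

```python
import re

# All C keywords, so that unlisted ones (return, goto, sizeof, ...) can be rejected.
C_KEYWORDS = {
    "auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if",
    "inline", "int", "long", "register", "restrict", "return", "short",
    "signed", "sizeof", "static", "struct", "switch", "typedef", "union",
    "unsigned", "void", "volatile", "while", "asm",
}
# Keywords and type names taken from the list in the question.
ALLOWED_KEYWORDS = {
    "int", "double", "float", "int64_t", "int32_t", "uint64_t", "uint32_t",
    "for", "while", "do", "switch", "case", "break", "if", "else",
}
PAREN_KEYWORDS = {"for", "while", "if", "switch"}
# "api_get_input" is a hypothetical placeholder for the real API functions.
ALLOWED_CALLS = {"sin", "cos", "abs", "fabs", "api_get_input"}
# ";", ",", ":" and "%=" are assumptions; the rest comes from the question.
ALLOWED_SYMBOLS = [
    "++", "--", "==", "!=", "+=", "-=", "*=", "/=", "%=",
    "+", "-", "*", "/", "%", "=", "(", ")", "{", "}", ";", ",", ":",
]

TOKEN_RE = re.compile(
    r"(?P<comment>//[^\n]*|/\*.*?\*/)"
    r"|(?P<number>(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)"
    r"|(?P<ident>[A-Za-z_]\w*)"
    r"|(?P<symbol>"
    + "|".join(re.escape(s) for s in sorted(ALLOWED_SYMBOLS, key=len, reverse=True))
    + r")"
    r"|(?P<space>\s+)"
    r"|(?P<bad>.)",
    re.DOTALL,
)


def _ends_operand(token):
    """True if the token can end an operand, so a following '*' is a multiply."""
    if token is None:
        return False
    kind, text = token
    if kind == "number" or text == ")":
        return True
    return kind == "ident" and text not in C_KEYWORDS | ALLOWED_KEYWORDS


def find_violations(source):
    """Return a list of human-readable problems; an empty list means accepted."""
    tokens = [
        (m.lastgroup, m.group())
        for m in TOKEN_RE.finditer(source)
        if m.lastgroup not in ("comment", "space")
    ]
    problems = []
    for i, (kind, text) in enumerate(tokens):
        prev = tokens[i - 1] if i > 0 else None
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if kind == "bad":
            problems.append(f"character not allowed: {text!r}")
        elif kind == "ident":
            if text in C_KEYWORDS and text not in ALLOWED_KEYWORDS:
                problems.append(f"keyword not allowed: {text}")
            elif nxt == ("symbol", "(") and text not in ALLOWED_CALLS | PAREN_KEYWORDS:
                problems.append(f"call or definition of unlisted function: {text}")
        elif text == "*" and not _ends_operand(prev):
            problems.append("'*' used as dereference or pointer declarator")
    return problems
```

Calling `find_violations("int *p = 0;")` reports the pointer declarator, while a snippet that uses only listed elements returns an empty list. The sketch is deliberately conservative: it follows the list literally, so `<`, `return` and `default` are rejected until they are added to the sets, and a valid expression such as `a++ * b` is also refused because `++` is not treated as ending an operand.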

  • You might not want to compile at all: http://stackoverflow.com/questions/584714/is-there-an-interpreter-for-c – msw Apr 24 '13 at 17:47
  • On the other hand, if constrained as your mini-language appears to be, why subject your users to a syntax designed by geeks for geeks 35 years ago which emphasizes hardware-level (non)abstraction and lexical parsimony? That seems cruel. – msw Apr 24 '13 at 17:52
  • I considered that, but those seem to be interested in implementing complete C. Do you know if any of them implement only a small subset? In any case, I don't think it would work for me, as I will need this to be fast. – Ginger Snaps Apr 24 '13 at 17:53
  • I agree that C might not be the optimal choice. Is there another language you would recommend that has an easy way to set these kinds of limits? I don't think C is horrible as mostly the user will just put in mathematical expressions but C gives people the flexibility to do much more if they want to. – Ginger Snaps Apr 24 '13 at 17:55
  • Python has become the _de facto_ beginner's language in university CS courses. It also easily affords access limitations by just ripping out - for example - the `os` module. – msw Apr 24 '13 at 17:58
  • @msw: Does it prevent `[t for t in (1).__class__.__bases__[-1].__subclasses__() if t.__name__ == 'file'][0]('/etc/passwd').read()` from working? – Joker_vD Apr 24 '13 at 18:01
  • @Joker_vD it does if chrooted; thanks for pointing that out. – msw Apr 24 '13 at 18:05
  • Rather than C I'd pick Java or Python or some such. – Hot Licks Apr 24 '13 at 18:19

1 Answer


For a real-world example of a web service that runs user code, check out the Travis CI continuous integration service. Open-source projects use it to run their unit tests in a centralized manner. The Travis process goes a bit like this:

  • Fire up a brand-new VM from a known-good configuration.
  • Load and compile the user code.
  • Run the tests and display results.
  • Discard the VM.

There is a time limit (10 minutes IIRC) to prevent people from running botnets on the system, but other than that, the VMs are fully functional and connected to the Internet. No need for restricted syntax or other artificial limitations.

The idea to keep in mind is that you'll never be able to keep a server secure from the horrors of user code, no matter how much you restrict the user. The alternative is to assume the server is completely ruined the moment it's touched by user code and then trash it, which is what Travis does. VM software usually has snapshot functionality to help with this kind of thing.
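The fresh-environment, run, discard cycle with a time limit can be sketched in a few lines of Python. This is only an illustration of the lifecycle, not how Travis is implemented: the temporary directory stands in for the VM restored from a snapshot and provides no isolation at all, and the function name `run_disposable` and its parameters are made up for this example.

```python
import subprocess
import sys
import tempfile
import time
from pathlib import Path


def run_disposable(files, commands, time_limit_s=600.0):
    """Run commands in a fresh scratch directory, then throw the directory away.

    ASSUMPTION: the scratch directory is a placeholder for a VM restored from
    a known-good snapshot. It does NOT sandbox the code in any way.
    """
    deadline = time.monotonic() + time_limit_s
    outputs = []
    # 1. Start from a brand-new, empty environment.
    with tempfile.TemporaryDirectory() as workdir:
        # 2. Load the user-supplied files.
        for name, text in files.items():
            Path(workdir, name).write_text(text)
        # 3. Run each step (compile, then run) under one overall time limit.
        for argv in commands:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return {"status": "timeout", "outputs": outputs}
            try:
                done = subprocess.run(
                    argv, cwd=workdir, capture_output=True,
                    text=True, timeout=remaining,
                )
            except subprocess.TimeoutExpired:
                return {"status": "timeout", "outputs": outputs}
            outputs.append(done.stdout)
            if done.returncode != 0:
                return {"status": "failed", "outputs": outputs}
        return {"status": "ok", "outputs": outputs}
    # 4. Leaving the "with" block deletes the directory, whatever happened.


if __name__ == "__main__":
    # Demo that uses the Python interpreter itself as the "user program".
    result = run_disposable(
        {"hello.py": "print('hi')"},
        [[sys.executable, "hello.py"]],
        time_limit_s=30,
    )
    print(result["status"], result["outputs"])
```

For C submissions the command list would hold a compiler invocation followed by the resulting binary. The point of the structure is that cleanup never depends on what the user code did: the environment is discarded on success, failure and timeout alike.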

Wander Nauta