How can I stop programs executed by a server from deleting files?

Question

I am working on a system where students submit their coding assignments to a server. The server executes their code as part of a set of tests and then returns the grades based on the test results.

There is a security concern that a student submitting code could "damage" the server by including code that deletes the directory where the system's files are stored. The files are stored in a directory hierarchy, so if the student somehow figured out the path to it, they could easily write their program to delete this directory.

I have since set up permissions so that the server runs under a different user. This user only has access to a single directory that stores all the submissions for that module. A student could still theoretically wipe out this directory, but that is better than deleting the whole system. It is still not ideal, and I am not sure how to improve on it.

The server is written in Python. I have tried using os.chown etc. to change the ownership of directories to a single user. However, I found out that the program needs to run as superuser to change ownership, and the same applies to calls to os.setuid and os.setgid.

Basically, my question is: is there any way to run a program while restricting it to the directory it's running in? This would mean only allowing it to delete files/folders within its working directory. I know there is a chroot command, but that also requires superuser privileges.

It is also not possible to run the program under a different user without sudo privileges. The programs are executed using subprocess.Popen().

I know it's a long shot. I have done a lot of research into permissions, and the current solution, which restricts deletion to the submissions data directory, is as far as I could get. It is still not ideal, however, and the server will not be allowed to run with sudo privileges.

If there are any program attributes that can be set to prevent that program from deleting files, it would be great, but I don't think such a system exists. I may have to resort to "scanning" the submitted code file for dangerous calls and reject the file if there are any such calls in it.

The following directory hierarchy is currently used:
.handin/module-code/data (data is where submissions for each student are stored)

Currently, the data directory is created with a group handin, which allows any member of that group to create directories inside it. With the server running under a user handin, it creates directories/files inside that data directory with user handin and group handin. So the only files the server could delete as user handin are the directories underneath data, rather than the whole .handin directory.

Underneath data, there are directories named after the student ids, e.g. .handin/module-code/data/12345678, and underneath each of those there is a directory with the assignment name. The assignment directory is the directory the code is executed in. Ideally, only that directory could be deleted, or failing that, only the student-id directory.

Maybe there are more elegant ways of solving it, but if you want full encapsulation, you could start a docker container and mount the individual folders to the container, executing it in a completely separated environment and putting the result back there — FloLie, Jun 01 '21 at 13:17
Also as you already use `Popen`, here is a thread about executing child processes as a different user https://stackoverflow.com/questions/1770209/run-child-processes-as-different-user-from-a-long-running-python-process/6037494#6037494 and then have a user with very narrow rights for each student folder — FloLie, Jun 01 '21 at 13:19
And here is a python sandbox tool. I have not tested it, but it could be a good starting point for you https://doc.pypy.org/en/latest/sandbox.html — FloLie, Jun 01 '21 at 13:30
@FloLie, I believe using Popen to start process as different user requires you to be running as root? — Paul Bryan, Jun 01 '21 at 14:05

score 0 · Answer 1 · answered Jun 08 '21 at 12:01

So, I have solved the problem using a separate Docker container for each execution. I created separate images for the different languages the submitted programs are written in. I then created a user in these containers that has just enough permissions to create and delete files inside its home directory, essentially sandboxing it.
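As a rough illustration of that idea (not the exact setup used here), the sketch below builds a `docker run` command that mounts a single submission directory into the container and runs the program as an unprivileged user. The image name, user name, home directory and host path are hypothetical placeholders; the user is assumed to already exist in the image.

```python
import shlex


def build_docker_command(image, host_dir, command,
                         user="student", home="/home/student"):
    """Build a `docker run` argv list that confines a submission to one directory.

    The defaults for user and home are hypothetical placeholders.
    """
    return [
        "docker", "run",
        "--rm",                            # remove the container when it exits
        "--network", "none",               # no network access for untrusted code
        "--read-only",                     # container root filesystem is read-only
        "--user", user,                    # unprivileged user assumed to exist in the image
        "--volume", f"{host_dir}:{home}",  # only this host directory is writable
        "--workdir", home,                 # start in the mounted directory
        image,
        *shlex.split(command),
    ]


if __name__ == "__main__":
    argv = build_docker_command(
        image="handin-python",                              # hypothetical image name
        host_dir="/srv/handin/data/12345678/assignment-1",  # hypothetical path
        command="python3 main.py",
    )
    print(shlex.join(argv))
    # To execute it: subprocess.run(argv, capture_output=True, timeout=30)
```

With this layout, the worst a submission can do is delete the contents of its own mounted assignment directory; nothing else from the host is visible inside the container.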

I then used the epicbox Python module (https://pypi.org/project/epicbox/), which was created by Stepik.org to grade programming assignments in their own containers (very similar to the problem that I needed to solve).

epicbox creates a volume internally so that the Docker containers it runs can share files:

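    import epicbox

    with epicbox.working_directory() as workdir:
        # Pass workdir=workdir to each call so the containers share the volume.
        epicbox.run(...., workdir=workdir)
        epicbox.run(...., workdir=workdir)
        .....
        epicbox.run(...., workdir=workdir)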

Each run() call spins up a new Docker container, but because of the internally created volume, each container can access the files produced by the previous call to run(). The module also lets you upload files from your local machine to the Docker container and then compile/execute them there.
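A minimal sketch of such a compile-then-run sequence is shown below, based on the usage documented in the epicbox README. The profile names, the Docker image and the limits are assumptions for illustration and are not the configuration I actually used; running it requires epicbox and a working Docker daemon.

```python
def make_files(sources):
    """Convert {filename: source text} into the list of dicts that epicbox uploads."""
    return [{"name": name, "content": text.encode("utf-8")}
            for name, text in sources.items()]


def compile_and_run(sources, stdin=""):
    """Compile in one container, then run the resulting binary in a second one."""
    import epicbox  # imported lazily so make_files() works without epicbox installed

    # Profile names and image are assumptions for this sketch.
    epicbox.configure(profiles=[
        epicbox.Profile("gcc_compile", "stepik/epicbox-gcc:6.3.0"),
        epicbox.Profile("gcc_run", "stepik/epicbox-gcc:6.3.0"),
    ])
    limits = {"cputime": 1, "memory": 64}  # CPU seconds and megabytes of RAM

    with epicbox.working_directory() as workdir:
        # First container: upload the sources and compile them.
        compiled = epicbox.run("gcc_compile", "g++ -o main main.cpp",
                               files=make_files(sources), workdir=workdir)
        if compiled["exit_code"] != 0:
            return compiled  # compilation failed; stderr holds the errors
        # Second container: the binary is still there thanks to the shared workdir.
        return epicbox.run("gcc_run", "./main", stdin=stdin,
                           limits=limits, workdir=workdir)


if __name__ == "__main__":
    result = compile_and_run({"main.cpp": "int main() { return 0; }"})
    print(result["exit_code"], result["stdout"], result["stderr"])
```

The dictionary returned by run() contains, among other keys, exit_code, stdout and stderr, which is enough for a grader to compare the output against the expected result.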

I did have to do some "hacks" to configure it to my requirements (changing the default working directory in the Docker container, as epicbox did not provide an easy way to change it). This solution adds a few extra seconds to the execution time compared to executing on the server itself, but for my use case that is an acceptable trade-off for security.
