
I want to use AWS Lambda to run JavaScript code that users submit over the web. My Lambda function would return the code's return value, stderr, and stdout. What problems could I run into in doing this?

Will a malicious user be able to submit code that causes problems for the Lambda function? Do the changes users make, for example to the node environment or filesystem, persist across invocations? Is there any way to prevent that?

Instead of eval() I can write the file(s) to the Lambda filesystem and call:

const userCodeProcess = require('child_process').fork('user_code.js')
// Wrap send so it keeps its `this` binding to the response object
userCodeProcess.on('message', (message) => response.send(message))

2 Answers


A user will be able to submit code to your Lambda function that could cause issues in a few circumstances:

  • Filesystem - Your users can submit code that changes your filesystem, and those changes will persist across invocations that run on a reused container. Not all user code requests will run on the same container, but if two requests are made close together, the same container and the same "scratch" disk space may be accessible to both. One way to potentially prevent this is to restrict modules such as fs to specific directories that you define for each call, so that you can create a random directory in /tmp for every request. I'm not sure whether this is possible. You'd also have to make sure that users couldn't require fs again on their own. A better recommendation is to use something like safe-eval, which I'll discuss more later.
  • Environment Variables - Code run with eval can access the Node environment variables. If you need to make requests to databases, other AWS products, etc., then you'll need to attach a policy to the Lambda function (unless you hardcode credentials), and that information can be accessed by the user code. You'd have to ensure that whatever credentials you provide to the system are not in environment variables, or, if they are, that you are OK with your users accessing them.
  • Throttling - You may run into throttling. At the time of writing, AWS allows only 100 concurrent invocations by default. If you don't limit the rate at which your users submit their own JS, requests may be denied once you hit that limit. I requested a limit increase recently and was able to get mine raised to 3000 concurrent requests without AWS having to evaluate whether their current infrastructure could handle it, so they might grant you the same without any problem.
  • Expenses - I'm sure you've thought about function timeouts, but you'd want to cap function run time so that your users aren't racking up your AWS bill. Additionally, it may seem small, but at a large scale a bunch of 6MB request responses adds up, so data transfer in and out could be a big deal and something that you might want to limit within your code.
  • Limits - Your users may themselves run into limits on the container's /tmp directory, and they might hit the maximum payload return size. Other limit questions can be researched here.
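To make the filesystem, environment variable, and timeout points above more concrete, here is a minimal sketch of running a submission in a child process. It assumes the code arrives as a string; the helper name runUserCode, the 'user-code-' directory prefix, and the default limits are my own placeholders, not anything Lambda provides. It gives each request its own directory under the temp folder, passes an empty environment to the child, and kills the child after a timeout. It is not a sandbox: the child can still read and write anything the Lambda process user can.

```javascript
const { spawnSync } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: runs one submission in its own scratch directory.
function runUserCode(source, timeoutMs = 3000) {
  // Fresh directory per request under the temp folder (/tmp on Lambda)
  const scratchDir = fs.mkdtempSync(path.join(os.tmpdir(), 'user-code-'));
  const scriptPath = path.join(scratchDir, 'user_code.js');
  fs.writeFileSync(scriptPath, source);
  try {
    const child = spawnSync(process.execPath, [scriptPath], {
      cwd: scratchDir,
      env: {},                // do not hand the parent's environment to user code
      timeout: timeoutMs,     // kill submissions that run too long
      maxBuffer: 1024 * 1024, // cap captured output (assumed 1 MB limit)
      encoding: 'utf8',
    });
    return {
      stdout: child.stdout,
      stderr: child.stderr,
      exitCode: child.status, // null if the child was killed by a signal
      signal: child.signal,
    };
  } finally {
    // Remove this request's scratch directory whether or not the run succeeded
    fs.rmSync(scratchDir, { recursive: true, force: true });
  }
}
```

Calling runUserCode("console.log(1 + 1)") returns an object whose stdout holds the printed output. Using spawnSync keeps the sketch short; fork or spawn would be the choice if you need the child to send messages back while it runs.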

From what I understand, you can't break a Lambda function container with code, such that future requests don't run. I believe that a new container will be started if the existing container doesn't exist, or if the existing container is experiencing some issue.

Recommendation

Based on a bit of additional research I just did, you might want to run your JavaScript code within some type of context, so that your users can only access the Node APIs that you want them to, as well as your own system-defined variables. A tool like safe-eval might work. Other people have asked about executing eval in a context, such as here, or perhaps you could prepend 'use strict'; and define your function variables each time you call eval, as described here.
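As a rough illustration of the "run it in a context" idea, here is a sketch that uses Node's built-in vm module rather than safe-eval (a swap on my part, not what the answer above tested). The helper name runInContext and the choice to expose only a console.log stand-in are assumptions. Note that the Node documentation states that vm is not a security mechanism, so treat this as a way to limit accidental access, not as protection against a determined attacker.

```javascript
const vm = require('vm');

// Hypothetical helper: evaluates `code` with only the globals we choose to expose.
function runInContext(code, timeoutMs = 1000) {
  const logs = [];
  const sandbox = {
    // Stand-in console that records output instead of printing it
    console: { log: (...args) => logs.push(args.join(' ')) },
  };
  // `require` and `process` are not defined inside the new context;
  // the timeout option aborts long-running synchronous code by throwing
  const result = vm.runInNewContext(code, sandbox, { timeout: timeoutMs });
  return { result, logs };
}
```

For example, runInContext('typeof require') evaluates to the string 'undefined' inside the context, and an infinite loop throws once the timeout elapses, so the caller should wrap the call in try/catch.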

Additionally, before you finish the function and return your stderr and stdout, you could just clear your /tmp directory. My concern with this is that AWS might use the same container to execute two requests concurrently, in which case you would delete the "scratch" space of another invocation that is still running. I haven't been able to find a decisive answer to this in my research.
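A minimal sketch of the cleanup step, assuming you want to empty a directory such as /tmp while leaving the directory itself in place. The helper name clearDirectory is my own, and fs.rmSync requires a reasonably recent Node.js runtime (14.14 or later).

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: delete everything inside `dir` but keep `dir` itself.
function clearDirectory(dir) {
  for (const entry of fs.readdirSync(dir)) {
    // recursive handles subdirectories; force ignores entries that vanish mid-loop
    fs.rmSync(path.join(dir, entry), { recursive: true, force: true });
  }
}
```

You would call clearDirectory('/tmp') just before returning from the handler, ideally in a finally block so it also runs when the user code throws.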

I'd still say at this point that your /tmp directory is where you're going to run into the most potential issues. If you can figure out a way to limit each request to a single directory without letting it navigate up to parent directories, or can establish whether AWS will reuse a container while it is still serving another invocation, then I think you can probably use Lambda to execute user-provided code without too much concern. Or, you could just prevent your users from accessing your filesystem by excluding it from whatever context method you use.

Additional reading from AWS concerning container reuse is here.

forrestmid

Just bear in mind that besides eval() you can also run arbitrary code by instantiating anonymous functions within the Node.js runtime.

This will give you the added benefit of being able to return values from your input code without having to use a child_process.

// POST https://g0623a10zf.execute-api.ap-southeast-2.amazonaws.com/prod/exec-script
// Content-Type: text/plain
//
// return process.env;

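exports.handler = (event, context, callback) => {
    let err = null, result;
    try {
        // e.g. result = new Function('return process.env;')();
        result = new Function(event.body)();
    } catch (e) {
        console.log(err = e);
    }
    // Report the outcome exactly once; calling both context.succeed and
    // callback would try to finish the same invocation twice.
    // The response body must be a string, so serialize the result.
    callback(null, {
        statusCode: err ? 500 : 200,
        body: err ? String(err) : JSON.stringify(result)
    });
};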

Obviously, you still have all the security concerns that @forrestmid raised (and a few others), but you can skip the bit where you write to a file and then execute it.

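There is also the added benefit that code run this way has no require in scope, because functions created with new Function only see the global scope, which will take away a lot of your headache. For example, every Lambda is executed with the aws-sdk module pre-installed, so malicious users could otherwise look at ways of exploiting any unsecured AWS resources by first doing require('aws-sdk'). Globals such as process are still reachable, though, so this is a reduction in exposure and not a complete barrier.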

Steven de Salas