
It is known that AWS Lambda may reuse previously created handler objects, and it really does so (see the FAQ):

Q: Will AWS Lambda reuse function instances?

To improve performance, AWS Lambda may choose to retain an instance of your function and reuse it to serve a subsequent request, rather than creating a new copy. Your code should not assume that this will always happen.


My question concerns Java concurrency. Suppose I have a handler class such as:

public class MyHandler {
    private Foo foo;
    public void handler(Map<String,String> request, Context context) {
       ...
    }
}

Is it thread-safe to access and work with the instance variable foo here?

In other words: may AWS Lambda use the same object concurrently for different calls?

EDIT: My function is triggered by an event-based source; specifically, it is invoked by an API Gateway method.

EDIT-2: This kind of question arises when you want to implement some kind of connection pool to external resources, so I want to keep the connection to the external resource in an instance variable. It actually works as desired, but I'm worried about concurrency problems.

EDIT-3: More specifically, I'm wondering whether handler instances in AWS Lambda can share a common heap (memory). I add this detail to forestall answers that merely list obvious, well-known facts about thread-safe objects in Java.

Andremoniy
  • IMO "retain an instance of your function and reuse it" is exactly the same as "use same object". And very likely concurrently. – zapl Jun 24 '16 at 15:18
  • @zapl That doesn't necessarily mean concurrently. They could be used like threads inside a Java `ExecutorService`: retained, but not used concurrently. – Andremoniy Jun 24 '16 at 15:19
  • Sure, they could create new handler instances per thread or otherwise ensure that they are not used concurrently, but I don't see that mentioned explicitly anywhere. On the other hand, there are pieces like *"The code must be written in a “stateless” style [...] artifacts may not extend beyond the lifetime of the request"* in the FAQ. – zapl Jun 24 '16 at 15:38
  • @zapl Sure, but again, a stateless style doesn't mean that handlers can be non-thread-safe. – Andremoniy Jun 24 '16 at 15:47
  • What context are you using the function in? Is it for processing streams? Are you using threads within the handler function? If you are processing only one event at a time and there are no threads within the handler function, this code will be thread-safe. – Shibashis Jun 24 '16 at 16:41
  • @Shibashis I use it in the context of a hundred requests per second. Naturally, I wouldn't ask such a question if my context were single-threaded. – Andremoniy Jun 24 '16 at 16:50
  • Your question is not clear. If your function accesses the foo variable in a thread-safe manner internally, you should not be concerned about thread safety because of reuse of the Lambda function. An instance is reused only when it is not processing another request, somewhat like an object pool. Each instance of the function runs in its own underlying container and is separate. – Shibashis Jun 24 '16 at 16:55
  • @Shibashis What exactly is not clear in my question? I asked a precise question: is it thread-safe to use instance variables, given the possibility that handler instances are reused? As for the second part of your comment: if you have links to the exact documentation where these features are described, you can post them as an answer and it will be accepted. – Andremoniy Jun 24 '16 at 17:07
  • It needs to be clear whether your function processes a stream-based event source, and what processing you do inside the function. I could not find any documentation that directly answers your question, but the following link gives good information on concurrency: http://docs.aws.amazon.com/lambda/latest/dg/concurrent-executions.html . You can deduce from the article how functions are instantiated to process requests. Once you have read it and are happy with the documentation, I will make this an answer. – Shibashis Jun 24 '16 at 17:16
  • @Shibashis OK, I've edited my question: my function is triggered by an event-based source; specifically, it is invoked by an API Gateway method. I want to keep a connection to external resources. It works, but I want to know whether it is thread-safe in general. – Andremoniy Jun 24 '16 at 17:20
  • Yes, your code will be thread-safe. But a connection pool is a bad idea, because the Lambda function lifecycle can terminate the instance at any time, so the connections in the pool will not be closed gracefully. Check this thread on the AWS forum: https://forums.aws.amazon.com/thread.jspa?threadID=216000 – Shibashis Jun 24 '16 at 17:44
  • I am talking specifically about database connection pools. HTTP connection pools may still be okay. – Shibashis Jun 24 '16 at 17:49
  • And what do you suggest instead? Not to use lambda? – Andremoniy Jun 24 '16 at 18:12

3 Answers


May AWS lambda use same object concurrently for different calls?

Can instances of handlers of AWS lambda share common heap (memory) or not?

A strong, definite NO. Instances of handlers of AWS Lambda cannot even share files (in /tmp).

An AWS Lambda container may not be reused for two or more concurrently existing invocations of a Lambda function, since that would break the isolation requirement:

Q: How does AWS Lambda isolate my code?

Each AWS Lambda function runs in its own isolated environment, with its own resources and file system view.

The section "How Does AWS Lambda Run My Code? The Container Model" in the official description of how lambda functions work states:

After a Lambda function is executed, AWS Lambda maintains the container for some time in anticipation of another Lambda function invocation. In effect, the service freezes the container after a Lambda function completes, and thaws the container for reuse, if AWS Lambda chooses to reuse the container when the Lambda function is invoked again. This container reuse approach has the following implications:

  • Any declarations in your Lambda function code remains initialized, providing additional optimization when the function is invoked again. For example, if your Lambda function establishes a database connection, instead of reestablishing the connection, the original connection is used in subsequent invocations. You can add logic in your code to check if a connection already exists before creating one.

  • Each container provides some disk space in the /tmp directory. The directory content remains when the container is frozen, providing transient cache that can be used for multiple invocations. You can add extra code to check if the cache has the data that you stored.

  • Background processes or callbacks initiated by your Lambda function that did not complete when the function ended resume if AWS Lambda chooses to reuse the container. You should make sure any background processes or callbacks (in case of Node.js) in your code are complete before the code exits.

As you can see, there is absolutely no warning about race conditions between multiple concurrent invocations of a Lambda function when trying to take advantage of container reuse. The only note is "don't rely on it!".
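The "check if a connection already exists before creating one" advice quoted above could look roughly like the following sketch. It is only an illustration under stated assumptions: `FakeConnection` is a hypothetical stand-in for a real external resource, the class name `ReusingHandler` is made up, and the Lambda `Context` parameter is left out so that the code compiles without the AWS libraries (a deployed handler class would also be public).

```java
import java.util.Collections;
import java.util.Map;

class ReusingHandler {
    /** Hypothetical stand-in for a connection to an external resource. */
    static final class FakeConnection {
        private boolean open = true;
        boolean isOpen() { return open; }
        void close() { open = false; }
        String query(String key) { return "value-for-" + key; }
    }

    // Instance state survives between invocations only if the container is reused.
    private FakeConnection connection;
    private int connectionsCreated = 0;

    // A real Lambda handler would also take a Context argument (omitted here).
    public String handler(Map<String, String> request) {
        // Create the connection lazily; reuse it when the container was kept alive.
        if (connection == null || !connection.isOpen()) {
            connection = new FakeConnection();
            connectionsCreated++;
        }
        return connection.query(request.getOrDefault("key", "none"));
    }

    public int getConnectionsCreated() { return connectionsCreated; }

    public static void main(String[] args) {
        ReusingHandler h = new ReusingHandler();
        h.handler(Collections.singletonMap("key", "a"));
        h.handler(Collections.singletonMap("key", "b"));
        // Two sequential invocations on the same instance share one connection.
        System.out.println(h.getConnectionsCreated()); // prints 1
    }
}
```

The code must still work when `connection` is null on every call, because reuse of the container is not guaranteed.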

Leon

Taking advantage of execution context reuse is definitely a recommended practice when working with AWS Lambda (see AWS Lambda Best Practices). But this does not apply to concurrent executions: for each concurrent execution a new container is created, and thus a new context. In short, if one handler changes a value during concurrent executions, the others won't see the new value.

Farhan Haider

As far as I can see, there are no concurrency issues related to Lambda. Only a single invocation "owns" the container at a time. A second invocation will get another container (or possibly have to wait until the first one becomes free).

BUT I didn't find any guarantee that Java memory visibility issues cannot happen. In that case, changes made by the first invocation could remain invisible to the second one. Or the changes made by the first invocation could be written to RAM after the changes made by the second invocation.

In most cases, visibility issues are handled the same way as concurrency issues. Therefore I would suggest developing Lambda functions to be thread-safe (or synchronized), at least until AWS gives us a guarantee that they do something on their side to flush CPU state to memory after every invocation.
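One way to follow this defensive advice is to publish the shared field safely, for example with a `volatile` field and synchronized lazy initialization. The sketch below is an assumption-laden illustration, not a statement about what the Lambda runtime requires: `Foo` is a hypothetical placeholder for the question's field type, `DefensiveHandler` is a made-up name, and the `Context` parameter is omitted so the code compiles without the AWS libraries.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

class DefensiveHandler {
    /** Hypothetical placeholder for the shared resource from the question. */
    static final class Foo {
        private final AtomicInteger calls = new AtomicInteger();
        int use() { return calls.incrementAndGet(); }
    }

    // volatile: a write by one thread is visible to subsequent reads by others.
    private volatile Foo foo;

    // Double-checked lazy initialization; safe because the field is volatile.
    private Foo getFoo() {
        Foo local = foo;
        if (local == null) {
            synchronized (this) {
                local = foo;
                if (local == null) {
                    local = new Foo();
                    foo = local;
                }
            }
        }
        return local;
    }

    // A real Lambda handler would also take a Context argument (omitted here).
    public int handler(Map<String, String> request) {
        return getFoo().use();
    }

    public static void main(String[] args) throws InterruptedException {
        DefensiveHandler h = new DefensiveHandler();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    h.handler(Collections.emptyMap());
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        // 4 threads x 1000 calls, plus this one.
        System.out.println(h.handler(Collections.emptyMap())); // prints 4001
    }
}
```

The `main` method deliberately calls the handler from several threads, which is harsher than the one-invocation-per-container model described in the answers above; the point is only that the handler stays correct even if that model did not hold.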

30thh