
When creating an AWS Lambda using Python:

  1. Can the Lambda access local imports if the modules are included in the Lambda handler zip; and
  2. What are the implications of including the __pycache__ directories in the zip?

Question 1: Can the runtime access local imports?

The AWS documentation focuses on the Lambda handler Python file containing the handler function itself. This file obviously must be included in the deployment zip. But we don't want one big function, or even several functions or classes in one big file.

If I follow the usual Python approach of creating sub-directories containing modules and packages, place them in the directory containing the handler itself, and include them in the zip that is uploaded to AWS Lambda, will that code be accessible at run-time, and therefore be importable by the Lambda handler?

I'm not referring to the AWS Lambda support for "layers", which is normally used for providing access to packages that are installed in a virtual environment with pip etc. I (think I) understand that support and am not asking about that.

I specifically just want to clarify: can the Lambda handler import from local files, for instance an adjacent `definitions.py` referenced by `from definitions import *` (please, no judgements about "don't import star" :-) ), as long as it's also in the zip?
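To make the scenario concrete, here is a minimal local sketch of the layout I mean. It writes a handler file and an adjacent module into a temporary directory, puts that directory on `sys.path`, and invokes the handler. The file names `lambda_function.py` and `definitions.py`, the `GREETING` constant and the `build_and_invoke` helper are all hypothetical, and the sketch only shows how a plain Python interpreter resolves the import; it assumes, rather than proves, that the Lambda runtime treats the unpacked zip the same way.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_and_invoke():
    """Write a handler plus an adjacent module, then import and call the handler."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical adjacent module holding shared definitions.
        (root / "definitions.py").write_text('GREETING = "hello"\n')
        # Hypothetical handler file that star-imports from the adjacent module.
        (root / "lambda_function.py").write_text(
            "from definitions import *\n"
            "\n"
            "def lambda_handler(event, context):\n"
            "    return {'message': GREETING + ', ' + event['name']}\n"
        )
        # Stand-in for the unpacked zip directory being on the import path.
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("lambda_function")
            return module.lambda_handler({"name": "world"}, None)
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("lambda_function", None)
            sys.modules.pop("definitions", None)


result = build_and_invoke()
print(result)
```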

Question 2: Is it good practice to include the __pycache__ directories?

In the AWS Python Lambda deployment documentation the output of a zip command shows the packages included with the Lambda handler Python file, also including a __pycache__ directory. Additional libraries are also shown but it seems intended that these are collected from the layers.

The AWS documentation shows `__pycache__` being included but makes no mention of it at all.

I believe AWS Lambda run-times are specific versions of Python running on Amazon Linux images. I'm currently forced to develop on Windows :-(. Will this mismatch cause issues for the run-time? Would other considerations come into play, such as ensuring the included bytecode was generated by the same version of Python as the run-time?

Doesn't Python expect bytecode for particular packages in a particular relative location? Presumably the top-level (in the zip) __pycache__ directory should only contain handler routine bytecode?
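For reference, the standard library can report where an interpreter expects the cached bytecode for a given source file, and which interpreter tag and magic number it would use. The sketch below only inspects the local interpreter; the source file name `definitions.py` is hypothetical, and the printed values will differ between Python versions, which is the version-matching concern raised above.

```python
import importlib.util
import sys

# Hypothetical source file sitting next to the handler.
source = "definitions.py"

# PEP 3147 location: a __pycache__ directory beside the source file,
# with the interpreter's cache tag embedded in the file name.
cached = importlib.util.cache_from_source(source)

print("cache tag:   ", sys.implementation.cache_tag)
print("cached path: ", cached)
# The magic number is written at the start of every .pyc file and
# identifies the bytecode format of the interpreter that produced it.
print("magic number:", importlib.util.MAGIC_NUMBER.hex())
```

Each package directory gets its own `__pycache__`, so under this scheme the top-level one would hold bytecode only for the top-level modules.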

Does the Lambda runtime even use the __pycache__ directories? If this is workable and working, given the Lambda may only run once before being destroyed, does that imply that developers should put effort into providing the bytecode in the Lambda zip to improve Lambda performance? In this case, is it necessary to run sufficient tests across the code before zipping it for Lambda, to ensure all the bytecode is generated?
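On that last point, bytecode does not have to be produced by exercising the code: the standard library's `compileall` module compiles every source file under a directory without running it. A minimal sketch, assuming it is run with the same Python version as the target runtime; the `precompile` helper and the file names are hypothetical, and whether shipping the result is worthwhile is a separate question.

```python
import compileall
import tempfile
from pathlib import Path


def precompile(source_dir):
    """Compile every .py file under source_dir; return (success, .pyc names)."""
    # quiet=1 prints errors only; the return value is true if all files compiled.
    ok = compileall.compile_dir(str(source_dir), quiet=1)
    pyc_names = sorted(p.name for p in Path(source_dir).rglob("*.pyc"))
    return bool(ok), pyc_names


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical handler and adjacent module; neither is executed here.
    (root / "lambda_function.py").write_text(
        "def lambda_handler(event, context):\n    return 'ok'\n"
    )
    (root / "definitions.py").write_text("GREETING = 'hello'\n")
    all_compiled, pyc_files = precompile(root)

print(all_compiled, pyc_files)
```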

Context

I have reviewed various articles on creating an AWS Lambda zip using Python, including the AWS documentation, but the content is shallow and simplistic, failing to clarify the precise "reach" of the runtime. It's not even in the AWS Lambda Handler Cookbook.

I do not (yet) have access to a live AWS environment to test this out, and given this broad omission in the on-line documentation and community commentaries (most articles just parrot the AWS documentation anyway, albeit with worse grammar), I thought it would be good to get a clarification on SO.

NeilG
  • Yes, it's a full fledged Python interpreter, so it can access modules. Further, it will use python bytecode if it matches the standard pre-requisites, but it's generally not worth the energy to package it, since it's pretty rare that the time to compile some code will be the pain point in a lambda function. – Anon Coward Feb 09 '23 at 04:26
  • I guess this answer seems to imply a "yes" to question 1, but the answer is a workaround to an original question so the question and answer now no longer match, and also it doesn't answer this question fully: https://stackoverflow.com/a/63958126 – NeilG Feb 09 '23 at 05:18
  • Bytecode is not necessarily stable across Python releases or virtual machines: https://docs.python.org/3.10/glossary.html#term-bytecode / https://peps.python.org/pep-3147/#background / https://docs.python.org/3.10/reference/import.html?highlight=bytecode#cached-bytecode-invalidation. I'm not sure if there's a difference across platforms, but I suppose if you wanted to really tune the performance you could opt to manage your included `__pycache__` directories then ... – NeilG Feb 09 '23 at 05:30
  • One of the disadvantages of AWS Lambda is the start-up times, @AnonCoward. I've seen 20ms improvements quoted on very small code bites. I guess that could reach hundreds of milliseconds for larger code, and if you factor in installed packages it could add up to a human-noticeable response latency? – NeilG Feb 09 '23 at 05:32
  • Wow. DuckDuckGo has indexed your comment **already** @AnonCoward ! https://duckduckgo.com/?q=can+a+linux+python+runtime+use+windows+bytecode&ia=web – NeilG Feb 09 '23 at 05:38
  • Possibly. For applications of lambda where such timing is critical, you're likely going to want to use provisioned concurrency for the lambda, select an appropriate size, and move all possible work outside of the lambda handler so it's pre-cached; then maybe investigate getting the bytecode deployed if it'll help. – Anon Coward Feb 09 '23 at 05:39
  • Thanks @AnonCoward, I guess we're creeping away from the original question now but that's a very interesting thread. I believe it's still related - what you're saying is, for example, create a function in an adjacent python file so it's "not in the handler" (or it can possibly even still be in the handler *file* but not in the handler *function*?), cache its responses with something like `@lru_cache`, and then bingo! the responses are cached (and the code is imported) across Lambda calls (as long as it's kept warm)? – NeilG Feb 09 '23 at 05:50
  • FYI: I do now see this small paragraph in the AWS Lambda documentation which answers question 2, albeit in contradiction to the implications of their other documentation which apparently *does* include `__pycache__` directories. Anyway, they are apparently saying, "No. Don't include `__pycache__`, it might not work.": https://docs.aws.amazon.com/lambda/latest/dg/python-package.html#python-package-pycache – NeilG Aug 14 '23 at 07:34
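
Following the AWS note linked in the last comment, a deployment zip can carry the handler and its local modules while leaving `__pycache__` out. A minimal sketch using only the standard library; the `build_zip` and `demo` helpers, the `helpers` package and the file names are hypothetical, and this is one possible way to build the archive rather than the documented AWS procedure.

```python
import tempfile
import zipfile
from pathlib import Path


def build_zip(source_dir, zip_path):
    """Zip every file under source_dir, skipping __pycache__ dirs and .pyc files."""
    added = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(Path(source_dir).rglob("*")):
            if not path.is_file():
                continue
            relative = path.relative_to(source_dir)
            if "__pycache__" in relative.parts or path.suffix == ".pyc":
                continue
            # Store paths relative to the handler directory, with forward slashes.
            archive.write(path, relative.as_posix())
            added.append(relative.as_posix())
    return added


def demo():
    """Build a throwaway project tree and zip it, returning the archive listing."""
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as out:
        root = Path(src)
        (root / "helpers").mkdir()
        (root / "__pycache__").mkdir()
        (root / "lambda_function.py").write_text("from helpers.definitions import *\n")
        (root / "helpers" / "__init__.py").write_text("")
        (root / "helpers" / "definitions.py").write_text("GREETING = 'hello'\n")
        # Stale bytecode that should not end up in the archive.
        (root / "__pycache__" / "lambda_function.cpython-310.pyc").write_bytes(b"\x00")
        zip_path = Path(out) / "deployment.zip"
        build_zip(root, zip_path)
        with zipfile.ZipFile(zip_path) as archive:
            return sorted(archive.namelist())


names = demo()
print(names)
```

The archive is written outside the source directory so that it never tries to include itself.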

0 Answers