
I'm trying to understand the exact effect of including .pyc files in a Python package deployed to AWS Lambda.

The very few references I was able to find on this say there is no need to include .pyc files with the package. However, I'm seeing a huge performance hit in my Lambda function when I don't include these files.

When I include a library in my package (Jinja2, for example) and omit its .pyc files, `import jinja2` always takes over 3 seconds.

When I do provide .pyc files, the first execution still takes 3 seconds, but after that it goes down to 100-200ms (I guess until the function eventually gets unloaded?).

I found this SO question, which may suggest that AWS Lambda is unable to save its own compiled files. Does this make sense?

My questions are: is there any definitive source of information regarding the use of .pyc files with Python on AWS Lambda? Is there any way to make AWS Lambda save its own .pyc files? Or should I just continue to include them with my package?

danielv
  • Interesting question. Instinctively I'd say that whether `.pyc` files are present should only matter for the initialization of each AWS Lambda container, as afterwards everything should be kept in RAM anyway. So I'm quite puzzled that you see this delay for each request. Have you done any structured tests regarding the performance impact? – Dunedan Aug 13 '17 at 20:24
  • Well, I isolated the import statement, logged its timing, and invoked the function both via the AWS console and remotely, with and without .pyc files. The results are always the same. With .pyc files, only the first invocation after a code update is slow, while without .pyc files, running the function over and over again always takes over 3 seconds (it gives a timeout error with the default Lambda settings). – danielv Aug 13 '17 at 20:30
  • As lambdas are run on a farm of servers and could potentially run on a different node each time, it's expected that the generated .pyc files would be discarded (along with any other file generated by your code) in between executions. So if you notice a performance improvement when you include the compiled Python files, I would suggest you keep including them in your package. – cristianoms Aug 13 '17 at 21:05
  • @cristianoms Yes, I can include them for now, but the zip file gets bigger and upload times get slower, so I wondered what the reason for this behavior was and whether there is a better approach. I couldn't find many mentions of this, except in a few places where the recommendation is to exclude them. – danielv Aug 13 '17 at 21:10
  • I have some questions to help me better understand your situation: How are you generating your .pyc files? Are you using some kind of build tool? How big is that project? Is the size of the generated zip really inflated so dramatically by the compiled files? Maybe you should be running your code on your own EC2 instance rather than on Lambda. – cristianoms Aug 13 '17 at 21:31
  • BTW, here's a good answer to a question about the pyc files existence (thought you'd find it enlightening): https://stackoverflow.com/a/2998544/770557 – cristianoms Aug 13 '17 at 21:36
  • @cristianoms the generation of the pyc doesn't really matter, but for consistency, I use `python -m compileall`. The project is not big but expected to grow. And I know I can use EC2 (and already do for other projects) but my question is about a particular and puzzling issue with AWS Lambda... – danielv Aug 15 '17 at 11:08
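
The measurement described in the comments above (timing the import in isolation and logging it) could be sketched roughly as follows. This is a minimal sketch, not the asker's actual code: `timed_import`, `source_dir_writable`, and `import_report` are hypothetical helper names, and the writability check only approximates whether the interpreter would be able to cache bytecode next to the source.

```python
import importlib
import os
import sys
import time


def timed_import(module_name):
    """Import a module and return (module, seconds spent in the import)."""
    start = time.perf_counter()
    module = importlib.import_module(module_name)
    return module, time.perf_counter() - start


def source_dir_writable(module):
    """Approximate check: can this process write into the module's directory?

    Python stores bytecode in a __pycache__ folder beside the source, so a
    read-only directory means a freshly compiled .pyc cannot be saved there.
    """
    source = getattr(module, "__file__", None)
    if source is None:  # built-in or frozen module, nothing to cache on disk
        return False
    return os.access(os.path.dirname(os.path.abspath(source)), os.W_OK)


def import_report(module_name):
    """Collect the facts worth logging on each invocation."""
    # A module already in sys.modules is served from memory, so a warm
    # process should report a near-zero import time.
    was_loaded = module_name in sys.modules
    module, seconds = timed_import(module_name)
    return {
        "module": module_name,
        "was_cached_in_memory": was_loaded,
        "import_seconds": seconds,
        "dir_writable": source_dir_writable(module),
        "dont_write_bytecode": sys.dont_write_bytecode,
    }
```

Logging `import_report("jinja2")` from the handler on every invocation would show whether a slow import coincides with a fresh process (module not yet in `sys.modules`) and a code directory the process cannot write to.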

1 Answer


I don't think so. A .pyc file contains the bytecode that the Python interpreter compiles the source into and then executes. With this in mind, we can control which files the "Python virtual machine" runs.

I think the best solution is, as you said:

just continue to include them with my package.

Once you do that, the code executor already has your bytecode files.
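
As a rough illustration of "including them with the package", here is a minimal sketch of a build step that does what `python -m compileall` does, using the standard-library `compileall` module. The names `precompile_package` and `demo` are hypothetical. One assumption worth stating: the interpreter that runs the build should have the same minor version as the Lambda runtime, because bytecode compiled by a different Python version will not be used.

```python
import compileall
import pathlib
import sys
import tempfile


def precompile_package(package_dir):
    """Compile every .py under package_dir into __pycache__/*.pyc.

    Returns True if all files compiled without errors.
    """
    ok = compileall.compile_dir(
        str(package_dir),
        force=True,  # recompile even if an up-to-date .pyc already exists
        quiet=1,     # print errors only
    )
    return bool(ok)


def demo():
    """Compile a throwaway one-file package and list the bytecode produced."""
    with tempfile.TemporaryDirectory() as tmp:
        pathlib.Path(tmp, "mod_a.py").write_text("X = 1\n")
        ok = precompile_package(tmp)
        produced = sorted(
            p.relative_to(tmp).as_posix()
            for p in pathlib.Path(tmp).rglob("*.pyc")
        )
        return ok, produced
```

Running `precompile_package` on the staging directory before zipping it would put the `.pyc` files into the archive alongside the sources.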

Ederson Badeca
  • It should probably even be possible to upload just the `.pyc` or `.pyo` files and omit the `.py` files for any dependencies (and potentially even for the handler, as Python will also execute a byte-code file, e.g. `python file.pyc`) – Jann Jul 25 '19 at 21:41
  • I tried uploading only the `.pyc` files without the source, and it fails with **ModuleNotFoundError** – Elliott B Oct 06 '19 at 22:21
  • @ElliottB This comment is old, but I found that the `.pyc` files need to be moved so that they replace the `.py` files. Also, we need all the `.pyc` files, which we can get with `python -m compileall .` – Avinash Thakur May 27 '22 at 12:14
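
The last two comments fit how Python 3 resolves imports: a `.pyc` inside `__pycache__` is only used when the matching `.py` is present, whereas a `module.pyc` sitting where `module.py` used to be can be imported without any source. A minimal sketch of that sourceless layout, assuming the standard-library `compileall` with `legacy=True` (which writes the `.pyc` next to the source instead of into `__pycache__`); `make_sourceless`, `demo`, and the module name `sourceless_demo_mod` are hypothetical:

```python
import compileall
import importlib
import pathlib
import sys
import tempfile


def make_sourceless(package_dir):
    """Compile to legacy-location .pyc files, then delete the .py sources."""
    package_dir = pathlib.Path(package_dir)
    ok = compileall.compile_dir(
        str(package_dir), legacy=True, force=True, quiet=1
    )
    if not ok:
        raise RuntimeError("compilation failed; sources left untouched")
    # Materialize the list first so files are not deleted mid-traversal.
    for source in list(package_dir.rglob("*.py")):
        source.unlink()


def demo():
    """Show that a module still imports after its source has been removed."""
    name = "sourceless_demo_mod"
    with tempfile.TemporaryDirectory() as tmp:
        pathlib.Path(tmp, name + ".py").write_text("ANSWER = 42\n")
        make_sourceless(tmp)
        remaining = sorted(p.name for p in pathlib.Path(tmp).iterdir())
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()  # pick up the newly created files
            module = importlib.import_module(name)
            return module.ANSWER, remaining
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(name, None)
```

Whether this is worth doing on Lambda is a judgment call: it shrinks the archive, but tracebacks lose their source lines and the bytecode is tied to one Python version.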