/tmp, as the name suggests, is only temporary storage. It should not be relied upon for any long-term data storage. The files in /tmp persist for as long as the Lambda execution context is kept alive. How long that is is not defined and varies.
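To illustrate what that means in practice, here is a minimal sketch that treats /tmp purely as a best-effort cache. The file name `example_cache.txt` and the helper `load_or_build` are made up for this example, and it assumes `tempfile.gettempdir()` resolves to /tmp, which is the usual case on Lambda's Linux runtime.

```python
import os
import tempfile

# Hypothetical cache file; gettempdir() is assumed to resolve to /tmp on Lambda.
CACHE_PATH = os.path.join(tempfile.gettempdir(), "example_cache.txt")


def load_or_build(build):
    """Return (data, hit): reuse the file a warm context left behind, else rebuild it."""
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH) as f:
            return f.read(), True  # the execution context was reused
    data = build()
    with open(CACHE_PATH, "w") as f:
        f.write(data)
    return data, False  # cold start, or the file was never written


def handler(event, context):
    # The lambda stands in for any expensive computation or download.
    data, hit = load_or_build(lambda: "expensive result")
    return {"data": data, "served_from_tmp": hit}
```

The function must always be able to rebuild the data, because a cold start gives it an empty /tmp.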
To overcome the size limitation (512 MB by default) and to ensure long-term data storage, there are two common solutions: EFS and S3.
Using EFS is easier (but not cheaper), since it presents a regular filesystem to your function that you can read from and write to directly. You can also reuse the same filesystem across multiple Lambda functions, instances, containers and more.
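As a rough sketch of the EFS approach: once the filesystem is attached, ordinary file I/O is all you need. The mount path `/mnt/data`, the `EFS_MOUNT_PATH` environment variable, the `reports` subdirectory and the helper names are assumptions for this example, not anything Lambda defines. Lambda does require the local mount path to start with `/mnt/`.

```python
import os
from pathlib import Path

# Assumption: the EFS access point is mounted at /mnt/data.
# EFS_MOUNT_PATH is a hypothetical variable used only to make the path configurable.
EFS_ROOT = Path(os.environ.get("EFS_MOUNT_PATH", "/mnt/data"))


def save_report(name, text, root=EFS_ROOT):
    """Write text to <root>/reports/<name> and return the resulting path."""
    path = Path(root) / "reports" / name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text)
    return path


def read_report(name, root=EFS_ROOT):
    """Read back a report written by this or any other client of the filesystem."""
    return (Path(root) / "reports" / name).read_text()
```

Nothing here is EFS-specific, which is the point: the same code works against any mounted directory.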
S3 is cheaper, but it takes some extra work on your side to use it seamlessly in Lambda. Pandas does support S3, but for seamless integration you have to include s3fs in your deployment package (or layer) if it is not already present. S3 can also be accessed from different functions, instances and containers.
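With s3fs available, pandas accepts `s3://` URLs directly, so the S3 variant can look roughly like the sketch below. The helper names and any bucket or key you pass in are placeholders, and the function's role is assumed to have read and write permission on the bucket.

```python
def s3_uri(bucket, key):
    """Build the s3:// URL that pandas resolves through s3fs."""
    return f"s3://{bucket}/{key.lstrip('/')}"


def load_frame(bucket, key):
    # Imported lazily so the module loads even where pandas is not needed;
    # s3fs must be present in the package or layer for s3:// URLs to work.
    import pandas as pd
    return pd.read_csv(s3_uri(bucket, key))


def save_frame(df, bucket, key):
    # Writes the DataFrame as CSV straight to S3, without going through /tmp.
    df.to_csv(s3_uri(bucket, key), index=False)
```

Because the data never touches local disk, the /tmp size limit does not apply, although the DataFrame still has to fit in the function's memory.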