I need help packaging pyarrow as a Lambda layer for my Lambda function. I am trying to read and write Parquet files, and I am getting the error below: "errorMessage": "Unable to find a usable engine; tried using: 'pyarrow', 'fastparquet'.\npyarrow or fastparquet is required for parquet support".

I tried creating the layer myself by installing pyarrow on my EC2 instance with the command below: pip3 install pandas pyarrow -t build/python/lib/python3.7/site-packages/ --system

However, the resulting zip file is larger than 300 MB, so I cannot use it as a Lambda layer.

Any suggestions or solutions?

Thanks,

Soumya
  • Try the following post: https://stackoverflow.com/questions/47984322/read-parquet-file-stored-in-s3-with-aws-lambda-python-3/62143576#62143576 I've also provided an alternative answer through the AWS SAM CLI. – Miguel Trejo Jun 02 '20 at 01:56
  • I posted a solution to this question here: https://stackoverflow.com/a/72710488/5561737 – Swakeert Jain Jun 23 '22 at 05:11

1 Answer

First, all the packages need to be in a directory called python, nothing more, nothing less; you can then zip the whole python directory and upload it to Lambda. Second, pandas and pyarrow are pretty big. I did use them both in one Lambda function without any issue, but I'm afraid you may need to split the two packages into two separate layers to make it work. Do NOT use fastparquet: it is too big and exceeds Lambda's 250 MB limit.
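To check the size before uploading, here is a minimal sketch of the packaging step described above. It is not part of any AWS tooling: the function names build_layer_zip and fits_in_lambda are hypothetical, and it assumes the packages were already installed into a local folder (for example with pip3 install pyarrow -t build/). It places everything under a top-level python/ folder in the zip and reports the uncompressed size so it can be compared with the 250 MB limit.

```python
import zipfile
from pathlib import Path

# Lambda's 250 MB limit, expressed in bytes. It applies to the unzipped size.
LAMBDA_UNZIPPED_LIMIT_BYTES = 250 * 1024 * 1024


def build_layer_zip(packages_dir, zip_path):
    """Zip the files in packages_dir under a top-level 'python/' folder.

    Returns the total uncompressed size, in bytes, of the files added.
    Hypothetical helper: zip_path should live outside packages_dir.
    """
    packages_dir = Path(packages_dir)
    total_bytes = 0
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for file_path in sorted(packages_dir.rglob("*")):
            if not file_path.is_file():
                continue
            relative = file_path.relative_to(packages_dir)
            # Optional trimming: bytecode caches are not needed for imports to work.
            if "__pycache__" in relative.parts:
                continue
            # Every entry is stored as python/<package>/..., as the layer expects.
            zf.write(file_path, (Path("python") / relative).as_posix())
            total_bytes += file_path.stat().st_size
    return total_bytes


def fits_in_lambda(total_bytes, limit=LAMBDA_UNZIPPED_LIMIT_BYTES):
    """Return True if the uncompressed size is within the given limit."""
    return total_bytes <= limit
```

A possible usage, assuming the packages are in build/: call size = build_layer_zip("build", "layer.zip") and then check fits_in_lambda(size). If the check fails, that is the point where splitting pandas and pyarrow into separate folders and zipping each one on its own may be worth trying.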

Jason Lei