
I can't use Python libraries such as paramiko and psycopg2 in my AWS Glue job. I uploaded their .whl files to S3 and linked them to the job with the `--additional-python-modules` flag, but I still get a "no module found" error.

One workaround I considered is to open the private subnet to pypi.org, so that no developer on my team has to build a .whl or zip file to use external Python libraries. My question is: is every library actually downloaded from pypi.org?

Other details: my Glue job uses a connection that includes a private subnet. If I don't add this connection, I don't get the "module not found" error, but then I can't connect to my RDB instance.

I can't use a NAT gateway because our security team won't allow it.

Is there any way to use these libraries in the Glue code?
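For reference, the setup described above can be sketched as a small helper that builds the job arguments. This is only an illustration under assumptions: the bucket and wheel names are hypothetical, `build_glue_module_args` is not part of any AWS library, and passing `--no-index` through `--python-modules-installer-option` assumes a Glue version that supports that parameter. One thing that may be worth checking: pip also resolves dependencies, and paramiko depends on cryptography, bcrypt and pynacl, so those wheels would probably have to be listed as well when PyPI is unreachable.

```python
def build_glue_module_args(wheel_uris, pip_options=None):
    """Build Glue job arguments that install wheels stored in S3.

    wheel_uris  -- list of S3 URIs pointing at .whl files (hypothetical paths)
    pip_options -- optional string of extra pip options (an assumption: it is
                   forwarded via --python-modules-installer-option)
    """
    if not wheel_uris:
        raise ValueError("at least one wheel URI is required")
    for uri in wheel_uris:
        if not (uri.startswith("s3://") and uri.endswith(".whl")):
            raise ValueError(f"not an S3 wheel URI: {uri}")

    # Glue expects a single comma-separated string, not a list.
    args = {"--additional-python-modules": ",".join(wheel_uris)}
    if pip_options:
        args["--python-modules-installer-option"] = pip_options
    return args


if __name__ == "__main__":
    # Hypothetical bucket and file names, for illustration only.
    wheels = [
        "s3://my-bucket/wheels/paramiko-2.9.1-py2.py3-none-any.whl",
        "s3://my-bucket/wheels/psycopg2_binary-2.8.6-cp37-cp37m-manylinux1_x86_64.whl",
    ]
    for key, value in build_glue_module_args(wheels, pip_options="--no-index").items():
        print(key, "=", value)
```

The resulting dictionary could be used as the job's default arguments. Listing every dependency wheel explicitly is more tedious, but it keeps the install independent of any external index.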

Nikhil Padole
  • No, `pip` can install packages from other sources too; PyPI is just the one used most: https://packaging.python.org/en/latest/tutorials/installing-packages/#use-pip-for-installing – Matiiss Dec 30 '21 at 13:59
  • Is there any other solution then? – Nikhil Padole Dec 30 '21 at 14:00
  • Other solution as opposed to what? Why can't you use `pip` with other sources? (For how, you will need to read the link I gave, as I likely won't be able to help on this topic more than I already have.) – Matiiss Dec 30 '21 at 14:02
  • The solution I was thinking about was to whitelist pypi.org in the Glue connection's subnet, as I thought all libraries were downloaded from pypi.org. – Nikhil Padole Dec 30 '21 at 14:04
  • Read about [installing from other indexes](https://packaging.python.org/en/latest/tutorials/installing-packages/#installing-from-other-indexes) and [from VCS](https://packaging.python.org/en/latest/tutorials/installing-packages/#installing-from-vcs); maybe that helps. – Matiiss Dec 30 '21 at 14:06
  • Try adding psycopg2-binary==2.8.6 to the additional-python-modules setting in AWS Glue, then import psycopg2 in your code. You don't have to upload any .whl files. Not sure about paramiko. – Yuva Dec 31 '21 at 09:07
  • @Yuva This won't work, as the subnet used by my Glue job is private and doesn't allow internet traffic, which is required to download the psycopg2 library. – Nikhil Padole Jan 01 '22 at 11:20
  • OK, we had another approach that worked as well. Run pip install psycopg2 into a specific folder on a Linux-based EC2 instance or any other machine, zip the folder where the library is installed, and upload it to AWS Glue as a zip file along with your script. – Yuva Jan 01 '22 at 14:59
  • You may also refer to this SO link for more info / solutions: https://stackoverflow.com/questions/36103034/importerror-no-module-named-psycopg2-psycopg/58305654#58305654 – Yuva Jan 01 '22 at 15:11
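
The folder-and-zip approach from the comments above could be scripted roughly as follows. This is a minimal sketch, not a tested Glue recipe: the function names and paths are hypothetical, it assumes `pip` is available for the running interpreter, and it should be run on a Linux machine as the comment suggests, since a package with compiled parts such as psycopg2 is built for the platform it is installed on.

```python
import subprocess
import sys
import zipfile
from pathlib import Path


def install_to_folder(packages, target_dir):
    """Install packages into target_dir with `pip install --target`."""
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "--target", str(target_dir), *packages],
        check=True,  # raise CalledProcessError if pip fails
    )


def zip_folder(source_dir, zip_path):
    """Zip the contents of source_dir, storing paths relative to it."""
    source = Path(source_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Relative names keep packages at the top level of the archive.
                archive.write(path, path.relative_to(source).as_posix())
    return Path(zip_path)


if __name__ == "__main__":
    # Hypothetical folder and archive names; needs network access to an index.
    install_to_folder(["psycopg2-binary==2.8.6"], "glue_libs")
    print(zip_folder("glue_libs", "glue_libs.zip"))
```

The zip file is written outside the source folder so that the archive does not end up including itself. Uploading it to S3 and attaching it to the job is left out here.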

0 Answers