Background information:
- I am connecting to an AWS RDS database
- The code resides within a docker container
- I am running the docker container locally and on AWS Batch
- For connecting locally, I have my own IAM user credentials
- For connecting on AWS Batch, the pod assumes an IAM role
I have been using three different ways to connect to the database both locally (from within docker container) and on AWS batch (from within the pod).
psql
binary:
export AWS_PROFILE="{YOUR_PROFILE}"
export RDSHOST="redacted.redacted.us-east-1.rds.amazonaws.com" export PGPASSWORD="$(aws rds generate-db-auth-token --hostname $RDSHOST --port 5432 --region us-east-1 --username my user)"
psql "host=$RDSHOST port=5432 dbname=mydb user=myuser password=$PGPASSWORD"
This works both from my machine and on AWS Batch. Postgres terminal once connected with psql
:
psql (12.14 (Ubuntu 12.14-0ubuntu0.20.04.1), server 12.11)
SSL connection (protocol: TLSv1.2, cipher: <redacted>-SHA384, bits: 256, compression: off)
psycopg2
python script
This works both from my machine and on AWS Batch.
SQLAlchemy
python script
This only works on my machine, and fails on AWS Batch: sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) FATAL: PAM authentication failed for user "my_user"
Find below the python script I am using to demonstrate this behaviour:
import boto3
import os
import psycopg2
from sqlalchemy import create_engine, text
from sqlalchemy.orm import Session
os.environ["AWS_DEFAULT_REGION"] = "us-east-1"
# config
ENDPOINT = "redacted.redacted.us-east-1.rds.amazonaws.com"
PORT = 5432
USER = "myuser"
REGION = "us-east-1"
DBNAME = "mydb"
# generate the token
session = boto3.Session()
client = session.client('rds')
token = client.generate_db_auth_token(DBHostname=ENDPOINT, Port=PORT, DBUsername=USER, Region=REGION)
# Try psycopg2
conn = psycopg2.connect(host=ENDPOINT, port=PORT, database=DBNAME, user=USER, password=token)
cur = conn.cursor()
cur.execute("""SELECT * FROM my_table""")
query_results_psycopg2 = cur.fetchall()
# Try SQLAlchemy
engine_url = f"postgresql+psycopg2://{USER}:{token}@{ENDPOINT}:{PORT}/{DBNAME}?sslmode=require"
db = create_engine(engine_url,
connect_args={
'sslmode': 'require',
})
with Session(db) as session:
query_results_sqlalchemy = session.execute(statement=text("SELECT * FROM my_table")).fetchall()
# this evaluates to true only locally
assert query_results_psycopg2 == query_results_sqlalchemy
This is the output of pip freeze
:
aiobotocore==2.4.2
aiohttp==3.8.4
aiohttp-retry==2.8.3
aioitertools==0.11.0
aiosignal==1.3.1
amqp==5.1.1
antlr4-python3-runtime==4.9.3
appdirs==1.4.4
async-timeout==4.0.2
asyncssh==2.13.1
atpublic==3.1.1
attrs==22.2.0
audioread==2.1.8
awscli==1.18.223
billiard==3.6.4.0
boto3==1.24.59
botocore==1.27.59
celery==5.2.7
certifi==2022.12.7
cffi==1.15.1
charset-normalizer==3.1.0
click==8.1.3
click-didyoumean==0.3.0
click-plugins==1.1.1
click-repl==0.2.0
colorama==0.4.3
configobj==5.0.8
cryptography==40.0.1
cycler==0.11.0
Cython==0.29.23
dataclasses-json==0.5.7
decorator==5.1.1
Deprecated==1.2.13
dictdiffer==0.9.0
diskcache==5.4.0
distro==1.8.0
docutils==0.15.2
dpath==2.1.5
dulwich==0.21.3
dvc==2.50.0
dvc-data==0.44.1
dvc-http==2.30.2
dvc-objects==0.21.1
dvc-render==0.3.1
dvc-s3==2.21.0
dvc-studio-client==0.6.1
dvc-task==0.2.0
exceptiongroup==1.1.1
filelock==3.10.7
flatten-dict==0.4.2
flufl.lock==7.1.1
frozenlist==1.3.3
fsspec==2023.3.0
funcy==2.0
gitdb==4.0.10
GitPython==3.1.31
grandalf==0.8
greenlet==2.0.2
hmmlearn==0.2.6
hydra-core==1.3.2
idna==3.4
importlib-resources==5.12.0
iniconfig==2.0.0
iterative-telemetry==0.0.8
jmespath==0.10.0
joblib==1.2.0
jsonpatch==1.32
jsonpointer==2.3
jsonschema==4.17.3
kiwisolver==1.4.4
kombu==5.2.4
librosa==0.7.0
llvmlite==0.31.0
madmom==0.16.1
markdown-it-py==2.2.0
marshmallow==3.19.0
marshmallow-enum==1.5.1
marshmallow-sqlalchemy==0.29.0
matplotlib==3.3.2
mdurl==0.1.2
mido==1.2.10
multidict==6.0.4
mypy-extensions==1.0.0
nanotime==0.5.2
networkx==3.0
nptyping==1.4.4
numba==0.48.0
numpy==1.19.2
omegaconf==2.3.0
orjson==3.8.9
packaging==23.0
pandas==1.4.4
pathlib==1.0.1
pathspec==0.11.1
Pillow==9.4.0
pkgutil_resolve_name==1.3.10
platformdirs==3.2.0
pluggy==1.0.0
prompt-toolkit==3.0.38
psutil==5.9.4
psycopg2==2.9.5
py==1.11.0
pyasn1==0.4.8
pycparser==2.21
pydantic==1.10.5
pydot==1.4.2
pydub==0.25.1
pygit2==1.11.1
PyGithub==1.58.1
Pygments==2.14.0
pygtrie==2.5.0
PyJWT==2.6.0
PyNaCl==1.5.0
pyparsing==3.0.9
pyrsistent==0.19.3
pytest==7.2.2
python-dateutil==2.8.2
pytz==2023.3
PyYAML==5.3.1
requests==2.28.2
resampy==0.3.1
rich==13.3.3
rsa==4.5
ruamel.yaml==0.17.21
ruamel.yaml.clib==0.2.7
s3fs==2023.3.0
s3transfer==0.6.0
scikit-learn==0.23.2
scipy==1.5.2
scmrepo==0.1.17
shortuuid==1.0.11
shtab==1.5.8
six==1.16.0
smmap==5.0.0
SoundFile==0.10.3.post1
sox==1.4.1
SQLAlchemy==2.0.6
sqltrie==0.3.0
tabulate==0.9.0
threadpoolctl==3.1.0
tomli==2.0.1
tomlkit==0.11.7
tqdm==4.65.0
typing-inspect==0.8.0
typing_extensions==4.5.0
typish==1.9.3
urllib3==1.26.15
vine==5.0.0
voluptuous==0.13.1
warlock==2.0.1
wcwidth==0.2.6
wrapt==1.15.0
yarl==1.8.2
zc.lockfile==3.0.post1
zipp==3.15.0
psql
version on Docker:
root@94af66abe2d2:/usr/src/app# psql --version
psql (PostgreSQL) 13.9 (Debian 13.9-0+deb11u1)
I am trying to understand why connection is successful locally using both psycopg2
and sqlalchemy
, and why sqlalchemy
produces the authentication error on AWS Batch.
They both share the same token, they both live in the same docker environment. Could it be a problem in the sqlalchemy
connection string? How come then it works just fine locally?
Any ideas would be much appreciated!