0

Background information:

  • I am connecting to an AWS RDS database
  • The code resides within a docker container
  • I am running the docker container locally and on AWS Batch
  • For connecting locally, I have my own IAM user credentials
  • For connecting on AWS Batch, the pod assumes an IAM role

I have been using three different ways to connect to the database both locally (from within docker container) and on AWS batch (from within the pod).

psql binary:

export AWS_PROFILE="{YOUR_PROFILE}"
export RDSHOST="redacted.redacted.us-east-1.rds.amazonaws.com" export PGPASSWORD="$(aws rds generate-db-auth-token --hostname $RDSHOST --port 5432 --region us-east-1 --username my user)"
psql "host=$RDSHOST port=5432 dbname=mydb user=myuser password=$PGPASSWORD"

This works both from my machine and on AWS Batch. Postgres terminal once connected with psql:

psql (12.14 (Ubuntu 12.14-0ubuntu0.20.04.1), server 12.11)
SSL connection (protocol: TLSv1.2, cipher: <redacted>-SHA384, bits: 256, compression: off)

psycopg2 python script

This works both from my machine and on AWS Batch.

SQLAlchemy python script

This only works on my machine, and fails on AWS Batch: sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) FATAL: PAM authentication failed for user "my_user"

Find below the python script I am using to demonstrate this behaviour:

import boto3  
import os  
import psycopg2  
from sqlalchemy import create_engine, text  
from sqlalchemy.orm import Session  
  
os.environ["AWS_DEFAULT_REGION"] = "us-east-1"  
  
# config  
ENDPOINT = "redacted.redacted.us-east-1.rds.amazonaws.com"  
PORT = 5432  
USER = "myuser"  
REGION = "us-east-1"  
DBNAME = "mydb"  
  
# generate the token  
session = boto3.Session()  
client = session.client('rds')  
token = client.generate_db_auth_token(DBHostname=ENDPOINT, Port=PORT, DBUsername=USER, Region=REGION)  
  
# Try psycopg2  
conn = psycopg2.connect(host=ENDPOINT, port=PORT, database=DBNAME, user=USER, password=token)  
cur = conn.cursor()  
cur.execute("""SELECT * FROM my_table""")  
query_results_psycopg2 = cur.fetchall()  
  
# Try SQLAlchemy  
engine_url = f"postgresql+psycopg2://{USER}:{token}@{ENDPOINT}:{PORT}/{DBNAME}?sslmode=require"  
db = create_engine(engine_url,  
                   connect_args={  
                       'sslmode': 'require',  
                   })  
with Session(db) as session:  
    query_results_sqlalchemy = session.execute(statement=text("SELECT * FROM my_table")).fetchall()  

# this evaluates to true only locally
assert query_results_psycopg2 == query_results_sqlalchemy

This is the output of pip freeze:

aiobotocore==2.4.2
aiohttp==3.8.4
aiohttp-retry==2.8.3
aioitertools==0.11.0
aiosignal==1.3.1
amqp==5.1.1
antlr4-python3-runtime==4.9.3
appdirs==1.4.4
async-timeout==4.0.2
asyncssh==2.13.1
atpublic==3.1.1
attrs==22.2.0
audioread==2.1.8
awscli==1.18.223
billiard==3.6.4.0
boto3==1.24.59
botocore==1.27.59
celery==5.2.7
certifi==2022.12.7
cffi==1.15.1
charset-normalizer==3.1.0
click==8.1.3
click-didyoumean==0.3.0
click-plugins==1.1.1
click-repl==0.2.0
colorama==0.4.3
configobj==5.0.8
cryptography==40.0.1
cycler==0.11.0
Cython==0.29.23
dataclasses-json==0.5.7
decorator==5.1.1
Deprecated==1.2.13
dictdiffer==0.9.0
diskcache==5.4.0
distro==1.8.0
docutils==0.15.2
dpath==2.1.5
dulwich==0.21.3
dvc==2.50.0
dvc-data==0.44.1
dvc-http==2.30.2
dvc-objects==0.21.1
dvc-render==0.3.1
dvc-s3==2.21.0
dvc-studio-client==0.6.1
dvc-task==0.2.0
exceptiongroup==1.1.1
filelock==3.10.7
flatten-dict==0.4.2
flufl.lock==7.1.1
frozenlist==1.3.3
fsspec==2023.3.0
funcy==2.0
gitdb==4.0.10
GitPython==3.1.31
grandalf==0.8
greenlet==2.0.2
hmmlearn==0.2.6
hydra-core==1.3.2
idna==3.4
importlib-resources==5.12.0
iniconfig==2.0.0
iterative-telemetry==0.0.8
jmespath==0.10.0
joblib==1.2.0
jsonpatch==1.32
jsonpointer==2.3
jsonschema==4.17.3
kiwisolver==1.4.4
kombu==5.2.4
librosa==0.7.0
llvmlite==0.31.0
madmom==0.16.1
markdown-it-py==2.2.0
marshmallow==3.19.0
marshmallow-enum==1.5.1
marshmallow-sqlalchemy==0.29.0
matplotlib==3.3.2
mdurl==0.1.2
mido==1.2.10
multidict==6.0.4
mypy-extensions==1.0.0
nanotime==0.5.2
networkx==3.0
nptyping==1.4.4
numba==0.48.0
numpy==1.19.2
omegaconf==2.3.0
orjson==3.8.9
packaging==23.0
pandas==1.4.4
pathlib==1.0.1
pathspec==0.11.1
Pillow==9.4.0
pkgutil_resolve_name==1.3.10
platformdirs==3.2.0
pluggy==1.0.0
prompt-toolkit==3.0.38
psutil==5.9.4
psycopg2==2.9.5
py==1.11.0
pyasn1==0.4.8
pycparser==2.21
pydantic==1.10.5
pydot==1.4.2
pydub==0.25.1
pygit2==1.11.1
PyGithub==1.58.1
Pygments==2.14.0
pygtrie==2.5.0
PyJWT==2.6.0
PyNaCl==1.5.0
pyparsing==3.0.9
pyrsistent==0.19.3
pytest==7.2.2
python-dateutil==2.8.2
pytz==2023.3
PyYAML==5.3.1
requests==2.28.2
resampy==0.3.1
rich==13.3.3
rsa==4.5
ruamel.yaml==0.17.21
ruamel.yaml.clib==0.2.7
s3fs==2023.3.0
s3transfer==0.6.0
scikit-learn==0.23.2
scipy==1.5.2
scmrepo==0.1.17
shortuuid==1.0.11
shtab==1.5.8
six==1.16.0
smmap==5.0.0
SoundFile==0.10.3.post1
sox==1.4.1
SQLAlchemy==2.0.6
sqltrie==0.3.0
tabulate==0.9.0
threadpoolctl==3.1.0
tomli==2.0.1
tomlkit==0.11.7
tqdm==4.65.0
typing-inspect==0.8.0
typing_extensions==4.5.0
typish==1.9.3
urllib3==1.26.15
vine==5.0.0
voluptuous==0.13.1
warlock==2.0.1
wcwidth==0.2.6
wrapt==1.15.0
yarl==1.8.2
zc.lockfile==3.0.post1
zipp==3.15.0

psql version on Docker:

root@94af66abe2d2:/usr/src/app# psql --version
psql (PostgreSQL) 13.9 (Debian 13.9-0+deb11u1)

I am trying to understand why connection is successful locally using both psycopg2 and sqlalchemy , and why sqlalchemy produces the authentication error on AWS Batch.

They both share the same token, they both live in the same docker environment. Could it be a problem in the sqlalchemy connection string? How come then it works just fine locally?

Any ideas would be much appreciated!

sigma
  • 216
  • 1
  • 4
  • 1) Why `...{DBNAME}?sslmode=require"` and `connect_args={ 'sslmode': 'require', }`. 2) Look at accepted answer here [SQLAlchemy PAM failure](https://stackoverflow.com/questions/53151618/python-sqlalchemy-authentication-failed-for-user). – Adrian Klaver Apr 05 '23 at 18:11

1 Answers1

0

I have found a tweak that solved the problem. Instead of generating the URL via

engine_url = f"postgresql+psycopg2://{USER}:{token}@{ENDPOINT}:{PORT}/{DBNAME}"

I generated it using

engine_url = sqlalchemy.engine.url.URL.create(
            drivername='postgresql+psycopg2',
            username=USER,
            password=token,
            host=HOST,
            port=PORT,
            database=DBNAME
        )

I am assuming that the reason the "manual" URL assembly worked locally, is that the user IAM token is different to the assumed role IAM token, in that it contains certain special characters that need to be parsed before used in the connection string.

sigma
  • 216
  • 1
  • 4