I am running a very simple query with Python against a table in a Snowflake database, using the package snowflake-connector-python==2.3.3 installed with the [pandas] extra. I containerized my Python app with the python:3.7.0-slim image, and my script is extremely simple:
from snowflake import connector
import os

ctx = connector.connect(
    user=os.environ['USER'],
    password=os.environ['PASSWORD'],
    account=os.environ['ACCOUNT'],
    warehouse=os.environ['WAREHOUSE'],
    database=os.environ['DATABASE'],
    schema=os.environ['SCHEMA'],
)
cur = ctx.cursor()

# Execute a statement that will generate a result set.
sql = "SELECT * FROM MY_TABLE ORDER BY MY_COLUMN"
print("executing query: " + sql)
cur.execute(sql)
df = cur.fetch_pandas_all()
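My only guess so far (an assumption on my part, not something Snowflake documents for my table) is that the 3.3 GB figure is the compressed on-disk size, while the DataFrame holds uncompressed values, each with per-object overhead in memory. A minimal stdlib sketch of that overhead for a single string value:

```python
import sys

# A 5-byte payload on disk...
raw = b"hello"

# ...becomes a full Python object in memory, whose size includes the
# object header, not just the characters themselves.
obj = "hello"

print(len(raw))            # payload bytes: 5
print(sys.getsizeof(obj))  # in-memory object size: considerably more than 5
```

If that guess is right, the blow-up factor would depend heavily on the column types (short strings being the worst case), but I'd like to confirm whether this is actually what's happening.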
According to Snowflake, the actual table size is 3.3 GB. However, when I run this app it crashes after consuming over 9 GB of RAM. I know this because I'm running it in a Kubernetes cluster, and the pod is evicted with a reported usage of 9535336Ki of memory. Is there something I'm missing here? How can the memory usage be 3x the table size?