I have created this basic stored procedure to query a Snowflake table based on a customer id:
CREATE OR REPLACE PROCEDURE SP_Snowpark_Python_Revenue_2(site_id STRING)
RETURNS STRING
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
PACKAGES = ('snowflake-snowpark-python')
HANDLER = 'run'
AS
$$
from snowflake.snowpark.functions import *
def run(session, site_id):
    df_rev_tmp = session.table("revenue").select(col("site_id"), col("subscription_id"), col("country_name"), col("product_name"))
    df_rev_final = df_rev_tmp.filter(col("site_id") == site_id)
    return "SUCCESS"
$$;
It works fine, but I would like the procedure to return the whole result set as a JSON object, so I modified it as follows:
CREATE OR REPLACE PROCEDURE SP_Snowpark_Python_Revenue_3(site_id STRING)
RETURNS STRING
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
PACKAGES = ('snowflake-snowpark-python')
HANDLER = 'run'
AS
$$
from snowflake.snowpark.functions import *
def run(session, site_id):
    df_rev = session.table("revenue").select(col("site_id"), col("subscription_id"), col("country_name"), col("product_name"))
    df_rev_tmp = df_rev.filter(col("site_id") == site_id)
    df_rev_final = df_rev_tmp.to_pandas()
    df_rev_json = df_rev_final.to_json(orient = 'columns')
    return df_rev_json
$$;
It compiles without errors but fails at runtime with this error:
CALL SP_Snowpark_Python_Revenue_3('dfgerr6223').....
255002: Optional dependency: 'pyarrow' is not installed...
What am I missing here?
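
To make the goal concrete, here is a minimal sketch of the kind of output I am after, written in plain Python with no Snowflake connection. The sample rows and the helper name rows_to_json are made up for illustration. In the real procedure the rows would presumably come from the filtered DataFrame (for example via collect() and Row.as_dict(), which I have not tried here), so treat this as a description of the target shape rather than a working fix.

```python
import json


def rows_to_json(rows):
    """Serialize a list of row dicts as a single JSON array of objects.

    default=str is a fallback so that values json cannot encode natively
    (dates, decimals) are written as strings instead of raising.
    """
    return json.dumps(rows, default=str)


# Hypothetical rows standing in for the filtered "revenue" result set.
sample_rows = [
    {
        "site_id": "dfgerr6223",
        "subscription_id": "sub-001",
        "country_name": "France",
        "product_name": "Basic",
    },
    {
        "site_id": "dfgerr6223",
        "subscription_id": "sub-002",
        "country_name": "Germany",
        "product_name": "Premium",
    },
]

result = rows_to_json(sample_rows)
print(result)
```

The procedure is declared with RETURNS STRING, so returning one JSON string like this would fit the existing signature. An array of row objects is only one possible layout; the orient = 'columns' call in my attempt would instead group the values by column name.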