I have searched for an answer to this and tried several approaches, but nothing works. I'm trying to reference a Python variable inside a spark.sql query. I'm running Python 3 and Spark 2.3.1.
bkt = 1
prime = spark.sql(s"SELECT ((year(fdr_date))*100)+month(fdr_date) as fdr_year, count(*) as counts\
FROM pwrcrv_tmp\
where EXTR_CURR_NUM_CYC_DLQ=$bkt\
and EXTR_ACCOUNT_TYPE in('PS','PT','PD','PC','HV','PA')\
group by ((year(fdr_date))*100)+month(fdr_date)\
order by ((year(fdr_date))*100)+month(fdr_date)")
prime.show(50)
The error:
prime = spark.sql(s"SELECT ((year(fdr_date))*100)+month(fdr_date) as fdr_year, count(*) as counts FROM pwrcrv_tmp where EXTR_CURR_NUM_CYC_DLQ=$bkt and EXTR_ACCOUNT_TYPE in('PS','PT','PD','PC','HV','PA') group by ((year(fdr_date))*100)+month(fdr_date) order by ((year(fdr_date))*100)+month(fdr_date)")
^
SyntaxError: invalid syntax
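
For context on the error: the `s"..."` prefix with `$bkt` is Scala string interpolation, which Python does not recognize, and that would explain the SyntaxError. A minimal sketch of a Python equivalent follows, assuming `bkt` holds an integer and that an active SparkSession named `spark` exists as in the code above. The `int(...)` cast is an added precaution, not something from the original code, and the Spark calls are left commented out so the snippet runs without a cluster.

```python
bkt = 1

# A triple-quoted string replaces the backslash line continuations.
# str.format substitutes the Python variable into the SQL text;
# int(bkt) ensures only a number can end up in the query.
query = """
SELECT ((year(fdr_date))*100)+month(fdr_date) AS fdr_year, count(*) AS counts
FROM pwrcrv_tmp
WHERE EXTR_CURR_NUM_CYC_DLQ = {bkt}
  AND EXTR_ACCOUNT_TYPE IN ('PS','PT','PD','PC','HV','PA')
GROUP BY ((year(fdr_date))*100)+month(fdr_date)
ORDER BY ((year(fdr_date))*100)+month(fdr_date)
""".format(bkt=int(bkt))

# Assumes an active SparkSession named `spark`, as in the question:
# prime = spark.sql(query)
# prime.show(50)
```

On Python 3.6 or later, an f-string (`f"... = {bkt}"`) should work the same way as `str.format` here.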