I wrote the following SQL query in a Jupyter notebook using a PySpark session:
MySparkSession.sql('''
select ID
, count(distinct transaction) as Txn_count
, sum(revenue) as Total_sales
, count(distinct product) as Total_products
from merge_table
where ( DATE between '2021-02-01' and '2021-03-31')
and (BRAND_NAME = 'ADIDAS')
group by ID
''').show()
I need to pass the DATE range and the BRAND_NAME value as parameters, so that I can get the filtered data just by changing those values, but I have no idea how to do it.
Any help is appreciated.
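One possible way to do this, as a minimal sketch: keep the query text in a template with named placeholders and fill them in with Python's `str.format` before passing the string to `sql()`. The names `QUERY_TEMPLATE`, `build_query` and `sales_summary` are hypothetical helpers made up for illustration. A `group by ID` clause is included on the assumption that the counts and the sum are wanted per ID, since selecting ID next to aggregates needs one.

```python
# Query text with named placeholders. merge_table and the column names
# come from the query above; the placeholder names are made up here.
QUERY_TEMPLATE = """
select ID
, count(distinct transaction) as Txn_count
, sum(revenue) as Total_sales
, count(distinct product) as Total_products
from merge_table
where (DATE between '{start_date}' and '{end_date}')
and (BRAND_NAME = '{brand_name}')
group by ID
"""


def build_query(brand_name, start_date, end_date):
    """Return the SQL text with the three filter values filled in."""
    # Plain string substitution: only pass values you trust,
    # and avoid values that contain quote characters.
    return QUERY_TEMPLATE.format(
        brand_name=brand_name,
        start_date=start_date,
        end_date=end_date,
    )


def sales_summary(spark, brand_name, start_date, end_date):
    """Run the filtered query on an existing SparkSession; returns a DataFrame."""
    return spark.sql(build_query(brand_name, start_date, end_date))


# Usage in the notebook, with the session from the question:
# sales_summary(MySparkSession, 'ADIDAS', '2021-02-01', '2021-03-31').show()
```

With this, only the three arguments change between runs. Because the values are pasted straight into the SQL text, this is fine for values typed in a notebook but not for untrusted input. Spark 3.4 and later also let `sql()` take an `args` argument for bound parameters; how the values must be written differs between releases, so check the documentation for your version before relying on it.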