I have written the function below in PySpark. It takes a deptid and returns a DataFrame, and I want to use it in Spark SQL:
def get_max_salary(deptid):
    sql_salary = "select max(salary) from empoyee where depid ={}"
    df_salary = spark.sql(sql_salary.format(deptid))
    return df_salary

spark.udf.register('get_max_salary', get_max_salary)
However, I get the error message below. I searched online but couldn't find a proper solution anywhere. Could someone please help me here?
Error Message - PicklingError: Could not serialize object: Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.