I have an issue: when I call a UDF with multiple parameters I get a weird exception. I realised that if I pass a string parameter, PySpark tries to resolve it as a column name regardless of which function I'm calling :/
# rec is a DataFrame
def findValue(records, name):
    # records is the parsed array of key/value structs
    for item in records:
        if item.key == name:
            return item.value
    return None
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

findValueUDF = udf(findValue, StringType())
rec = rec.withColumn("name", findValueUDF(from_json(rec.customfields, schema), "name"))
I'm getting this error:
Exception: cannot resolve '`name`' given input columns: [zipcode, customer_id, customfields, timestamp, essential_item];;