
I have an issue: if I create a UDF with multiple parameters I get a weird exception, and I realised that if I pass a string parameter, PySpark tries to find a column with that name, regardless of the function I'm calling :/

from pyspark.sql.functions import udf, from_json
from pyspark.sql.types import StringType

# rec is a DataFrame
def findValue(records, name):
    for item in records:
        if item.key == name:
            return item.value
    return None

findValueUDF = udf(findValue, StringType())
rec = rec.withColumn("name", findValueUDF(from_json(rec.customfields, schema), "name"))

I'm getting the error:

Exception: cannot resolve '`name`' given input columns: [zipcode, customer_id, customfields, timestamp, essential_item];;
