
Here's the exception:

java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to ... of type org.apache.spark.sql.api.java.UDF2 in instance of ...

If I implement the UDF with an anonymous class instead of a lambda expression, it works fine:

private UDF2<String, String, String> funUdf = new UDF2<String, String, String>() {
    @Override
    public String call(String a, String b) throws Exception {
        return fun(a, b);
    }
};
dataset.sparkSession().udf().register("Fun", funUdf, DataTypes.StringType);
functions.callUDF("Fun", functions.col("a"), functions.col("b"));

I am running in local mode, so this answer does not help: https://stackoverflow.com/a/28367602/4164722

Why does this happen, and how can I fix it?

KARTHIKEYAN.A
secfree

1 Answer

This is a working solution (registering the UDF as an anonymous class rather than a lambda):

UDF1<String, String> myUDF = new UDF1<String, String>() {
    public String call(final String str) throws Exception {
        return str + "A";
    }
};

sparkSession.udf().register("Fun", myUDF, DataTypes.StringType);

Dataset<Row> rst = sparkSession.read().format("text").load("myFile");

rst.withColumn("nameA", functions.callUDF("Fun", functions.col("name")));
raphaelauv
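The underlying difference between the two forms can be shown without Spark at all, since the mechanism is plain Java serialization. The sketch below uses a hypothetical `Fun2` interface standing in for `org.apache.spark.sql.api.java.UDF2` (which likewise extends `Serializable`): an anonymous class serializes as an ordinary named class, while a lambda serializes as a `java.lang.invoke.SerializedLambda` that can only be rebuilt if its capturing class is resolvable on the deserializing side — commonly not the case on a Spark executor when the class is missing from the classpath or was loaded by a different classloader.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;

public class LambdaSerDemo {

    // Hypothetical stand-in for org.apache.spark.sql.api.java.UDF2,
    // which also extends Serializable.
    interface Fun2 extends Serializable {
        String call(String a, String b);
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        }
        return bos.toByteArray();
    }

    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // An anonymous class serializes as an ordinary named class
        // (LambdaSerDemo$1), which an executor can load straight from the jar.
        Fun2 anon = new Fun2() {
            public String call(String a, String b) { return a + b; }
        };
        Fun2 restored = (Fun2) deserialize(serialize(anon));
        System.out.println(restored.call("foo", "bar")); // prints foobar

        // A lambda, by contrast, defines a synthetic writeReplace() that
        // substitutes a SerializedLambda into the stream. Rebuilding the real
        // object requires the capturing class printed below to be resolvable,
        // so its hidden $deserializeLambda$ method can run; when it is not,
        // the raw SerializedLambda ends up assigned to the UDF2 field -- the
        // ClassCastException from the question.
        Fun2 lambda = (a, b) -> a + b;
        Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
        writeReplace.setAccessible(true);
        SerializedLambda sl = (SerializedLambda) writeReplace.invoke(lambda);
        System.out.println(sl.getCapturingClass()); // prints LambdaSerDemo
    }
}
```

This also suggests why the anonymous-class (or a named top-level class implementing the UDF interface) workaround is reliable: it keeps the UDF out of the lambda deserialization path entirely.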