When I run this Spark code in Scala:
df.withColumn(x, when(col(x).isin(values: _*), col(x)).otherwise(lit(null).cast(StringType)))
I get this error:
java.lang.RuntimeException: Compiling "GeneratedClass": Code of method
"apply(Lorg/apache/spark/sql/catalyst/InternalRow;)Lorg/apache/spark/sql
/catalyst/expressions/UnsafeRow;" of class
"org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection"
grows beyond 64 KB
at org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:361)
at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:234)
where:
df: a Spark Dataset
x: the name of a StringType column; each row contains a value like "US,Washington,Seattle"
values: an Array[String]
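For context, here is a minimal, self-contained sketch of the setup. The column name "location", the sample rows, and the contents of values are made up for illustration; the real Dataset is much larger:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lit, when}
import org.apache.spark.sql.types.StringType

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// Hypothetical column name and data, only to illustrate the shape of the input
val x = "location"
val df = Seq("US,Washington,Seattle", "CA,Ontario,Toronto").toDF(x)
val values: Array[String] = Array("US,Washington,Seattle", "US,California,Sacramento")

// Keep the value when it is in the whitelist, otherwise replace it with a null string
val result = df.withColumn(x,
  when(col(x).isin(values: _*), col(x)).otherwise(lit(null).cast(StringType)))
result.show()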