I am trying to join two Hive tables, omega and card, shown below.
Table omega:
+------+--------+-------+-----+-----+
|pid   |enventid|card_id|count|name |
+------+--------+-------+-----+-----+
|111111|"sk"    |"pro"  |2    |"aaa"|
|222222|"sk"    |"pro"  |2    |"ddd"|
+------+--------+-------+-----+-----+
Table card:
+-------+---------+
|card_id|card_desc|
+-------+---------+
|"pro"  |"1|2|3"  |
+-------+---------+
Then I defined a UDF to split card_desc into a list:
val getListUdf = udf((raw: String) => raw.split("|"))
Now I join the two tables and apply the UDF to the card_desc column:
omega.join(card, Seq("card_id"), "left_outer").withColumn("card_desc", getListUdf(col("card_desc")))
But I get the following error:
Caused by: java.lang.NullPointerException
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:25)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:25)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:51)
at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:49)
......
What causes this NullPointerException, and how can I fix it? Thanks.