How can I apply the Scala uaparser (uap-scala) to a column of a DataFrame? Each row of the column is a string of the form -
Mozilla/5.0 (iPhone; CPU iPhone OS 5_1_1 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9B206 Safari/7534.48.3
I am trying to do something of the form -
def getTrnUiEvent(hiveDf:org.apache.spark.sql.DataFrame): Unit = {
  val trnUiEventDf = hiveDf
    .withColumn("application_browser_user_agent", getUAFamily(hiveDf("application_browser_user_agent")))
}
val getUAFamily = udf((ua_string:org.apache.spark.sql.DataFrame) => {
  Parser.get.parse(ua_string.toString()).userAgent.family
})
I am receiving an error for the above. I have also tried other ways of doing this but end up with the same result. What I can't get my head around is how each row of the DataFrame column can be processed by the uaparser. Each row of hiveDf("application_browser_user_agent") looks like the example string pasted above.
The links I have looked at - Applying function to Spark Dataframe Column
Do I convert this to an RDD first and then process each row of the RDD using the uaparser?
Link for uaparser - https://github.com/ua-parser/uap-scala
error message -
/Users/pojha/github/Bacon/scala/bacon/src/main/scala/baconParallel.scala:62:
No TypeTag available for String
[error] val getUAFamily = udf((ua_string:org.apache.spark.sql.DataFrame) => {
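For reference, a minimal sketch of the per-row processing I am after, assuming the UDF argument is typed as String (the type of each cell) rather than DataFrame. The import path org.uaparser.scala.Parser and the object name UaFamilyExample are assumptions and may differ by uap-scala version; Parser.get.parse(...).userAgent.family is taken from the snippet above. I have not established whether this alone resolves the TypeTag error.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.udf
// Assumption: the package path depends on the uap-scala version in use.
import org.uaparser.scala.Parser

// Hypothetical wrapper object, introduced only for illustration.
object UaFamilyExample {

  // Spark calls the UDF once per row and passes the cell value,
  // so the parameter is a String, not a DataFrame.
  val getUAFamily = udf((uaString: String) =>
    // Option(...) guards against null cells; null in, null out.
    Option(uaString)
      .map(s => Parser.get.parse(s).userAgent.family)
      .orNull
  )

  // Returns the transformed DataFrame instead of Unit so the result is usable.
  def getTrnUiEvent(hiveDf: DataFrame): DataFrame =
    hiveDf.withColumn(
      "application_browser_user_agent",
      getUAFamily(hiveDf("application_browser_user_agent"))
    )
}
```

With this shape no conversion to an RDD should be needed, since withColumn applies the UDF to every row of the column. Note that Parser.get is evaluated on each call here, which may be slow on large tables.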