
I want to use a function to map line data from HDFS and then convert it to a DataFrame, but it doesn't work. Please help me as soon as possible. For example:

case class Kof(UID: String, SITEID: String, MANAGERID: String, ROLES: String,
               EXTERNALURL: String, EXTERNALID: String, OPTION1: String,
               OPTION2: String, OPTION3: String)

def GetData(argv1: Array[String]): Kof = {
  Kof(argv1(0), argv1(1), argv1(2), argv1(3), argv1(4),
      argv1(5), argv1(6), argv1(7), argv1(8))
}


val textFile2 = sc.textFile("hdfs://hadoop-s3:8020/tmp/mefang/modify.txt").
                map(_.split(",")).map(p => GetData(p)).toDF()  // <- it breaks here with an error

Exception in thread "main" org.apache.spark.SparkException: Task not serializable
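
As the duplicate linked in the comments below explains, "Task not serializable" usually means Kof and GetData are defined inside an enclosing class (or REPL wrapper) that is not serializable, so Spark tries to pull that whole instance into the task closure. Below is a minimal sketch of the usual fix for Spark 1.x, moving the parsing function onto a standalone serializable object; the object names KofParser and KofJob and the main-method wiring are assumptions, not code from the question:

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

// Top-level case class, as in the question.
case class Kof(UID: String, SITEID: String, MANAGERID: String, ROLES: String,
               EXTERNALURL: String, EXTERNALID: String, OPTION1: String,
               OPTION2: String, OPTION3: String)

// Hypothetical top-level object holding the parsing function. Because the
// function lives on a standalone serializable object rather than inside a
// non-serializable enclosing class, Spark does not have to serialize that
// enclosing instance with the task closure.
object KofParser extends Serializable {
  def getData(fields: Array[String]): Kof =
    Kof(fields(0), fields(1), fields(2), fields(3), fields(4),
        fields(5), fields(6), fields(7), fields(8))
}

object KofJob {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext()            // master / app name supplied via spark-submit
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._          // brings .toDF() into scope for RDDs of case classes

    val df = sc.textFile("hdfs://hadoop-s3:8020/tmp/mefang/modify.txt")
      .map(_.split(","))
      .map(KofParser.getData)              // parsing function from the serializable object
      .toDF()

    df.show()
    sc.stop()
  }
}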

  • Possible duplicate of [Task not serializable: java.io.NotSerializableException when calling function outside closure only on classes not objects](http://stackoverflow.com/questions/22592811/task-not-serializable-java-io-notserializableexception-when-calling-function-ou) – Tzach Zohar Oct 14 '16 at 03:46
  • What do you mean? Can you give me an example? Thank you. – Kof Oct 14 '16 at 05:42
