org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0 (TID 5, svr17933hw2288.hadoop.sh.ctripcorp.com, executor 1): org.apache.spark.SparkException: Failed to execute user defined function($anonfun$createTransformFunc$1: (string) => array)

The code is as follows:

import org.apache.spark.ml.feature.{HashingTF, IDF, Tokenizer}

// Split the "sendcontent" text column into words
val tokenizer = new Tokenizer().setInputCol("sendcontent").setOutputCol("words")
val wordsData = tokenizer.transform(sourDF)
// Hash the words into a 20-dimensional term-frequency vector
val hashingTF = new HashingTF()
  .setInputCol("words").setOutputCol("rawFeatures").setNumFeatures(20)
val featurizedData = hashingTF.transform(wordsData)
// Rescale the term frequencies by inverse document frequency
val idf = new IDF().setInputCol("rawFeatures").setOutputCol("features")
val idfModel = idf.fit(featurizedData)
val rescaledData = idfModel.transform(featurizedData)
rescaledData.select("features", "msgid").take(3).foreach(println)
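
The `(string) => array` signature in the error matches the `Tokenizer` step, which lowercases and splits each input string. One common cause of a failure there is a null value in the input column, but the trace above is truncated and cannot confirm it. A minimal sketch for checking this, assuming nulls are the cause and using a small made-up DataFrame in place of `sourDF` (the sample rows, the object name `TokenizerNullCheck`, and the local master are all hypothetical):

```scala
import org.apache.spark.ml.feature.Tokenizer
import org.apache.spark.sql.SparkSession

object TokenizerNullCheck {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a real job would use its existing session
    val spark = SparkSession.builder()
      .appName("tokenizer-null-check")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for sourDF: one row has a null "sendcontent"
    val sourDF = Seq(
      ("m1", Some("hello spark world")),
      ("m2", None),
      ("m3", Some("another message"))
    ).toDF("msgid", "sendcontent")

    // Count rows that would hand a null string to the Tokenizer
    val nullCount = sourDF.filter($"sendcontent".isNull).count()
    println(s"rows with null sendcontent: $nullCount")

    // Drop rows whose "sendcontent" is null before tokenizing
    val cleanDF = sourDF.na.drop(Seq("sendcontent"))

    val tokenizer = new Tokenizer().setInputCol("sendcontent").setOutputCol("words")
    tokenizer.transform(cleanDF).show(truncate = false)

    spark.stop()
  }
}
```

If the null count on the real `sourDF` is zero, the cause lies elsewhere and the full stack trace (the `Caused by:` section) would be needed to go further.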
Alper t. Turker
heitaoq
  • Please check [How to make good reproducible Apache Spark Dataframe examples](https://stackoverflow.com/q/48427185/9613318) and include full traceback. In the current state this question is not answerable. – Alper t. Turker May 10 '18 at 08:15
  • And the same comment [here](https://stackoverflow.com/q/50264426/9613318) (I cannot say if it is duplicate or not). – Alper t. Turker May 10 '18 at 08:25
  • Did you manage to solve the problem? – lu5er Nov 19 '18 at 07:19

0 Answers