The following code is meant to parse files, but it constantly throws exceptions when I try to access elements in the RDD.
// remove header and empty entries
val raw_data = sc.textFile(path)
  .map(_.split(","))
  .mapPartitions(_.drop(1))
  .filter(_.size > 4)
  .map(s => s)
raw_data.count
val raw_by_user: RDD[(String, Iterable[Array[String]])] = raw_data.map { s =>
  if (s.size > 3)
    (s(0), Array[String](s(0), toStandarddate(s(2)), toEntryExit(s(3)), s(5), s(6),
      jr_type, "TST_0", stationMap(s(5)), stationMap(s(6))))
  else {
    println(s(0), s.mkString(","))
    (s(0), Array[String]())
  }
}.groupByKey()
raw_by_user.count
Error:
16/01/05 13:39:30 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 4)
java.util.NoSuchElementException: key not found: 2
    at scala.collection.MapLike$class.default(MapLike.scala:228)
    at scala.collection.AbstractMap.default(Map.scala:58)
    at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
    at DataCreation.ProcessData$$anonfun$9.apply(package.scala:77)
    at DataCreation.ProcessData$$anonfun$9.apply(package.scala:75)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
Any ideas what the problem could be, and how I should handle exceptions like this?
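
For reference, the trace shows the exception coming from `HashMap.apply`, which throws `NoSuchElementException` when the key (here `2`) is absent, so the likely culprit is one of the `stationMap(...)` lookups. Below is a minimal sketch of a lookup that skips such rows instead of failing. It uses plain Scala collections so it runs without Spark. The contents of `stationMap`, the sample rows, and the helper name `toRecord` are all hypothetical, and the `toStandarddate`/`toEntryExit` conversions are left out. It also checks for at least 7 fields, on the assumption that `s(6)` must exist.

```scala
import scala.collection.mutable

object SafeLookupSketch {
  // Hypothetical stand-in for the question's stationMap; the real contents are unknown.
  val stationMap: mutable.HashMap[String, String] =
    mutable.HashMap("1" -> "StationA", "3" -> "StationC")

  // Returns Some(record) only when the row has at least 7 fields and both
  // station codes are present in the map; otherwise None, so nothing is thrown.
  def toRecord(s: Array[String]): Option[(String, Array[String])] =
    if (s.length > 6)
      for {
        from <- stationMap.get(s(5)) // get returns Option instead of throwing
        to   <- stationMap.get(s(6))
      } yield (s(0), Array(s(0), s(2), s(3), s(5), s(6), from, to))
    else None

  def main(args: Array[String]): Unit = {
    val rows = Seq(
      "u1,x,2016-01-05,IN,y,1,3".split(","),
      "u2,x,2016-01-05,IN,y,2,3".split(",") // station "2" is not in the map
    )
    // flatMap keeps only the rows for which toRecord returned Some(...)
    val parsed = rows.flatMap(r => toRecord(r))
    parsed.foreach { case (user, fields) => println(s"$user -> ${fields.mkString(",")}") }
  }
}
```

On the RDD the analogous call would presumably be `raw_data.flatMap(s => toRecord(s))` in place of the `map`, so rows with unknown station codes are dropped before `groupByKey()`. If the bad rows need to be inspected rather than dropped, `stationMap.getOrElse(key, "UNKNOWN")` is an alternative that keeps them with a placeholder value.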