I'm new to Spark, and this is what I have been trying.
I'm trying to execute this code:
val ii=sc.parallelize(Seq(("e.txt"),("r.txt"))).foreach{i => sc.textFile(i)}
but I'm getting a NullPointerException.
Thanks!
You can pass multiple files to sc.textFile directly. You should not use sc inside a function passed to an RDD operation such as foreach or map. That function is shipped to the executors, while sc lives only in the driver, so calling sc.textFile from inside it throws a NullPointerException.
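If you want to keep the "one call per file" shape of the original code, one option is to loop over a plain Scala Seq on the driver instead of over an RDD. The following is only a minimal sketch: readAll is a hypothetical helper name, not part of Spark, and it assumes that every path exists and that the list is non-empty.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical helper (not a Spark API): read each file on the driver
// and merge the results into one RDD of lines.
// Assumes `paths` is non-empty and every path exists.
def readAll(sc: SparkContext, paths: Seq[String]): RDD[String] = {
  // `paths` is a local collection, so this map runs on the driver,
  // where `sc` is valid. Each call only defines an RDD lazily.
  val perFile: Seq[RDD[String]] = paths.map(path => sc.textFile(path))
  // Combine the per-file RDDs into a single RDD.
  sc.union(perFile)
}
```

In spark-shell, where sc is already defined, this could be called as readAll(sc, Seq("e.txt", "r.txt")).collect(). The difference from the failing code is that sc is never referenced inside a closure that runs on an executor.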
a.txt contents:
a.txt:line1
a.txt:line2
b.txt contents:
b.txt:line1
b.txt:line2
sc.textFile accepts a comma-separated list of paths, so both files can be read in one call:
scala> sc.textFile("a.txt,b.txt").collect()
res1: Array[String] = Array(a.txt:line1, a.txt:line2, b.txt:line1, b.txt:line2)
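Applied to the file list from the question, the same idea might look like the sketch below. joinPaths and readJoined are hypothetical names introduced here for illustration, and the approach assumes that no individual path contains a comma, since the comma is what separates the paths.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical helper: build the comma-separated string that textFile expects.
// Assumes no individual path contains a comma.
def joinPaths(paths: Seq[String]): String = paths.mkString(",")

// Hypothetical helper: read all listed files with a single textFile call.
// `sc` is only used on the driver here, never inside an RDD closure.
def readJoined(sc: SparkContext, paths: Seq[String]): RDD[String] =
  sc.textFile(joinPaths(paths))
```

In spark-shell this could be used as readJoined(sc, Seq("e.txt", "r.txt")).collect(), which keeps the Seq of file names from the question but hands it to Spark in one call.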
Hope this helps and have fun with Spark!