Given
val rdd = sc.textFile("file.txt")
where file.txt contains
Some Informative Header
value1, value11
value2, value22
how can I partition the RDD into
Some Informative Header
value1, value11
and
Some Informative Header
value2, value22
so that I can run rdd.pipe("/bin/awesomeApp") on each partition?
Note: ultimately, awesomeApp needs Some Informative Header as its very first input line; the remaining entries may be processed in parallel.
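
To make the desired behavior concrete, here is a minimal sketch of one possible approach, not a definitive solution. It reads the header with first(), drops it from the data, and then prepends it to every partition before calling pipe. It assumes a single input file (so the header is the first line of partition 0) and a local master; the object name HeaderPerPartition, the helper withHeader, and the value numPartitions are hypothetical names introduced only for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object HeaderPerPartition {
  // Prepend the header to the lines of one partition.
  // Empty partitions stay empty so no header-only input is produced.
  def withHeader(header: String, lines: Iterator[String]): Iterator[String] =
    if (lines.hasNext) Iterator.single(header) ++ lines else Iterator.empty

  def main(args: Array[String]): Unit = {
    // Assumption: local master, only for trying the sketch out.
    val conf = new SparkConf().setAppName("header-per-partition").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.textFile("file.txt")

      // The header is the first line of the file.
      val header = rdd.first()

      // Assumption: with a single input file, the header sits at the
      // start of partition 0, so dropping one line there removes it.
      val body = rdd.mapPartitionsWithIndex { (idx, it) =>
        if (idx == 0) it.drop(1) else it
      }

      // Hypothetical partition count; how lines are spread across
      // partitions is decided by Spark.
      val numPartitions = 2
      val prepared = body
        .repartition(numPartitions)
        .mapPartitions(it => withHeader(header, it))

      // pipe starts one process per partition and writes that partition's
      // elements to the process's stdin, one per line, header first.
      val result = prepared.pipe("/bin/awesomeApp")
      result.collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

The header string is small, so capturing it in the closure passed to mapPartitions should be fine. Note that the order of the value lines after repartition is not guaranteed; only the header's position at the front of each non-empty partition is.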