
I have an Array[String] called samparr with some values in it, and I want to save it as an output file.

var samparr: Array[String] = new Array[String](4)
samparr +:= print1 + "  BEST_MATCH  " + print2

I'd like to do something like this:

val output = samparr.saveAsTextFile(outputpath)

But samparr isn't an RDD, it's an Array[String].

Vickyster

1 Answer


You can use SparkContext.parallelize to "distribute" your Array onto the Spark cluster (in other words, to turn it into an RDD), and then call saveAsTextFile:

sc.parallelize(samparr).saveAsTextFile(outputpath)

This action partitions the data and sends each partition to one of the executors; each partition is then saved as a separate part file (part-00000, part-00001, ...), so outputpath ends up being a directory rather than a single file.
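If a single part file is preferable for such a small array, one option is to ask parallelize for just one partition. The following is only a sketch under assumptions: the object name SaveArrayWithSpark, the local[*] master, the app name and the temporary output directory are all made up for illustration and are not part of the question.

```scala
import java.nio.file.Files
import org.apache.spark.{SparkConf, SparkContext}

object SaveArrayWithSpark {
  // Hypothetical helper: writes the array through Spark into the directory `outputpath`.
  def save(samparr: Array[String], outputpath: String): Unit = {
    // Assumption: a local master, only so the sketch can run without a cluster.
    val conf = new SparkConf().setAppName("save-array").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // numSlices = 1 creates a single partition, so only one part file is written.
      sc.parallelize(samparr.toSeq, numSlices = 1).saveAsTextFile(outputpath)
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample data in the same shape as the question's lines.
    val samparr = Array("a  BEST_MATCH  b", "c  BEST_MATCH  d")
    // saveAsTextFile fails if the target directory already exists,
    // so the sketch writes into a fresh temporary location.
    val outputpath = Files.createTempDirectory("samparr").resolve("out").toString
    save(samparr, outputpath)
    println(s"Saved to directory: $outputpath")
  }
}
```

The result is still a directory; the data itself is in the part-00000 file inside it.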

Alternatively, since the array is very small and doesn't really "justify" using Spark, you can try any non-Spark method of saving data to file, e.g. the one linked by @avihoo-mamka: How to write to a file in Scala?
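For the non-Spark route, a minimal sketch using only java.io.PrintWriter from the JDK could look like the following. The object name SaveArrayToFile, the helper writeLines and the output file name are assumptions made for illustration; the linked question lists other approaches as well.

```scala
import java.io.PrintWriter

object SaveArrayToFile {
  // Hypothetical helper: writes each element of the array as one line of a plain text file.
  def writeLines(samparr: Array[String], outputpath: String): Unit = {
    val writer = new PrintWriter(outputpath, "UTF-8")
    try {
      samparr.foreach(line => writer.println(line))
    } finally {
      // Closing the writer flushes the buffered lines to disk.
      writer.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample data in the same shape as the question's lines.
    val samparr = Array("a  BEST_MATCH  b", "c  BEST_MATCH  d")
    // Assumed output location; here outputpath is a single file, not a directory.
    writeLines(samparr, "samparr-output.txt")
  }
}
```

Unlike saveAsTextFile, this produces exactly one file at the given path and overwrites it if it already exists.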

Tzach Zohar