With an RDD, I can call rdd.saveAsTextFile('directory'), which saves the output in hdfs://directory. Can the text file be saved directly to a directory on the local filesystem (i.e. directory)?

cshin9
Possible duplicate of [Save a spark RDD to the local file system using Java](http://stackoverflow.com/questions/31239161/save-a-spark-rdd-to-the-local-file-system-using-java) – DNA May 18 '16 at 22:09
1 Answer
Of course you can. Since saveAsTextFile('directory') writes as many files as the RDD has partitions, you first need to merge those files before copying the result to the local filesystem (unless you want to copy each part file individually). Therefore, first call
FileUtil.copyMerge(sourceFileSystem, new Path(sourceFullPath), destFileSystem, new Path(destinationFullPath), true, sparkContext.hadoopConfiguration, null)
and afterwards use
FileSystem fs = FileSystem.get(yourConfiguration)
fs.copyToLocalFile(true, new Path(destinationFullPath), new Path(localFilePath))
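
If the part files are already on a local disk (for example because the output directory was copied down unmerged), the merge step can also be done without Hadoop. The following is only a minimal sketch in plain Python, not the Hadoop API used above. It assumes Spark's usual part-NNNNN file naming; the function name merge_part_files and the paths in the usage line are made up for illustration.

```python
import glob
import os
import shutil


def merge_part_files(src_dir, dest_file):
    """Concatenate the part-* files in src_dir into dest_file.

    Returns the number of part files that were merged.
    """
    # Sorting by name (part-00000, part-00001, ...) keeps partition order.
    # The pattern skips _SUCCESS and hidden .crc checksum files.
    part_paths = sorted(glob.glob(os.path.join(src_dir, "part-*")))
    with open(dest_file, "wb") as out:
        for path in part_paths:
            with open(path, "rb") as part:
                # Stream the bytes across so large parts are not loaded into memory.
                shutil.copyfileobj(part, out)
    return len(part_paths)
```

Usage would look like merge_part_files("/tmp/directory", "/tmp/merged.txt"). Unlike copyMerge with deleteSource set to true, this sketch leaves the source files in place.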

Felix