I use the following code:
csv.saveAsTextFile(pathToResults, classOf[GzipCodec])
The pathToResults directory contains many files like part-0000, part-0001, etc. I can use FileUtil.copyMerge(), but it's really slow: it downloads all the files to the driver program and then uploads them back into Hadoop. Still, FileUtil.copyMerge() is faster than:
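For reference, the copyMerge approach I mean looks roughly like this (a sketch; the merged output path and the single-filesystem setup are illustrative, not from my actual job):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}

val conf = new Configuration()
val fs = FileSystem.get(conf)

// Concatenate every part-* file under pathToResults into one file.
// With GzipCodec parts this happens to produce valid output, because
// a concatenation of gzip streams is itself a valid gzip stream.
FileUtil.copyMerge(
  fs,                               // source filesystem
  new Path(pathToResults),          // directory with part-0000, part-0001, ...
  fs,                               // destination filesystem
  new Path(pathToResults + ".gz"),  // hypothetical merged output path
  false,                            // keep the source directory
  conf,
  null                              // no separator string between parts
)
```

(Requires a running Hadoop filesystem, so it is not runnable standalone; copyMerge was also removed from FileUtil in Hadoop 3.)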
csv.repartition(1).saveAsTextFile(pathToResults, classOf[GzipCodec])
How can I merge the Spark result files without using repartition(1) or FileUtil.copyMerge()?