I was wondering whether I can write from 2 separate Spark processes into 1 HDFS directory. Will there be a file name collision in this case? Files are written in the form 'part-00000-00c0472e-a01e-4ea6-b247-57114107c762.c000.txt'. Is there a chance that 2 separate Spark processes generate identical file names, so that one overwrites the files of the other?
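For illustration, the `part-00000-<uuid>.c000.txt` pattern in the question embeds a random UUID alongside the partition index, which is what keeps names from colliding across jobs. A minimal sketch of that naming scheme (the `make_part_name` helper is hypothetical, not a Spark API):

```python
import uuid

def make_part_name(partition_index: int) -> str:
    # Hypothetical helper mimicking Spark's output file naming:
    # part-<partition index>-<random job UUID>.c000.txt
    return f"part-{partition_index:05d}-{uuid.uuid4()}.c000.txt"

# Two independent "jobs" each writing partition 0 into the same directory:
name_a = make_part_name(0)
name_b = make_part_name(0)

# Because the UUID component is random, a collision between two jobs is
# astronomically unlikely, so neither file overwrites the other.
print(name_a)
print(name_b)
```

Note that distinct file names only address overwrites of the data files themselves; as the linked question below discusses, concurrent jobs can still conflict in other ways (for example via shared temporary directories during the write).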
- See https://stackoverflow.com/questions/38964736/multiple-spark-jobs-appending-parquet-data-to-same-base-path-with-partitioning — looks like there are aspects to consider. – thebluephantom Oct 14 '18 at 15:57
- Yes, indeed. Thanks for pointing this out. – sparker Oct 15 '18 at 05:57