I'm running a job on a two-machine Spark 2.1.0 cluster.
I'm trying to save a DataFrame to CSV (one file or several, it doesn't matter). When I use:
df.write
  .options(options)
  .csv(finalPath)
it successfully saves the data as CSV files, one per partition. On one of my machines it creates the .csv files as part-XXXX files directly inside the directory I specified, which is great. But on the other machine it creates a _temporary/0/ subdirectory inside that directory, and the files there are named task_XXXX, which is less great.
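To make the difference concrete, the two output layouts look roughly like this (file names abbreviated, matching what I see):

Machine A:

  finalPath/
    part-XXXX.csv

Machine B:

  finalPath/
    _temporary/
      0/
        task_XXXX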
Why does that happen? And is there a way to make the second machine behave like the first, without creating the _temporary/0/ subdirectories?
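In case it helps reproduce this, here is a minimal, self-contained version of what I'm running; the app name, option values, path, and sample data below are placeholders, not my real job:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("CsvWriteRepro") // placeholder app name
  .getOrCreate()

// Placeholder data standing in for my real DataFrame
val df = spark.range(0, 100).toDF("id")

// Placeholder options; my real job passes a Map[String, String]
val options = Map("header" -> "true")

// Placeholder output directory
val finalPath = "/data/output/repro"

df.write
  .options(options)
  .csv(finalPath)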
Thanks in advance :)