I feel this should be a really obvious question, but is there any documentation on the 'options' for writing Spark DataFrames?
I'm trying to follow the advice in How do you control the size of the output file?, but I'm not getting what I expect. My Scala code is:
myDataDataFrame.write
  .option("maxRecordsPerFile", calculatedMaxRecordsPerFile)
  .mode(SaveMode.Overwrite)
  .parquet(targetPath)
No matter how I vary calculatedMaxRecordsPerFile,
I always get the same sized files...
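For reference, here is a minimal sketch of how the number of records in each output file could be checked. It is self-contained and rests on several assumptions that are not part of my real job: a local SparkSession, synthetic data from spark.range, Spark 2.2 or later, and a hypothetical scratch path /tmp/max-records-check.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}
import org.apache.spark.sql.functions.input_file_name

object MaxRecordsPerFileCheck {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session, used only for this experiment.
    val spark = SparkSession.builder()
      .appName("maxRecordsPerFile-check")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical scratch location; not the real targetPath.
    val targetPath = "/tmp/max-records-check"

    // Synthetic stand-in for myDataDataFrame: 100,000 rows in a single partition,
    // so one write task would otherwise produce one file.
    val df = spark.range(0L, 100000L).toDF("id").coalesce(1)

    df.write
      .option("maxRecordsPerFile", 10000L) // stand-in for calculatedMaxRecordsPerFile
      .mode(SaveMode.Overwrite)
      .parquet(targetPath)

    // Read the output back and count the rows that came from each physical file.
    spark.read.parquet(targetPath)
      .groupBy(input_file_name().as("file"))
      .count()
      .show(truncate = false)

    spark.stop()
  }
}
```

As far as I understand it, maxRecordsPerFile is an upper bound applied within each write task, so the files would only be expected to change when a task would otherwise write more rows than the limit.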
I suspect there are some other options that I need to set, but I can't find any documentation that describes what all the options are.
And before anybody asks: yes, I have done a Google* search to try to find this information. All of the top-placed results point back to Stack Overflow!
(*other search engines are available)