I feel this should be a really obvious question, but is there any documentation on the 'options' for writing Spark DataFrames?

I'm trying to follow the advice in How do you control the size of the output file?, but I'm not getting what I expect. My Scala code is:

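import org.apache.spark.sql.SaveMode

// Cap the number of records written to any single output file
myDataDataFrame.write
  .option("maxRecordsPerFile", calculatedMaxRecordsPerFile)
  .mode(SaveMode.Overwrite)
  .parquet(targetPath)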

No matter how I vary calculatedMaxRecordsPerFile, I always get the same sized files.

I suspect there are some other options that I need to set, but I can't find any documentation that describes what all the options are.

And before anybody asks: yes, I have done a Google* search to try to find this information. All of the top results point back to Stack Overflow!

(*other search engines are available)

Stormcloud
  • Does this answer your question? [Where is the reference for options for writing or reading per format?](https://stackoverflow.com/questions/44365042/where-is-the-reference-for-options-for-writing-or-reading-per-format) – notNull Jul 06 '20 at 17:16
  • Try setting it on the Spark session itself: spark.conf.set("spark.sql.files.maxRecordsPerFile", calculatedMaxRecordsPerFile). – Gopal Tiwari Jul 07 '20 at 09:03
  • @Shu, sort of. It basically says there is no documentation (which might explain why I can't find any) and to go and figure it out for yourself from the source! It's a bit of a non-answer: life is too short to spend reading and understanding all the source code for every library I use. I'd accept an answer that contains a list of possible option strings without full descriptions. – Stormcloud Jul 07 '20 at 09:13
  • @Gopal Tiwari, thanks for your suggestion. It didn't work, *but* it was enough for me to realise what the underlying problem was: I'm tied to an older version of Spark in which the option isn't supported. Spark appears to accept any option string and silently ignore invalid ones. This just makes me want documentation more! – Stormcloud Jul 08 '20 at 13:34
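The two approaches from this thread (the writer option in the question and the session setting suggested in the comments) can be sketched side by side as below. This is only a minimal sketch under stated assumptions: the local master, the generated `id` DataFrame, the value of `maxRecords`, and the `/tmp/...` paths are all hypothetical stand-ins, and it assumes a Spark version recent enough to recognise the setting (as far as I know it arrived around Spark 2.2; on older versions the option is silently ignored, as described above). Reading the output back and counting rows per file is one way to check whether the cap was actually applied.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}
import org.apache.spark.sql.functions.input_file_name

object MaxRecordsPerFileSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical local session, for illustration only
    val spark = SparkSession.builder()
      .appName("max-records-per-file-sketch")
      .master("local[*]")
      .getOrCreate()

    // Print the version first: older releases accept the option but ignore it
    println(s"Spark version: ${spark.version}")

    // Hypothetical stand-ins for myDataDataFrame, calculatedMaxRecordsPerFile and targetPath
    val df = spark.range(0L, 100000L).toDF("id")
    val maxRecords = 10000L
    val optionPath = "/tmp/max-records-sketch/option"
    val confPath = "/tmp/max-records-sketch/conf"

    // Approach 1: per-write option, as in the question
    df.write
      .option("maxRecordsPerFile", maxRecords)
      .mode(SaveMode.Overwrite)
      .parquet(optionPath)

    // Approach 2: session-wide setting, as suggested in the comments
    spark.conf.set("spark.sql.files.maxRecordsPerFile", maxRecords)
    df.write
      .mode(SaveMode.Overwrite)
      .parquet(confPath)

    // Check the result: count the rows that ended up in each output file
    for (path <- Seq(optionPath, confPath)) {
      spark.read.parquet(path)
        .groupBy(input_file_name().as("file"))
        .count()
        .show(truncate = false)
    }

    spark.stop()
  }
}
```

Note that the setting is an upper bound applied within each write task: it splits files that would otherwise be too large, but it does not merge small ones, so a DataFrame with many small partitions will still produce many small files regardless of the value.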

0 Answers