
I am building a jar that has application.conf under the src/main/resources folder. I am trying to override it when doing spark-submit, but it's not working.

The following is my command:

$spark_submit $spark_params $hbase_params \
    --class com.abc.xyz.MYClass \
    --files application.conf \
    $sandbox_jar flagFile/test.FLG \
    --conf "spark.executor.extraClassPath=-Dconfig.file=application.conf"

application.conf is located in the same directory as my jar file.

Gaurang Shah

2 Answers


-Dconfig.file=path/to/config-file may not work due to an internal cache in ConfigFactory. The documentation suggests calling ConfigFactory.invalidateCaches().

The other way is the following, which merges the supplied properties with the properties already available:

import java.io.File
import com.typesafe.config.{Config, ConfigFactory}

ConfigFactory.invalidateCaches()
// pathToFile: directory containing the external application.conf
val c = ConfigFactory.parseFile(new File(pathToFile, "application.conf"))
val config: Config = c.withFallback(ConfigFactory.load()).resolve()

I think the best way to override properties is to supply them using -D. Typesafe Config gives system properties the highest priority, so -D will override both reference.conf and application.conf.
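Putting that together, a hedged sketch of the submit command (jar variable and arguments copied from the question; note that extraJavaOptions, not extraClassPath, is the Spark key that carries -D flags to the JVMs):

```shell
# Sketch: ship application.conf alongside the job and point Typesafe Config
# at it on both the driver and the executors via -D system properties.
spark-submit \
    --class com.abc.xyz.MYClass \
    --files application.conf \
    --conf "spark.driver.extraJavaOptions=-Dconfig.file=application.conf" \
    --conf "spark.executor.extraJavaOptions=-Dconfig.file=application.conf" \
    $sandbox_jar flagFile/test.FLG
```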

Salim
  • Not able to understand what you are trying to say. I have already provided that option, but it's not working – Gaurang Shah Apr 23 '20 at 21:39
  • I missed your last line of code. Sorry about that. I had a similar issue and I loaded the file and merged it with the Config object. Please see the modified answer. – Salim Apr 23 '20 at 21:53
  • Could you explain how to handle `new File` to read from the parameter passed through Maven? – Gaurang Shah Apr 27 '20 at 18:19
  • @Gaurang, one way is to keep application.conf in the application root (same directory as the jar). Then file path becomes `System.getProperty("user.dir")` – Salim Apr 27 '20 at 20:27
  • I mean, when I pass it as a Maven parameter, how will it read the path here? – Gaurang Shah Apr 28 '20 at 03:09
  • You can pass the path as -D into the Scala application and use my code to read the config file from that path. Here is an example of passing -D from Maven: https://stackoverflow.com/questions/10108374/maven-how-to-run-a-java-file-from-command-line-passing-arguments/10108780 – Salim Apr 28 '20 at 22:59
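Tying the comments above together, a minimal sketch (assuming Typesafe Config is on the classpath; the system-property key `conf.dir` is a made-up name for illustration, not a Typesafe Config convention):

```scala
import java.io.File
import com.typesafe.config.{Config, ConfigFactory}

// Sketch: resolve the directory holding application.conf from a system
// property (-Dconf.dir=/some/path); falls back to the working directory,
// i.e. the directory the jar is launched from.
object ExternalConfig {
  def load(): Config = {
    val dir = sys.props.getOrElse("conf.dir", System.getProperty("user.dir"))
    ConfigFactory.invalidateCaches()
    ConfigFactory
      .parseFile(new File(dir, "application.conf"))
      .withFallback(ConfigFactory.load())
      .resolve()
  }
}
```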

Considering application.conf is a properties file, there is another option that can serve the same purpose.

Packaging the properties file inside the jar limits flexibility. Keeping it separate from the jar means that whenever a property changes you just replace the properties file instead of rebuilding and redeploying the whole jar.

This can be achieved by keeping your properties in a properties file and prefixing each property key with "spark.":

spark.inputpath /input/path
spark.outputpath /output/path

The spark-submit command would then look like:

$spark_submit $spark_params $hbase_params \
    --class com.abc.xyz.MYClass \
    --properties-file application.conf \
    $sandbox_jar flagFile/test.FLG 

Get the properties in code like this:

sc.getConf.get("spark.inputpath")       // /input/path
sc.getConf.get("spark.outputpath")      // /output/path

This may not solve your exact problem, but it offers another approach that can work.

Ajay Kharade