I have a Spark job running on an AWS EMR cluster that needs access to a native library (*.so). Per the Spark documentation (https://spark.apache.org/docs/2.3.0/configuration.html), I need to add the "spark.driver.extraLibraryPath" and "spark.executor.extraLibraryPath" options to the spark-submit command line:
spark-submit \
--class test.Clustering \
--conf spark.executor.extraLibraryPath="/opt/test/lib/native" \
--conf spark.driver.extraLibraryPath="/opt/test/lib/native" \
--master yarn \
--deploy-mode client \
s3-etl-prepare-1.0-SNAPSHOT-jar-with-dependencies.jar "$@"
This works as expected and my native library is loaded. The problem is that during the Spark job I need to run a distributed LZO indexer MR job, which needs the LZO native library, and the LZO code cannot load the native gpl library:
21/06/16 09:49:09 ERROR GPLNativeCodeLoader: Could not load native gpl library
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1860)
at java.lang.Runtime.loadLibrary0(Runtime.java:870)
at java.lang.System.loadLibrary(System.java:1124)
at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:71)
at com.hadoop.compression.lzo.DistributedLzoIndexer.<init>(DistributedLzoIndexer.java:28)
at test.misc.FileHelper.distributIndexLzoFile(FileHelper.scala:260)
at test.scalaapp.Clustering$.main(Clustering.scala:66)
at test.scalaapp.Clustering.main(Clustering.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:853)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:928)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:937)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
It seems that the "spark.driver.extraLibraryPath" option overrides or replaces the whole library path rather than appending to it. How can I keep both the GPL LZO native path and my own library path?
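To make the question concrete, here is a minimal sketch of what I imagine "appending" would look like, assuming the cluster's default value lives in a spark-defaults style file and that passing the option with --conf replaces that default. The helper names (get_spark_default, join_lib_path) and the config file location in the usage comment are my own assumptions, not something taken from the Spark or EMR documentation, and I have not confirmed this is the recommended approach.

```shell
# Hypothetical helper: print the value of a property from a spark-defaults
# style file (lines of the form "key<whitespace>value").
# Prints nothing if the file or the key is missing.
# Assumes the value itself contains no whitespace.
get_spark_default() {
  conf_file="$1"
  key="$2"
  [ -f "$conf_file" ] || return 0
  awk -v k="$key" '$1 == k { print $2; exit }' "$conf_file"
}

# Hypothetical helper: join two library paths with ':' and
# skip whichever one is empty.
join_lib_path() {
  if [ -z "$1" ]; then
    printf '%s\n' "$2"
  elif [ -z "$2" ]; then
    printf '%s\n' "$1"
  else
    printf '%s:%s\n' "$1" "$2"
  fi
}

# Intended usage (the file location is an assumption about the cluster):
#   CONF=/etc/spark/conf/spark-defaults.conf
#   DRIVER_LIBS=$(join_lib_path \
#     "$(get_spark_default "$CONF" spark.driver.extraLibraryPath)" \
#     /opt/test/lib/native)
#   spark-submit --conf spark.driver.extraLibraryPath="$DRIVER_LIBS" ...
```

The same would be repeated for "spark.executor.extraLibraryPath". If the existing default already contains the directory holding the gplcompression library, the combined value should keep it visible alongside /opt/test/lib/native.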