I'm hitting a perplexing issue when I run a Spark application from a deployed jar (built with the Maven Shade plugin) in non-local environments. The job fails with:
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.postgresql.ds.PGSimpleDataSource
    at com.zaxxer.hikari.util.UtilityElf.createInstance(UtilityElf.java:96)
    at com.zaxxer.hikari.pool.PoolBase.initializeDataSource(PoolBase.java:314)
    at com.zaxxer.hikari.pool.PoolBase.<init>(PoolBase.java:108)
    at com.zaxxer.hikari.pool.HikariPool.<init>(HikariPool.java:105)
    at com.zaxxer.hikari.HikariDataSource.<init>(HikariDataSource.java:72)
    at mypackage.SansORMProvider.get(SansORMProvider.java:42)
    at mypackage.MySansORMProvider.get(MySansORMProvider.scala:15)
    at mypackage.MyApp$.main(MyApp.scala:63)
    at mypackage.MyApp.main(MyApp.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:680)
Caused by: java.lang.ClassNotFoundException: org.postgresql.ds.PGSimpleDataSource
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    at com.zaxxer.hikari.util.UtilityElf.createInstance(UtilityElf.java:83)
    ... 13 more
This is perplexing because my pom.xml declares the PostgreSQL driver as a compile-scope dependency:
<dependency>
  <groupId>org.postgresql</groupId>
  <artifactId>postgresql</artifactId>
  <scope>compile</scope>
</dependency>
The Shade plugin configuration does not reference this PostgreSQL dependency, nor does it contain any pattern that would match it:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>shade</goal>
      </goals>
      <phase>package</phase>
      <configuration>
        <artifactSet>
          <excludes combine.children="append">
            <exclude>org.apache.spark:*:*</exclude>
            <exclude>org.apache.hadoop:*:*</exclude>
            <exclude>org.slf4j:*</exclude>
          </excludes>
        </artifactSet>
        <filters>
          <filter>
            <artifact>*:*</artifact>
            <excludes>
              <exclude>META-INF/*.SF</exclude>
              <exclude>META-INF/*.DSA</exclude>
              <exclude>META-INF/*.RSA</exclude>
            </excludes>
          </filter>
        </filters>
        <relocations>
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>${project.groupId}.google.common</shadedPattern>
          </relocation>
          <relocation>
            <pattern>io.netty</pattern>
            <shadedPattern>${project.groupId}.io.netty</shadedPattern>
          </relocation>
          <relocation>
            <pattern>okhttp3</pattern>
            <shadedPattern>${project.groupId}.okhttp3</shadedPattern>
          </relocation>
          <relocation>
            <pattern>com.fasterxml.jackson</pattern>
            <shadedPattern>${project.groupId}.fasterxml.jackson</shadedPattern>
          </relocation>
        </relocations>
        <shadedArtifactAttached>true</shadedArtifactAttached>
      </configuration>
    </execution>
  </executions>
</plugin>
Spark dependencies (as requested):
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.11</artifactId>
  <version>2.4.3</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-mllib_2.11</artifactId>
  <version>2.4.3</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.11</artifactId>
  <version>2.4.3</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-tags_2.11</artifactId>
  <version>2.4.3</version>
  <scope>provided</scope>
</dependency>
In the output of the Maven command that builds the jar, I can see:
[INFO] Including org.postgresql:postgresql:jar:42.2.1 in the shaded jar.
And when I run
jar tvf myShadedJar.jar | grep postgres
I can see the missing class listed.
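As a cross-check on the jar tvf listing, the same lookup could be done programmatically. This is only a minimal sketch, not part of my build: JarClassCheck and its two methods are hypothetical names, and the code relies on nothing beyond java.util.zip.ZipFile from the JDK.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipFile;

// Hypothetical helper (not part of the project): checks a jar for a class entry.
final class JarClassCheck {

    /** Converts a fully qualified class name to its entry path inside a jar. */
    static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if the jar at jarPath has a .class entry for className. */
    static boolean containsClass(Path jarPath, String className) throws IOException {
        try (ZipFile zip = new ZipFile(jarPath.toFile())) {
            // getEntry returns null when no entry with that exact name exists
            return zip.getEntry(entryNameFor(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: JarClassCheck <jar> <fully.qualified.ClassName>");
            return;
        }
        boolean found = containsClass(Paths.get(args[0]), args[1]);
        System.out.println(entryNameFor(args[1]) + (found ? " is present" : " is MISSING"));
    }
}
```

It would be run against the deployed artifact, for example with myShadedJar.jar and org.postgresql.ds.PGSimpleDataSource as the two arguments. Note that it only says whether the entry exists in the archive, not whether the class is visible to the class loader that HikariCP uses at runtime.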
One odd thing that may be relevant: when I extract the jar with jar xf, there's no org/postgresql folder. Yet when I extract the same jar with unzip, the folder is there.
What might be the problem? How do I confirm it? And is it expected that the exploded jar is missing the org/postgresql folder?
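Since the ClassNotFoundException in the trace is raised from sun.misc.Launcher$AppClassLoader, one way to narrow this down might be to log which class loader defined a given class and where it was loaded from. The following is a minimal sketch under that assumption; ClassOrigin and describe are hypothetical names, and only standard JDK calls are used.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper: reports the defining class loader and the
// location (jar or directory) a class was loaded from.
final class ClassOrigin {

    static String describe(Class<?> cls) {
        // A null class loader means the class was defined by the bootstrap loader
        ClassLoader loader = cls.getClassLoader();
        String loaderName = (loader == null) ? "<bootstrap>" : loader.getClass().getName();

        // The code source may be null, for example for core JDK classes
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        String location = (source == null || source.getLocation() == null)
                ? "<unknown>"
                : source.getLocation().toString();

        return cls.getName() + " loaded by " + loaderName + " from " + location;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Each argument is a fully qualified class name to inspect
        for (String name : args) {
            System.out.println(describe(Class.forName(name)));
        }
    }
}
```

Logging ClassOrigin.describe for com.zaxxer.hikari.HikariDataSource and for mypackage.MyApp early in main, on the cluster, would show whether both are served from the shaded jar or whether HikariCP is being picked up from somewhere else on the classpath.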