I'm hitting a runtime problem with a piece of code running on top of Apache Spark. It depends on the AWS SDK for Java to upload files to S3, and the upload fails with a NoSuchMethodError. It's worth noting that I'm deploying an uber jar with the Spark dependency bundled in. The error when running my code:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.http.impl.conn.DefaultClientConnectionOperator.<init>(Lorg/apache/http/conn/scheme/SchemeRegistry;Lorg/apache/http/conn/DnsResolver;)V
at org.apache.http.impl.conn.PoolingClientConnectionManager.createConnectionOperator(PoolingClientConnectionManager.java:140)
at org.apache.http.impl.conn.PoolingClientConnectionManager.<init>(PoolingClientConnectionManager.java:114)
at org.apache.http.impl.conn.PoolingClientConnectionManager.<init>(PoolingClientConnectionManager.java:99)
at com.amazonaws.http.ConnectionManagerFactory.createPoolingClientConnManager(ConnectionManagerFactory.java:29)
at com.amazonaws.http.HttpClientFactory.createHttpClient(HttpClientFactory.java:97)
at com.amazonaws.http.AmazonHttpClient.<init>(AmazonHttpClient.java:165)
at com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:119)
at com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:103)
at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:357)
at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:339)
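For context, the upload path is essentially the following (identifiers and credentials are simplified placeholders); the failing call is the client construction itself, which matches the bottom of the stack trace:

import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3Client;
import java.io.File;

public class S3Upload {
    public static void main(String[] args) {
        // Constructing the client is what triggers the error: AmazonS3Client builds an
        // AmazonHttpClient, which creates a PoolingClientConnectionManager, which in turn
        // calls DefaultClientConnectionOperator.<init>(SchemeRegistry, DnsResolver).
        AmazonS3Client s3 = new AmazonS3Client(
                new BasicAWSCredentials("ACCESS_KEY", "SECRET_KEY"));
        s3.putObject("some-bucket", "some-key", new File("/tmp/somefile"));
    }
}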
However, when I inspect the uber jar with javap, the method signature is clearly listed:
vagrant@mesos:~/installs/spark-1.0.1-bin-hadoop2$ javap -classpath /tmp/rickshaw-spark-0.0.1-SNAPSHOT.jar org.apache.http.impl.conn.DefaultClientConnectionOperator
Compiled from "DefaultClientConnectionOperator.java"
public class org.apache.http.impl.conn.DefaultClientConnectionOperator implements org.apache.http.conn.ClientConnectionOperator {
  protected final org.apache.http.conn.scheme.SchemeRegistry schemeRegistry;
  protected final org.apache.http.conn.DnsResolver dnsResolver;
  public org.apache.http.impl.conn.DefaultClientConnectionOperator(org.apache.http.conn.scheme.SchemeRegistry);
  public org.apache.http.impl.conn.DefaultClientConnectionOperator(org.apache.http.conn.scheme.SchemeRegistry, org.apache.http.conn.DnsResolver); <-- it exists!
  public org.apache.http.conn.OperatedClientConnection createConnection();
  public void openConnection(org.apache.http.conn.OperatedClientConnection, org.apache.http.HttpHost, java.net.InetAddress, org.apache.http.protocol.HttpContext, org.apache.http.params.HttpParams) throws java.io.IOException;
  public void updateSecureConnection(org.apache.http.conn.OperatedClientConnection, org.apache.http.HttpHost, org.apache.http.protocol.HttpContext, org.apache.http.params.HttpParams) throws java.io.IOException;
  protected void prepareSocket(java.net.Socket, org.apache.http.protocol.HttpContext, org.apache.http.params.HttpParams) throws java.io.IOException;
  protected java.net.InetAddress[] resolveHostname(java.lang.String) throws java.net.UnknownHostException;
}
I checked some of the other jars in the Spark distribution, and they don't seem to have this particular method signature (as far as I can tell, the SchemeRegistry/DnsResolver constructor overload was only added in HttpClient 4.2). So I'm left wondering what the Spark runtime is picking up that causes this issue. The jar is built from a Maven project where I lined up the dependencies to ensure the correct AWS Java SDK version was being pulled in.
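To narrow down which jar the class is actually resolved from at runtime, a probe along these lines should help (the class name is just for illustration; this is plain JVM introspection, nothing Spark-specific):

public class WhichJar {
    public static void main(String[] args) throws ClassNotFoundException {
        // Print the jar that the JVM actually resolved the class from. If this points
        // at a jar from the Spark distribution rather than my uber jar, then an older
        // HttpClient on Spark's classpath is shadowing the version I bundled.
        Class<?> clazz = Class.forName(
                "org.apache.http.impl.conn.DefaultClientConnectionOperator");
        System.out.println(clazz.getProtectionDomain().getCodeSource().getLocation());
    }
}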