I am trying to upgrade from Spark 2.4 to Spark 3.2.
import org.apache.http.client.utils.URIBuilder works on Spark 3.1 but fails on Spark 3.2 with the error "object client is not a member of package org.apache.http".

Dmytro Mitin
Surya
  • Maybe HTTPS instead of HTTP? I am not a field expert though. – user16217248 Apr 05 '23 at 05:20
  • You need to tell us more about your setup: when do you get this error? At compilation, right? What's your build definition? – Gaël J Apr 05 '23 at 05:27
  • Wild guess: you're not explicitly defining the version of Apache HTTP client you want to use (which may make sense if it's provided by Spark), so the version of HTTP client changed between the two Spark versions, and the imports changed with it. Check which version of Apache HTTP client is being provided and adapt your code to it. – Gaël J Apr 05 '23 at 05:29
  • Managed to reproduce https://scastie.scala-lang.org/DmytroMitin/PukZcEzyQzqYlTbbMBo5kw/1 https://scastie.scala-lang.org/DmytroMitin/PukZcEzyQzqYlTbbMBo5kw/2 – Dmytro Mitin Apr 05 '23 at 06:29

1 Answer

The class org.apache.http.client.utils.URIBuilder comes from the artifact "org.apache.httpcomponents" % "httpclient":

https://mvnrepository.com/artifact/org.apache.httpcomponents/httpclient

You should add

libraryDependencies += "org.apache.httpcomponents" % "httpclient" % "4.5.14"

to build.sbt. Then import org.apache.http.client.utils.URIBuilder will compile.

(In the 5th version https://mvnrepository.com/artifact/org.apache.httpcomponents.client5/httpclient5 this class is absent.)
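For context, here is a minimal build.sbt sketch showing where that line would sit. The project name, the Scala version, and the "provided" scope on Spark are assumptions about a typical setup, not details taken from the question; only the httpclient coordinates come from the answer above.

```scala
// build.sbt (sketch; everything except the httpclient line is an assumption)
name := "spark-upgrade-example" // hypothetical project name
scalaVersion := "2.12.15"       // assumed; use the Scala version your Spark build targets

libraryDependencies ++= Seq(
  // Assumed setup: Spark is supplied by the cluster, hence the "provided" scope
  "org.apache.spark" %% "spark-core" % "3.2.0" % "provided",
  // Declared explicitly so the class is on the compile classpath
  // regardless of which transitive dependencies Spark exposes
  "org.apache.httpcomponents" % "httpclient" % "4.5.14"
)
```

Declaring the library directly means the build no longer relies on Spark's transitive dependencies, which can change between Spark versions.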


Here is an explanation of why this happens.

In Spark 2.4.8, "org.apache.httpcomponents" % "httpclient" is declared in the root pom.xml (<dependencyManagement>) and is not overridden in, for example, the spark-core pom.xml:

<!--  org.apache.httpcomponents/httpclient-->
<commons.httpclient.version>4.5.6</commons.httpclient.version>
...

<dependencyManagement>
  <dependencies>
    ...
    <dependency>
      <groupId>org.apache.httpcomponents</groupId>
      <artifactId>httpclient</artifactId>
      <version>${commons.httpclient.version}</version>
    </dependency>

https://github.com/apache/spark/blob/v2.4.8/pom.xml#L499-L503

https://github.com/apache/spark/blob/v2.4.8/core/pom.xml

In Spark 3.2.0, this dependency is still declared in the root pom.xml (<dependencyManagement>), but it is overridden in, for example, the spark-core pom.xml (<dependencies>), where its scope is set to test:

<!--  org.apache.httpcomponents/httpclient-->
<commons.httpclient.version>4.5.13</commons.httpclient.version>
...

<dependencyManagement>
  <dependencies>
    ...
    <dependency>
      <groupId>org.apache.httpcomponents</groupId>
      <artifactId>httpclient</artifactId>
      <version>${commons.httpclient.version}</version>
    </dependency>
...

<!-- core/pom.xml -->
<dependencies>
  <!-- at least just for tests, coerce SBT to use the updated httpcore/client version -->
  <dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <scope>test</scope>
  </dependency>

https://github.com/apache/spark/blob/v3.2.0/pom.xml#L622-L626

https://github.com/apache/spark/blob/v3.2.0/core/pom.xml#L370-L375

So if you need this dependency outside the test scope, you should add it explicitly to your own build.
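To see whether the class is actually visible at runtime, and which jar it comes from, a small check along these lines may help. This is only a sketch: ClasspathCheck and locate are hypothetical names introduced for illustration, and the check covers the runtime classpath, not the compile-time error from the question.

```scala
import scala.util.Try

// Hypothetical helper, not part of Spark or HttpClient.
object ClasspathCheck {
  // Returns Some(location) if the class can be loaded, None if it is not on the classpath.
  def locate(className: String): Option[String] =
    // initialize = false: only look the class up, do not run its static initializers
    Try(Class.forName(className, false, getClass.getClassLoader)).toOption.map { cls =>
      val location = for {
        domain <- Option(cls.getProtectionDomain)
        source <- Option(domain.getCodeSource)
        url    <- Option(source.getLocation)
      } yield url.toString
      // JDK bootstrap classes typically report no code source
      location.getOrElse("<no code source reported>")
    }

  def main(args: Array[String]): Unit = {
    val className = "org.apache.http.client.utils.URIBuilder"
    locate(className) match {
      case Some(where) => println(s"$className loaded from $where")
      case None        => println(s"$className is not on the classpath")
    }
  }
}
```

Running it inside the same environment as the Spark job shows whether the jar that supplies the class is the one declared in the build or one that the Spark distribution puts on the classpath.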

Differences between dependencyManagement and dependencies in Maven

Dmytro Mitin