
I have PySpark code deployed on Azure Databricks that reads and writes data to and from Cosmos DB. In previous months I used: Databricks runtime version 6.4 (Extended Support) with Scala 2.11 and Spark 2.4.5; Azure Cosmos DB Spark connector: azure-cosmosdb-spark_2.4.0_2.11-3.7.0-uber.jar. The program ran fine without any errors.

But when I upgraded to: Databricks runtime version 10.4 LTS with Scala 2.12 and Spark 3.2.1; Azure Cosmos DB Spark connector: azure-cosmos-spark_3-2_2-12-4.10.0.jar

it throws the following error:

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<command-2873073042801780> in <module>
     14 #Reading data from cosmosdb
     15 #try:
---> 16 input_data_cosmosdb = spark.read.format("com.microsoft.azure.cosmosdb.spark").options(**ReadConfig_input_cosmos).load()
     17 print("Cosmos DB columns ",input_data_cosmosdb.columns)
     18 #except:

/databricks/spark/python/pyspark/sql/readwriter.py in load(self, path, format, schema, **options)
    162             return self._df(self._jreader.load(self._spark._sc._jvm.PythonUtils.toSeq(path)))
    163         else:
--> 164             return self._df(self._jreader.load())
    165 
    166     def json(self, path, schema=None, primitivesAsString=None, prefersDecimal=None,

Py4JJavaError: An error occurred while calling o663.load.
: java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
    at com.microsoft.azure.cosmosdb.spark.config.Config$.getOptionsFromConf(Config.scala:281)
    at com.microsoft.azure.cosmosdb.spark.config.Config$.apply(Config.scala:229)
    at com.microsoft.azure.cosmosdb.spark.DefaultSource.createRelation(DefaultSource.scala:55)
    at com.microsoft.azure.cosmosdb.spark.DefaultSource.createRelation(DefaultSource.scala:40)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:385)
    at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:356)
    at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:323)
    at scala.Option.getOrElse(Option.scala:189)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:323)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:222)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
    at py4j.Gateway.invoke(Gateway.java:295)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:251)
    at java.lang.Thread.run(Thread.java:748)

I have tried all the solutions available on the internet. I know this error usually appears when Scala 2.11 libraries are used in a Scala 2.12 project, but I am using a Scala 2.12 runtime and the Scala 2.12 Spark connector itself. Yet I still get the error.
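One detail worth noting: the stack trace still resolves `com.microsoft.azure.cosmosdb.spark.DefaultSource`, which is the old Spark 2.4 connector's class, while the new azure-cosmos-spark (Spark 3) connector registers a different short name and option keys. Below is a minimal sketch of what the migrated read would look like, assuming the new connector jar is the only one attached; the endpoint, key, database, and container names are placeholders, not values from my setup:

```python
# Sketch: option map for the Spark 3 Cosmos DB connector (azure-cosmos-spark).
# The old connector used keys like 'Endpoint', 'Masterkey', 'Database',
# 'Collection'; the new one uses the 'spark.cosmos.*' namespace.

def cosmos_read_config(endpoint, key, database, container):
    """Build the options dict the Spark 3 connector expects."""
    return {
        "spark.cosmos.accountEndpoint": endpoint,
        "spark.cosmos.accountKey": key,
        "spark.cosmos.database": database,
        "spark.cosmos.container": container,
    }

# On the cluster the read would then use the new format name
# ("cosmos.oltp"), not "com.microsoft.azure.cosmosdb.spark":
#
#   df = (spark.read
#         .format("cosmos.oltp")
#         .options(**cosmos_read_config(endpoint, key, "mydb", "mycoll"))
#         .load())

cfg = cosmos_read_config(
    "https://myaccount.documents.azure.com:443/",  # placeholder endpoint
    "<account-key>", "mydb", "mycoll")
print(sorted(cfg))
```

If the old `format("com.microsoft.azure.cosmosdb.spark")` string is still in the code, Spark will keep loading the old connector class wherever it can find it, which is consistent with the trace above.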

Please let me know if anyone is using the same environment. I use Databricks on Azure and am looking for a solution.

Any answers related to Databricks are really appreciated.

  • what other libraries are attached to the cluster? it looks like you still have library that uses Scala 2.11 – Alex Ott Jun 03 '22 at 06:15
  • other ones are azure_storage_blob-1.5.0-py2.py3-none-any.whl, azure-cosmos, xlrd, mysql-connector-python – Ashika Jun 03 '22 at 13:05
  • I think the cosmos connector was compiled with Scala 2.11 (https://mvnrepository.com/artifact/com.microsoft.azure/azure-cosmosdb-spark_2.4.0_2.11), so it gives that error. If you find a way to use runtime 10.4 LTS with Scala 2.11, I think it will work. For now, only runtime 6.4, which uses Scala 2.11, can use that cosmos connector. – Luis Tiago Flores Cristóvão Jul 20 '22 at 13:31
  • https://stackoverflow.com/questions/75947449/run-a-scala-code-jar-appear-nosuchmethoderrorscala-predef-refarrayops – Dmytro Mitin Apr 07 '23 at 05:07
