2

I have been using Azure Databricks with LTS 7.3 and Spark 3.0 (PySpark) with the com.microsoft.azure.kusto:kusto-spark_3.0_2.12:2.9.1 connector for quite some time now, but recently my jobs have been failing with the error below (randomly; sometimes they run and other times they simply fail):

df = pyKusto.read                                                        \
           .format("com.microsoft.kusto.spark.datasource")               \
           .option("kustoCluster", kustoOptions["kustoCluster"])          \
           .option("kustoDatabase", kustoOptions["kustoDatabase"])         \
           .option("kustoQuery", Query)                                    \
           .option("kustoAadAppId", kustoOptions["kustoAadAppId"])           \
           .option("kustoAadAppSecret", kustoOptions["kustoAadAppSecret"])    \
           .option("kustoAadAuthorityID", kustoOptions["kustoAadAuthorityID"]) \
           .load()
java.lang.ClassNotFoundException: Failed to find data source: com.microsoft.kusto.spark.datasource. Please find packages at http://spark.apache.org/third-party-projects.html

I have already installed the library on the cluster, and it had been running for some time without issues, so I am not sure what has happened to it recently. Has anyone seen this issue? Please suggest a workaround.

Thanks

Bha123
  • 53
  • 5
  • 2 general comments: **(1)** There is reason to create a new spark context (`pyKusto`) when you're working with Databricks and there's already a pre-created one (`spark`). **(2)** wrap a multi-line expression with brackets, and you won't need all those new line symbols: `df = (pyKusto.read. ... .load())`. – David דודו Markovitz Jun 25 '22 at 10:34
  • 1
    I think @DavidדודוMarkovitz meant "There is _no_ reason to create a new spark context ..." – Vladik Branevich Jun 26 '22 at 06:26
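The bracket-wrapping tip from the first comment applies to any fluent call chain in Python. Below is a minimal runnable sketch using a hypothetical `FakeReader` stand-in (not part of PySpark) so it works without a cluster; on Databricks you would apply the same style to the pre-created `spark` session, e.g. `df = (spark.read.format(...) ... .load())`.

```python
# Hypothetical stand-in for spark.read so the chaining style can be shown
# without a Databricks cluster; the method names mirror DataFrameReader.
class FakeReader:
    def __init__(self):
        self._format = None
        self._options = {}

    def format(self, source):
        self._format = source
        return self  # returning self is what enables method chaining

    def option(self, key, value):
        self._options[key] = value
        return self

    def load(self):
        return self._format, self._options

# Parenthesized chain: no trailing backslashes needed.
fmt, opts = (
    FakeReader()
        .format("com.microsoft.kusto.spark.datasource")
        .option("kustoCluster", "mycluster")
        .option("kustoDatabase", "mydb")
        .load()
)
```

Because the whole expression sits inside parentheses, lines can be reordered, commented out, or reindented freely, which is exactly why this style is preferred over backslash continuations for long option chains.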

1 Answer

0

In Databricks, try upgrading the kusto-spark library from kusto-spark_3.0_2.12:2.9.1 to kusto-spark_3.0_2.12:3.0.0:

Libraries -> Install New -> Maven -> copy the following coordinates:

com.microsoft.azure.kusto:kusto-spark_3.0_2.12:3.0.0

If it still does not work, you can open a new issue on the connector's GitHub repository.

Reference: https://github.com/Azure/azure-kusto-spark#Linking

Abhishek K
  • 3,047
  • 1
  • 6
  • 19
  • 1
    Even after upgrading the library, the jobs are still failing. I will have to raise a ticket – Bha123 Jun 27 '22 at 14:11