
I am trying to import JSON documents stored on Azure Data Lake Gen2 into a SQL Server database using the code below, but I run into the following error. When I read data from SQL Server, the JDBC connection works.

Error Message: The driver could not open a JDBC connection.

Code:

val df = spark.read.format("json").load("wasbs://<file_system>@<storage-account-name>.blob.core.windows.net/empDir/data")
val blobStorage = "<blob-storage-account-name>.blob.core.windows.net"
val blobContainer = "<blob-container-name>"
val blobAccessKey =  "<access-key>"
val empDir = "wasbs://" + blobContainer + "@" + blobStorage +"/empDir"
val acntInfo = "fs.azure.account.key."+ blobStorage
sc.hadoopConfiguration.set(acntInfo, blobAccessKey)
val dwDatabase = "<database-name>"
val dwServer = "<database-server-name>"
val dwUser = "<user-name>"
val dwPass = "<password>"
val dwJdbcPort =  "1433"
val sqlDwUrl = "jdbc:sqlserver://" + dwServer + ":" + dwJdbcPort + ";database=" + dwDatabase + ";user=" + dwUser+";password=" + dwPass
spark.conf.set("spark.sql.parquet.writeLegacyFormat","true")
df.write
  .format("com.microsoft.sqlserver.jdbc.SQLServerDriver")
  .option("url", sqlDwUrl)
  .option("dbtable", "Employee")
  .option("forward_spark_azure_storage_credentials", "True")
  .option("tempdir", empDir)
  .mode("overwrite")
  .save()

Also, how do I insert all the JSON documents from the empDir directory into the Employee table?

ITHelpGuy
  • format is jdbc and the driver class is what you set. – Lamanus Aug 22 '20 at 02:29
  • Does this answer your question? [Working with jdbc jar in pyspark](https://stackoverflow.com/questions/36326066/working-with-jdbc-jar-in-pyspark) – Lamanus Aug 22 '20 at 02:29
  • Hi @Lamanus, I get the error **org.apache.spark.sql.AnalysisException: Table or view not found: dbo.Employee** even though the table exists. **Code:** `df.write.format("jdbc").option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver").option("url", "jdbc:sqlserver://dburl:1433;database=db;user=usr;password=pwd").insertInto("dbo.Employee")` – ITHelpGuy Aug 24 '20 at 18:14

1 Answer


You will receive the error message org.apache.spark.sql.AnalysisException: Table or view not found: dbo.Employee when there is no table or view with the name you are referring to. Make sure the code is pointing to the correct database [the Azure Databricks database (internal) or the Azure SQL database (external)].

You may check out the query addressed on the Microsoft Q&A - Azure Databricks forum.

Writing data to Azure Databricks Database:

To successfully insert data into the default database, make sure to create a table or view first, as in the sketch below.

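A minimal sketch, assuming the empDir path from the question (the table name employee is illustrative):

// read every JSON document under empDir into a single dataframe
val df = spark.read.format("json").load(empDir)
// saveAsTable creates the table in the default database if it does not already exist
df.write.mode("overwrite").saveAsTable("employee")

Since load on a directory picks up every file in it, this also covers inserting all the JSON documents from empDir into one table.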

Check out the dataframe written to the default database:

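For instance, assuming the employee table created above:

spark.table("default.employee").show()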

Writing data to Azure SQL database:

Here is an example of how to write data from a dataframe to an Azure SQL database.

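A minimal sketch, reusing df and sqlDwUrl from the question. Note that the format is "jdbc" and the SQL Server driver class is passed as an option, as pointed out in the comments; the forward_spark_azure_storage_credentials and tempdir options belong to the Azure Synapse connector and are not needed for a plain JDBC write:

df.write
  .format("jdbc")
  .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
  .option("url", sqlDwUrl)
  .option("dbtable", "dbo.Employee")
  .mode("overwrite")
  .save()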

Check out the dataframe written to the Azure SQL database:

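To verify, you can read the table back over the same connection (again assuming sqlDwUrl and the dbo.Employee table from above):

val check = spark.read
  .format("jdbc")
  .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
  .option("url", sqlDwUrl)
  .option("dbtable", "dbo.Employee")
  .load()
check.show()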

CHEEKATLAPRADEEP