
I am writing a DataFrame into a Teradata table and getting the error below. I am able to read the data; the error only occurs when writing. I don't understand why the same driver does not work for writing the DataFrame.

df=spark.sql("""select * from tbl_name""")
config_dict = {
    "JDBCURL": "jdbc:teradata//${jdbcHostname}/database=${jdbcDatabase}",
    "JDBCDriver": "com.teradata.jdbc.TeraDriver",
    "DB_User": "dbc",
    "DB_PWD": "dbc",
}
jdbcUrl=config_dict["JDBCURL"]
jdbcDriver =config_dict["JDBCDriver"]
user=config_dict["DB_User"]
password=config_dict["DB_PWD"]
df=spark.sql("""select * from testing""")
df.write.format("jdbc") \
      .mode("overwrite") \
      .option("url", jdbcUrl) \
      .option("dbtable","database.tbl_name") \
      .option("user", user) \
      .option("password",password) \
      .save()

Error: java.sql.SQLException: No suitable driver

Can someone help me with this?

I tried connecting to Teradata manually and was able to connect with the same credentials.


2 Answers


You need to install the Teradata JDBC driver before running the code. You can do this by attaching the driver as a library to your cluster:

  • using its Maven coordinates (see the sketch after this list), or
  • if the jar is already downloaded, putting the file on DBFS and referring to the library by its DBFS path

See docs for instructions on working with libraries.
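
On plain Spark (or instead of the cluster-library UI), a minimal sketch of pulling the driver at session start via spark.jars.packages is shown below. The Maven coordinates and version (com.teradata.jdbc:terajdbc:17.20.00.15) are illustrative assumptions; check Maven Central for the current artifact before using them.

from pyspark.sql import SparkSession

# Download the Teradata JDBC driver from Maven when the session starts.
# NOTE: the coordinates/version below are an example; verify the current
# artifact name and version on Maven Central.
spark = (
    SparkSession.builder
    .appName("Teradata connect")
    .config("spark.jars.packages", "com.teradata.jdbc:terajdbc:17.20.00.15")
    .getOrCreate()
)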

Alex Ott

You need to install the driver as pointed out in the other answer, and then specify the driver class name as an option: driver="com.teradata.jdbc.TeraDriver", as below.

from pyspark.sql import SparkSession

# Build (or reuse) the Spark session
spark = (SparkSession.builder
         .appName("Teradata connect")
         .getOrCreate())

# Read from Teradata; the explicit "driver" option tells Spark which
# JDBC driver class to load
df = (spark.read
      .format("jdbc")
      .options(url="jdbc:teradata://xy/",
               driver="com.teradata.jdbc.TeraDriver",
               dbtable="dbname.tablename",
               user="user1",
               password="***")
      .load())
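
Applied to the write call from the question, a minimal sketch (reusing the question's jdbcUrl, user, password and df variables) would look like this:

# Sketch of the write from the question with the "driver" option added.
# jdbcUrl, user and password come from the question's config_dict; the
# URL must be a valid Teradata JDBC URL such as
# jdbc:teradata://<host>/DATABASE=<db>.
df.write.format("jdbc") \
    .mode("overwrite") \
    .option("url", jdbcUrl) \
    .option("driver", "com.teradata.jdbc.TeraDriver") \
    .option("dbtable", "database.tbl_name") \
    .option("user", user) \
    .option("password", password) \
    .save()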

For driver installation see:

Kashyap
  • You will probably also need STRICT_NAMES=OFF option since pyspark passes its own options to the driver. – Fred Apr 24 '23 at 15:32
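
To illustrate that comment: STRICT_NAMES is a Teradata JDBC connection parameter, and such parameters are usually appended to the JDBC URL, comma-separated after the database specification. A minimal sketch, with host and database as placeholders:

# Sketch only: <host> and <db> are placeholders. Teradata JDBC
# connection parameters are appended to the URL, comma-separated,
# so STRICT_NAMES=OFF goes directly into the JDBC URL.
jdbcUrl = "jdbc:teradata://<host>/DATABASE=<db>,STRICT_NAMES=OFF"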