
I am trying to load data from a relational database into Spark, using Java, with the following code:

Dataset<Row> jdbcDF = spark.read()
    .format("jdbc")
    .option("url", "jdbc:postgresql:dbserver")
    .option("dbtable", "schema.tablename")
    .option("user", "username")
    .option("password", "password")
    .load();

As mentioned in the official Spark documentation (https://spark.apache.org/docs/latest/sql-programming-guide.html), anything that is valid in a FROM clause of a SQL query can be used instead of schema.tablename. For example, instead of a full table you could also use a subquery in parentheses.

I tried passing a simple select statement, 'select * from mydatabase.dbo.table1', instead of the table name 'mydatabase.dbo.table1':

Dataset<Row> jdbcDF = spark.read()
    .format("jdbc")
    .option("url", "jdbc:postgresql:dbserver")
    .option("dbtable", "select * from mydatabase.dbo.table1")
    .option("user", "username")
    .option("password", "password")
    .load();

but I get the following error:

Exception in thread "main" com.microsoft.sqlserver.jdbc.SQLServerException: Incorrect syntax near the keyword 'select'.

I then tried wrapping the select statement in an extra pair of parentheses, but this time I get the following error:

Exception in thread "main" com.microsoft.sqlserver.jdbc.SQLServerException: Incorrect syntax near the keyword 'WHERE'.

Does anyone know how to get the result of a query in Spark with Java?

Thanks a lot.
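A possible explanation, offered as a sketch rather than a confirmed answer: as far as I understand, Spark's JDBC source places the dbtable value directly after FROM in the statements it generates, including a schema probe of the form SELECT * FROM <dbtable> WHERE 1=0. A bare select would then land after FROM (hence the error near 'select'), and a parenthesized subquery without an alias would be followed immediately by WHERE, which SQL Server rejects because it requires an alias on a derived table. The class and method names below (DbTableSubquery, asDerivedTable, schemaProbe) and the alias t are hypothetical, and the probe string is an assumption about the query shape, not Spark's actual code path.

```java
public class DbTableSubquery {

    // Hypothetical helper: wraps a SELECT in parentheses and adds an alias,
    // so the result can legally follow FROM as a derived table.
    static String asDerivedTable(String query, String alias) {
        return "(" + query.trim() + ") AS " + alias;
    }

    // Assumption: mimics the shape of the schema probe that Spark's JDBC
    // source is understood to issue by default (SELECT * FROM <dbtable> WHERE 1=0).
    static String schemaProbe(String dbtable) {
        return "SELECT * FROM " + dbtable + " WHERE 1=0";
    }

    public static void main(String[] args) {
        String query = "select * from mydatabase.dbo.table1";

        // Value intended for .option("dbtable", ...)
        String dbtable = asDerivedTable(query, "t");
        System.out.println(dbtable);

        // The statement the database would see under the assumption above
        System.out.println(schemaProbe(dbtable));
    }
}
```

Under these assumptions, the reader call from the question would change only in one line: .option("dbtable", "(select * from mydatabase.dbo.table1) AS t"). The alias is what separates the closing parenthesis from the generated WHERE clause.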

thebluephantom
user3569267
