I am trying to load data from a relational database into Spark, using Java, with the following code:
Dataset<Row> jdbcDF = spark.read()
    .format("jdbc")
    .option("url", "jdbc:postgresql:dbserver")
    .option("dbtable", "schema.tablename")
    .option("user", "username")
    .option("password", "password")
    .load();
According to the official Spark documentation (https://spark.apache.org/docs/latest/sql-programming-guide.html), anything that is valid in a FROM clause of a SQL query can be used instead of schema.tablename. For example, instead of a full table you could also use a subquery in parentheses.
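As far as I can tell, "a subquery in parentheses" would mean a string like the one built below. This is only a sketch under two assumptions: that Spark splices the dbtable value into a statement of roughly the form SELECT * FROM <dbtable> WHERE 1=0 when it resolves the schema, and that the database expects an alias after a parenthesized subquery. The names DbtableSubquery, asDbtable and schemaProbe are made up for illustration and are not part of Spark.

```java
// Sketch only: DbtableSubquery, asDbtable and schemaProbe are hypothetical
// helper names, not part of the Spark API.
final class DbtableSubquery {

    // Wrap a plain SELECT so it can stand where a table name is expected:
    // parentheses around the query, followed by an alias.
    static String asDbtable(String query, String alias) {
        String q = query.trim();
        if (q.endsWith(";")) {
            // A trailing semicolon is not valid inside a subquery, so drop it.
            q = q.substring(0, q.length() - 1).trim();
        }
        return "(" + q + ") AS " + alias;
    }

    // Assumption: the statement Spark issues to resolve the schema has
    // roughly this shape, with the dbtable value spliced in after FROM.
    static String schemaProbe(String dbtable) {
        return "SELECT * FROM " + dbtable + " WHERE 1=0";
    }

    public static void main(String[] args) {
        // The alias "t" is arbitrary.
        String dbtable = asDbtable("select * from mydatabase.dbo.table1", "t");
        System.out.println(dbtable);
        System.out.println(schemaProbe(dbtable));
        // The string would then be passed as .option("dbtable", dbtable).
    }
}
```

If the first assumption holds, a bare select statement would end up directly after FROM, and a parenthesized one without an alias would be followed immediately by WHERE.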
I tried passing a simple select statement, 'select * from mydatabase.dbo.table1', instead of the table name 'mydatabase.dbo.table1':
Dataset<Row> jdbcDF = spark.read()
    .format("jdbc")
    .option("url", "jdbc:postgresql:dbserver")
    .option("dbtable", "select * from mydatabase.dbo.table1")
    .option("user", "username")
    .option("password", "password")
    .load();
but I get the following error:
Exception in thread "main" com.microsoft.sqlserver.jdbc.SQLServerException: Incorrect syntax near the keyword 'select'.
I then tried wrapping the select statement in an extra pair of parentheses, but this time I get the following error:
Exception in thread "main" com.microsoft.sqlserver.jdbc.SQLServerException: Incorrect syntax near the keyword 'WHERE'.
Does anyone know how to get the result of a query in Spark with Java?
Thanks a lot.