I am trying to read the data from the MSSQL database using Spark jdbc with a specified offset. So the data should be loaded only after the specified timestamp which would be this offset. I tried to implement it by providing a query in jdbc configuraions, however, I did not found a possibility to create a prepared statement with parameterized values. In this case, I want to parameterize an offset which would change after each application launch. How can I implement it using jdbc options?
All the database configs reside in the application.conf file. This is the way I read from the database:
def jdbcOptions(query: String) = Map[String,String](
"driver" -> config.getString("sqlserver.db.driver"),
"url" -> config.getString("sqlserver.db.url"),
"dbtable" -> s"(select * from TestAllData where update_database_time >= '2019-03-19 12:30:00.003') as subq,
"user" -> config.getString("sqlserver.db.user"),
"password" -> config.getString("sqlserver.db.password"),
"customSchema" -> config.getString("sqlserver.db.custom_schema")
)
val testDataDF = sparkSession
.read
.format("jdbc")
.options(jdbcOptions())
.load()
Instead a query should look almost like this:
s"(select * from TestAllData where update_database_time >= $tmstp) as subq