How can I set up my Spark JDBC options so that a filter predicate is pushed down to the database, rather than everything being loaded first? I'm using Spark 2.1 and can't find the right syntax. I know I can add a where clause after the load(), but that would obviously load everything first.
I'm trying the code below. The same filter takes a couple of seconds when I run it in my DB client, but from Spark JDBC it returns nothing and just keeps running.
val jdbcReadOpts = Map(
  "url" -> url,
  "driver" -> driver,
  "user" -> user,
  "password" -> pass,
  "dbtable" -> tblQuery,
  "inferSchema" -> "true")
val predicate = "DATE(TS_COLUMN) = '2018-01-01'"
// Also tried -> val predicate = "SIMPLEDATECOL = '2018-01-01'"
val df = spark.read.format("jdbc")
  .options(jdbcReadOpts)
  .load()
  .where(predicate)
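
For comparison, one variant that may be worth trying is to put the filter inside the dbtable option itself. The JDBC source accepts a parenthesised subquery with an alias in place of a table name, so the database evaluates the WHERE clause. This is only a minimal sketch: the helper withPushedFilter, the table name MY_TABLE, and the placeholder option values are hypothetical and do not come from the code above.

```scala
// Hypothetical helper: replaces "dbtable" with an aliased subquery that
// already contains the filter, so the database applies it before any rows
// are sent to Spark. The predicate is interpolated verbatim, so it must
// come from a trusted source.
def withPushedFilter(
    opts: Map[String, String],
    table: String,
    predicate: String): Map[String, String] =
  opts + ("dbtable" -> s"(SELECT * FROM $table WHERE $predicate) pushed")

// Placeholder options for illustration only (assumed values).
val baseOpts = Map("user" -> "example_user", "dbtable" -> "MY_TABLE")

val pushedOpts =
  withPushedFilter(baseOpts, "MY_TABLE", "SIMPLEDATECOL = '2018-01-01'")

// With a SparkSession named `spark` in scope (not executed in this sketch):
// val df = spark.read.format("jdbc").options(pushedOpts).load()
// df.explain()  // prints the physical plan for the JDBC scan
```

Calling explain() on the resulting DataFrame should make it possible to check what is actually sent to the database.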