Please help me write an optimal Spark query. I have read the following about predicate pushdown:
When you execute where or filter operators right after loading a dataset, Spark SQL will try to push the where/filter predicate down to the data source using a corresponding SQL query with WHERE clause (or whatever the proper language for the data source is).
Will predicate pushdown still work after an .as(Encoders.kryo(MyObject.class)) operation?
spark
    .read()
    .parquet(params.getMyObjectsPath())
    // As I understand it, a filter applied here would be pushed down,
    // but then I would have to construct MyObject from org.apache.spark.sql.Row manually
    .as(Encoders.kryo(MyObject.class))
    // QUESTION: would a filter applied here still be pushed down?
    .collectAsList();
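To make the two placements concrete, here is a minimal, self-contained sketch of how one could compare them and inspect the physical plan with explain(). Everything in it is hypothetical: the MyObject bean, the id field, and the /tmp/my_objects path are assumptions for illustration, not part of my actual job. The idea is that a Column-based filter on the untyped Dataset<Row> can show up as PushedFilters in the Parquet scan, while a Java lambda on a Kryo-encoded Dataset is opaque to the optimizer, since the rows are serialized binary blobs at that point.

```java
import java.io.Serializable;

import org.apache.spark.api.java.function.FilterFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import static org.apache.spark.sql.functions.col;

public class PushdownCheck {

    // Hypothetical payload class, assumed Kryo-serializable.
    public static class MyObject implements Serializable {
        private long id;
        public long getId() { return id; }
        public void setId(long id) { this.id = id; }
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .master("local[*]")
                .appName("pushdown-check")
                .getOrCreate();

        // Case 1: filter on the untyped Dataset<Row>.
        // Spark sees a column reference, so the predicate can be
        // pushed into the Parquet scan.
        Dataset<Row> pushed = spark.read()
                .parquet("/tmp/my_objects")          // hypothetical path
                .filter(col("id").gt(100));
        // In the printed physical plan, look for a line like
        // "PushedFilters: [GreaterThan(id,100)]" on the scan node.
        pushed.explain(true);

        // Case 2: filter after the Kryo encoder. The lambda runs on
        // deserialized MyObject instances, so Spark cannot translate it
        // into a source-level predicate.
        Dataset<MyObject> notPushed = spark.read()
                .parquet("/tmp/my_objects")
                .as(Encoders.kryo(MyObject.class))
                .filter((FilterFunction<MyObject>) o -> o.getId() > 100);
        // Expect no pushed filter for this lambda in the plan.
        notPushed.explain(true);

        spark.stop();
    }
}
```

Note the explicit FilterFunction cast in case 2: in Java, Dataset.filter is overloaded (Column, String, FilterFunction), so a bare lambda is ambiguous without it. Running both explain() calls side by side is the quickest way to see whether the predicate actually reached the data source.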