I am trying to solve the following problem on Databricks (on Azure): I want to analyze the physical plan of a query before its execution. The idea is that if the physical plan contains a certain path, I want to fail the query. I need to analyze the physical plan rather than the logical plan because I want to block commands that read from a certain path, and when I use spark.read.parquet(path) the path does not show up in the logical plan but does show up in the physical plan. Furthermore, I cannot use access restrictions, as I want to block this only for certain clusters in a Databricks workspace and not for all clusters.
I found the QueryExecutionListener, which can be extended with a custom class that overrides the functions onSuccess and onFailure. However, these functions are only executed after the query has succeeded or failed, and thus do not suit my case. Alternatively, I found that we can extend the Rule class (org.apache.spark.sql.catalyst.rules.Rule) and override the apply function. However, in this scenario I can only analyze the logical plan and not the physical plan.
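To make the goal concrete, here is a minimal PySpark sketch of the check I have in mind. It is only an illustration under several assumptions: the blocked prefix `dbfs:/mnt/restricted` and the helper names are hypothetical, `_jdf` is PySpark's internal handle to the JVM Dataset rather than a public API, and matching on the rendered plan string is naive because Spark may abbreviate long path lists in that string.

```python
# Hypothetical blocked location, used for illustration only.
BLOCKED_PREFIX = "dbfs:/mnt/restricted"


class BlockedPathError(Exception):
    """Raised when a query plan refers to a blocked path."""


def plan_mentions_path(plan_text: str, blocked_prefix: str) -> bool:
    # Naive substring match on the rendered plan text. It also matches
    # sibling paths that merely share the prefix, and it can miss paths
    # that Spark abbreviated when rendering the plan.
    return blocked_prefix.rstrip("/") in plan_text


def physical_plan_text(df) -> str:
    # Assumption: df is a PySpark DataFrame. _jdf is an internal attribute;
    # queryExecution().executedPlan() is the JVM-side physical plan, which,
    # as far as I understand, is built without running the query itself.
    return df._jdf.queryExecution().executedPlan().toString()


def assert_allowed(df, blocked_prefix: str = BLOCKED_PREFIX):
    # Fail before any action is called if the physical plan mentions the path.
    if plan_mentions_path(physical_plan_text(df), blocked_prefix):
        raise BlockedPathError(
            f"query reads from blocked path {blocked_prefix}"
        )
    return df
```

Usage would be something like `assert_allowed(spark.read.parquet(path)).show()`. The catch is that this only works when every query is routed through the helper, which I cannot guarantee on a shared cluster, so I am looking for a hook that Spark invokes automatically after physical planning and before execution.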