SparkSqlParser
Spark SQL uses SparkSqlParser as the parser for Spark SQL expressions.
You can access SparkSqlParser using SparkSession (and SessionState) as follows:
val spark: SparkSession = ...
val parser = spark.sessionState.sqlParser
scala> parser.parseExpression("select * from table")
res1: org.apache.spark.sql.catalyst.expressions.Expression = ('select * 'from) AS table#0
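The result above suggests that parseExpression reads its input as a single (optionally aliased) expression rather than as a full statement. The following is a minimal sketch of the difference between parseExpression and parsePlan, assuming a Spark version in which spark.sessionState is accessible from user code. The local session settings, the column name id and the table name t are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session created only for this demo
// (in spark-shell a session named `spark` already exists).
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("SparkSqlParserDemo")
  .getOrCreate()

val parser = spark.sessionState.sqlParser

// parseExpression takes one expression, optionally followed by an alias.
val expr = parser.parseExpression("id + 1 AS next_id")
println(expr)

// parsePlan takes a complete statement and returns an unresolved logical plan.
// The hypothetical table `t` does not have to exist: nothing is resolved at parse time.
val plan = parser.parsePlan("SELECT id FROM t WHERE id > 0")
println(plan.numberedTreeString)
```

Both calls only parse the text. Whether id or t actually exist is checked later, during analysis.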
TIP: Enable INFO logging level for org.apache.spark.sql.execution.SparkSqlParser logger to see what happens inside.
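One possible way to raise the logging level from a running session is sketched below. It assumes the log4j 1.x API is on the classpath, as in Spark 2.x distributions; newer Spark releases ship a different logging backend, and the messages may be emitted at a different level there.

```scala
import org.apache.log4j.{Level, Logger}

// Assumption: log4j 1.x API is available (as bundled with Spark 2.x).
// The logger name is the fully-qualified class name of SparkSqlParser.
val parserLogger = Logger.getLogger("org.apache.spark.sql.execution.SparkSqlParser")
parserLogger.setLevel(Level.INFO)
```

The change lasts only for the current JVM. A permanent setting would normally go into the logging configuration file of the Spark installation.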
SparkSession.sql Method
Parsing an expression with SparkSqlParser alone is not a bullet-proof shield against incorrect SQL, so the sql method of SparkSession is usually a better fit.
sql(sqlText: String): DataFrame
Executes a SQL query using Spark, returning the result as a DataFrame. The dialect that is used for SQL parsing can be configured with 'spark.sql.dialect'.
See both in action below.
scala> parser.parseExpression("hello world")
res5: org.apache.spark.sql.catalyst.expressions.Expression = 'hello AS world#2
scala> spark.sql("hello world")
org.apache.spark.sql.catalyst.parser.ParseException:
mismatched input 'hello' expecting {'(', 'SELECT', 'FROM', 'ADD', 'DESC', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'INSERT', 'DELETE', 'DESCRIBE', 'EXPLAIN', 'SHOW', 'USE', 'DROP', 'ALTER', 'MAP', 'SET', 'RESET', 'START', 'COMMIT', 'ROLLBACK', 'REDUCE', 'REFRESH', 'CLEAR', 'CACHE', 'UNCACHE', 'DFS', 'TRUNCATE', 'ANALYZE', 'LIST', 'REVOKE', 'GRANT', 'LOCK', 'UNLOCK', 'MSCK', 'EXPORT', 'IMPORT', 'LOAD'}(line 1, pos 0)
== SQL ==
hello world
^^^
at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:217)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:114)
at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:68)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:638)
... 49 elided
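Because spark.sql rejects malformed input with a ParseException, an application can catch that exception and report the problem instead of failing. The following is a minimal sketch of that idea. The helper name safeSql and the local session settings are hypothetical and not part of Spark.

```scala
import scala.util.{Failure, Success, Try}
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.catalyst.parser.ParseException

// Assumption: a local session created only for this demo.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("SafeSqlDemo")
  .getOrCreate()

// Hypothetical helper: returns Left with the parser message for invalid SQL,
// Right with the DataFrame otherwise. Other failures are rethrown unchanged.
def safeSql(spark: SparkSession, sqlText: String): Either[String, DataFrame] =
  Try(spark.sql(sqlText)) match {
    case Success(df)                => Right(df)
    case Failure(e: ParseException) => Left(s"Invalid SQL: ${e.getMessage}")
    case Failure(other)             => throw other
  }

// "hello world" is not a statement, so the parser rejects it.
println(safeSql(spark, "hello world").isLeft)
// A well-formed query is accepted.
println(safeSql(spark, "SELECT 1 AS one").isRight)
```

Only syntax errors are turned into Left here. Errors raised after parsing, such as a reference to a missing table, still surface as exceptions.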