
I'm fairly new to Spark, and I was wondering: is there a way to tell specifically whether a given query is a DML or a DDL statement via the SparkContext/SQLContext APIs?

I was looking at AstBuilder.scala and SparkSqlParser.scala, but it looks like they parse the query into an AST via visit callback methods and then dispatch to different functions in ddl.scala.

If there isn't a built-in way, how would you go about implementing a method that determines this using the Spark SQLContext APIs?
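
For reference, this is the rough shape of what I was imagining: parse the statement with Spark's own parser (via `spark.sessionState.sqlParser`, which is an internal/unstable API rather than a public SQLContext method) and branch on the logical plan node it returns. It's an untested sketch, and the node class names vary between Spark versions (e.g. CreateTableStatement vs. CreateTable), so the prefix matching below is only a guess:

```scala
import org.apache.spark.sql.SparkSession

object StatementKind {

  // Parse the statement with Spark's own SQL parser and inspect the
  // logical-plan node it produces. sessionState and sqlParser are
  // internal/unstable APIs, and the node class names differ across
  // Spark versions, so this prefix matching is only a starting point.
  def classify(spark: SparkSession, sql: String): String = {
    val plan = spark.sessionState.sqlParser.parsePlan(sql)
    val node = plan.getClass.getSimpleName

    if (Seq("Create", "Drop", "Alter").exists(p => node.startsWith(p))) "DDL"
    else if (Seq("Insert", "Update", "Delete", "Merge").exists(p => node.startsWith(p))) "DML"
    else "QUERY/OTHER" // plain SELECTs parse to Project/With/GlobalLimit/etc.
  }
}
```

So `classify(spark, "CREATE TABLE t (id INT) USING parquet")` should come back as "DDL", while `classify(spark, "SELECT * FROM t")` falls through to the query branch. Is something along these lines the right direction, or is there a supported API I'm missing?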

sbrk
  • Is there something about SparkSQL that makes it difficult to tell the difference? Does SparkSQL show you the SQL it's generating or running? Provided you can see the SQL statement or query you can tell if it's DDL or DML (or DCL or TCL) simply by looking at the keywords, see here: https://stackoverflow.com/questions/2578194/what-are-ddl-and-dml – Dai Aug 19 '20 at 01:48
  • Does this help? https://stackoverflow.com/questions/40505116/how-can-see-the-sql-statements-that-spark-sends-to-my-database – Dai Aug 19 '20 at 01:51
  • @Dai Eyeballing works, but I want to do it programmatically. – sbrk Aug 19 '20 at 01:57
  • All I can suggest is copying the SQL statement to a string (and making sure it's a single statement, not a batch) and using a proven SQL parser library to identify the statement keywords. Don't use a regular expression or try to parse it yourself, because it's very easy to contrive a SQL query or statement that looks like something else, and because SQL's grammar is horribly complicated. You'll need to use a parser library that exactly matches your RDBMS, because each SQL dialect is very different. – Dai Aug 19 '20 at 02:00 (a sketch of this approach is below)
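
For completeness, here is a minimal sketch of that parser-library route, using JSqlParser purely as one example of a standalone parser (the class names below are my assumptions about that library's API, and a generic parser will not cover Spark-specific syntax, so pick one that matches your dialect):

```scala
import net.sf.jsqlparser.parser.CCJSqlParserUtil
import net.sf.jsqlparser.statement.alter.Alter
import net.sf.jsqlparser.statement.create.table.CreateTable
import net.sf.jsqlparser.statement.delete.Delete
import net.sf.jsqlparser.statement.drop.Drop
import net.sf.jsqlparser.statement.insert.Insert
import net.sf.jsqlparser.statement.select.Select
import net.sf.jsqlparser.statement.update.Update

object ParserLibClassifier {

  // Hand a single statement (not a batch) to a general-purpose SQL
  // parser and branch on the statement type it returns.
  def classify(sql: String): String =
    CCJSqlParserUtil.parse(sql) match {
      case _: CreateTable | _: Alter | _: Drop => "DDL"
      case _: Insert | _: Update | _: Delete   => "DML"
      case _: Select                           => "QUERY"
      case _                                   => "OTHER"
    }
}
```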

0 Answers