149

Is there an open-source Java library for parsing SQL statements?

If possible, it should be customizable or flexible enough to also be able to parse (or at least ignore) vendor-specific syntax (such as Oracle tablespace definitions or MySQL's LIMIT clause).

If not, strict adherence to the SQL standard is also fine.

Update: I need this for two things:

  • providing an SQL interface to a non-SQL database (mapping to internal API calls)
  • rewriting SQL before it goes to the actual database (e.g. Oracle)
John Hascall
  • 9,176
  • 6
  • 48
  • 72
Thilo
  • 257,207
  • 101
  • 511
  • 656
  • 123
    how is this question not constructive? what's wrong with you, Stackoverflow? – anton1980 Feb 22 '13 at 21:22
  • 12
    This question should be reopened. Also, found http://foundationdb.github.io/sql-parser/ – kervin May 02 '14 at 23:56
  • 2
    This is off-topic - requests for libraries, tools, tutorials and other off-site resources are not considered on-topic for StackOverflow. – Toby Speight Mar 30 '16 at 10:08
  • I think this should be asked on another SE site – Peter Chaula Dec 28 '16 at 20:34
  • 4
    I think this question is constructive!! – Casualet Jan 05 '17 at 07:48
  • Just in case anyone needs a more feature rich parser that is open source, supports multiple dialects, AST / visitors, checkout Alibaba's Druid: https://github.com/alibaba/druid/tree/master/src/main/java/com/alibaba/druid/sql – btiernay Mar 19 '17 at 22:05
  • [jOOQ has a parser](https://www.jooq.org/doc/latest/manual/sql-building/sql-parser/). You can test it online, here: https://www.jooq.org/translate. It can be used to rewrite your SQL. – Lukas Eder Aug 24 '22 at 14:30

7 Answers7

54

ANTLR3 has an ANSI SQL grammar available. You can use that to create your own parser.

ANTLR4 has a SQL grammar.

Dave Jarvis
  • 30,436
  • 41
  • 178
  • 315
duffymo
  • 305,152
  • 44
  • 369
  • 561
  • 6
    Why use ANTLR when you can implement your own parser generator? – IAdapter May 17 '10 at 14:52
  • 138
    Why generate your own parser generator when you can use ANTLR? – duffymo May 17 '10 at 17:35
  • Any links to how one can use Antlr SQL Grammar to parse these queries? I looked at the grammar of some PL/SQL Parsers as well as Lexers and Parsers but was unable to fathom how to use one. Would appreciate any links. – Abhishek Jan 29 '12 at 13:45
  • You feed the grammar to ANTLR, which spits out the lexer/parser classes that you'll then compile and run. The best source I can think of is the ANTLR reference: http://www.amazon.com/Definitive-Antlr-Reference-Domain-Specific-Programmers/dp/0978739256/ref=sr_1_1?ie=UTF8&qid=1327847925&sr=8-1 – duffymo Jan 29 '12 at 14:39
  • We're up to ANTLR 4 now. Perhaps the old grammars don't run on the new version. – duffymo Jan 22 '13 at 17:50
  • @SHC see also http://stackoverflow.com/a/14465585/268066 – CrazyPyro Mar 20 '14 at 22:26
  • Careful though, the available grammars are not necessarily under the same license as ANTLR – kutschkem Apr 03 '14 at 11:43
  • ANTLR is great and nice to use but very slow, there's probably faster ones out there – Nathan Adams Jul 15 '21 at 22:17
  • Such as? A recommendation or faster code from you would be more helpful. – duffymo Jul 15 '21 at 23:01
  • @TheRealChx101 not seeing your point. How is this comment on a 14 year old question advancing anything? Is this the best you can come up with after 11 years on this site? – duffymo Jul 17 '23 at 13:01
  • @duffymo Why should I advance anything other than my own intellect and well being? Don't get easily triggered, and especially after 14 years. :) – TheRealChx101 Jul 17 '23 at 14:58
  • Fail to see how your comments advance anything. Your rep isn't much to show after 11 years of effort, so it's not a surprise. – duffymo Jul 17 '23 at 15:21
39
  • JSqlParser
  • Trino's parser is written using ANTLR4 and has its own immutable AST classes that are built from the parser. The AST has a visitor and pretty printer.
MiB
  • 575
  • 2
  • 10
  • 26
David Phillips
  • 10,723
  • 6
  • 41
  • 54
  • I have a question for Presto, If I have Statement statement = SQL_PARSER.createStatement(query); How can I get the Query body, i.e. the Select, From, Where, etc values? – quarks Apr 04 '15 at 17:56
  • 1
    Statement is a base class. A SELECT statement will be of type Query. It contains a QueryBody that has the subclass QuerySpecification. The structure is more complex than you might expect in order to support UNION, TABLE, VALUES, set operations, etc. You can create a visitor by extending AstVisitor or DefaultTraversalVisitor. Look at SqlFormatter for an example of how to walk the tree. – David Phillips Apr 10 '15 at 05:40
  • Would it be possible to get the query tree structure from a given query using Presto? – MockedMan.Object Jun 01 '15 at 10:30
  • I'm not sure what you're asking. Can you create a new question with more detail? – David Phillips Jun 01 '15 at 15:35
9

Parser

If you need a parser there should be a parser in the code base of Apache Derby.

Dealing with vendor-specific SQL

You may want to look at the .native() method on the jdbc Connection object which you can pass it vendor neutral queries that will get postprocessed into vendor specific queries.

Elijah
  • 13,368
  • 10
  • 57
  • 89
  • That native() method looks interesting. Are there any examples as to how it can be used? What kind of conversions are possible there? – Thilo Mar 19 '09 at 00:46
  • @Thilo e.g. [SQL Server 2014, nativeSQL Method (SQLServerConnection)](http://technet.microsoft.com/en-us/library/ms378174%28v=sql.110%29.aspx): _"This method is not currently supported by the Microsoft JDBC Driver for SQL Server."_ – Gerold Broser Sep 22 '14 at 20:55
  • @Thilo OJDBC's [OracleConnectionWrapper](http://docs.oracle.com/cd/E11882_01/appdev.112/e13995/oracle/jdbc/OracleConnectionWrapper.html#nativeSQL_java_lang_String_) seems like supporting it. – Gerold Broser Sep 22 '14 at 21:03
7

General SQL Parser for Java is not open source, but is exactly what you are looking after.

James
  • 81
  • 1
  • 1
6

Try

  • Zql (Original Sourceforce)
  • Zql (Github)
  • Zql (Maven Central)
MiB
  • 575
  • 2
  • 10
  • 26
jagamot
  • 5,348
  • 18
  • 59
  • 96
  • 5
    zql is good for basic queries but when you try to parse a query that contains join statement, it blows up. so i don't suggest it – zato Feb 16 '13 at 10:16
3

Hibernate uses ANTLR for sql and hql parsing.

JSqlParser is also a good option.although it has some bugs(or some features not implemented) while parsing oracle pl/sql. see its forum for detail.

so if you're parsing oracle pl/sql, ANTLR is recommended.

Han Zheng
  • 378
  • 1
  • 3
  • 13
2

What you want to do with the parsed SQL? I can recommend a few Java implementation of Lex/Yacc (BYACC/J, Java Cup) that you can use an existing SQL grammar with.

If you want to actually do something with the resulting parsed grammar, I might suggest looking at Derby, an open source SQL database written in Java.

Thomas Jones-Low
  • 7,001
  • 2
  • 32
  • 36