
I'm developing a C++ parser (for an IDE), so I'm now trying to understand the C++ grammar in detail. I've found an excellent grammar source at http://www.nongnu.org/hcb/, but I'm having trouble understanding some parts of it, especially which "real" language constructs correspond to the various productions.

So I'm looking for a C/C++ BNF grammar guide with code examples that match the various productions/rules. Does one exist?

intelfx
  • +1 for the link. I believe Eclipse is an open source IDE. How about having a look at its source code? – iammilind Aug 06 '12 at 08:37
  • @BartKiers I'm interested in constructions that are common for both C and C++, like declarators. – intelfx Aug 06 '12 at 08:42
  • @intelfx, ah, okay, I thought you were only interested in C++. – Bart Kiers Aug 06 '12 at 08:42
  • @iammilind Do you mean looking at comments to the parser's code in Eclipse? – intelfx Aug 06 '12 at 08:43
  • @intelfx, never mind. I understood your question the other way. – iammilind Aug 06 '12 at 08:53
  • Any particular reason for not using an *existing*, high-quality parser (libclang comes to mind, or, if you can spare a dime, EDG)? Creating a conforming C++ parser is a pain in the bum. No wonder EDG charges quality money for their quality product. – Konrad Rudolph Aug 06 '12 at 13:35
  • @KonradRudolph, the EDG [license cost](http://www.edg.com/index.php?location=faq_q2_cost) seems a bit high for someone working on a research project. Is there an open source alternative (libclang?)? – iammilind Aug 07 '12 at 02:59
  • @iammilind Yes, I’d go with libclang (but see also Ira’s answer). That said, EDG sometimes [waives license costs for University researchers](http://www.edg.com/index.php?location=faq_q7_get). – Konrad Rudolph Aug 07 '12 at 06:43
  • You can't examine the "grammar" directly for EDG or clang, because AFAIK they are implemented as hand-coded recursive descent parsers. You can obviously examine the code (to the extent you can get it) and, I assume, whatever comments it contains about how the recursive descent part came about; they must have at least some ad hoc means of tracing the parser elements back to grammar rules. – Ira Baxter Aug 10 '12 at 08:09
  • You want the *common* part of the C and C++ declarations? I think you're in for a rough ride. First, there's a variety of C declaration types, including even a weak kind of generic in C(notC++)11; then there are C++ generics, and then there's C++11. You're going to have to simply examine the standards documents and of course they are not going to provide you with any clues about "commonality". And I suspect you really want to parse the full declarations, not just the common part. Might be easier(??) to simply implement parsers for C++(11), and another for C(notC++)11. – Ira Baxter Aug 10 '12 at 08:55
  • No... I just want to see something like an "annotated grammar summary" similar to that from C++11's Annex A - but with _bits of code_ that correspond to the productions shown. – intelfx Aug 10 '12 at 09:15
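
As a small illustration of the kind of annotation being asked for here, the declarations below are paired with the declarator productions from the C++11 grammar summary (Annex A) that they exercise. This is only a sketch: the names `value`, `ptr`, `arr`, `twice`, `fn`, `ref` and `check_declarators` are hypothetical, and the comments name just the most relevant production for each line, not the full derivation.

```cpp
#include <cassert>

// simple-declaration: decl-specifier-seq ("int") + init-declarator-list
int value = 5;

// ptr-declarator: ptr-operator ("*") ptr-declarator ("ptr")
int *ptr = &value;

// noptr-declarator: noptr-declarator "[" constant-expression "]"
int arr[3] = {1, 2, 3};

// function-definition; its declarator is
// noptr-declarator ("twice") parameters-and-qualifiers ("(int n)")
int twice(int n) { return 2 * n; }

// noptr-declarator: "(" ptr-declarator ")" gives "(*fn)",
// which is then followed by parameters-and-qualifiers "(int)"
int (*fn)(int) = twice;

// ptr-declarator with ptr-operator "&" (lvalue reference)
int &ref = value;

// Hypothetical helper that just confirms the declarations mean
// what the comments above say they mean.
void check_declarators() {
    assert(*ptr == 5);
    assert(arr[2] == 3);
    assert(fn(4) == 8);
    assert(&ref == &value);
}
```

The parentheses in `(*fn)` matter: without them, `int *fn(int)` would derive through `ptr-operator ptr-declarator` first and declare a function returning `int *` instead of a pointer to a function.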

1 Answer


A hyperlinked (purported) grammar is not necessarily one on which you can build a parser easily. That is determined by the nature of your parsing engine, and which real dialect of C and C++ you care about (ANSI? GNU? C99? C++11? MS?).

Building a working C++ parser is really hard. See my answer to "Why C++ cannot be parsed with a LR(1) parser?" for some of the reasons. If you want a "good" parser, I suggest you use one of the existing ones. One worth looking at might be Elsa, since it is open source.
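
One well-known source of that difficulty, given here as an illustration and not as a summary of the linked answer, is that the same token sequence can be a declaration or an expression depending on what a name currently refers to. The sketch below uses hypothetical names (`T`, `x`, `declaration_case`, `expression_case`, `check_ambiguity`); the tokens `T * x` parse differently in each function.

```cpp
#include <cassert>

struct T {};  // at namespace scope, T names a type

int declaration_case() {
    // T is a type here, so "T * x;" is a simple-declaration:
    // it declares x as a pointer to T.
    T * x;
    x = nullptr;
    return x == nullptr ? 1 : 0;
}

int expression_case() {
    // These local variables hide struct T, so T is now an object name.
    int T = 6, x = 7;
    // The same tokens "T * x" are now a multiplicative-expression.
    return T * x;
}

// Hypothetical helper confirming both interpretations.
void check_ambiguity() {
    assert(declaration_case() == 1);
    assert(expression_case() == 42);
}
```

A parser that works from the grammar alone cannot choose between the two readings; it needs symbol-table information (or must carry both parses forward) to decide whether `T` is a type name at that point.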

Ira Baxter
  • Well, it's not from scratch: there is already a parser (the IDE is KDevelop 4), and I wanted to improve it slightly. The parser is (going to be) C++11. But thank you for the link; I'll look at it. – intelfx Aug 10 '12 at 02:36
  • ...Unfortunately, Elsa is not C++11. – intelfx Aug 10 '12 at 03:33
  • And, a C++11 parser (including type checking) is actually a *lot* more work than a C++98 parser (with corresponding type checking). – Ira Baxter Mar 09 '13 at 11:04