For my final-year project I'm learning about compiler techniques, and I'm currently experimenting with the GCC intermediate representation (raw GIMPLE) and extracting control flow graphs from different source files (C, C++ and Java) using GCC-5.4.
So far I can generate the raw *.004t.gimple and *.011t.cfg files using -fdump-tree-all-graph-raw. Now I would like to understand the GIMPLE language better, so I searched for its grammar and found the following:
- GIMPLE WIKI
- SIMPLE
- GENERIC and GIMPLE
- latest GIMPLE Doc (has no grammar!!!)
- GCC FE
- grammar for gcc-4.3.6
- grammar for gcc-4.2.1
- GIMPLE Doc for gcc-5.4.0 (has no grammar either!!!)
So the language seems to be constantly changing and to have multiple formats (High-level GIMPLE, Low-level GIMPLE, SSA GIMPLE, tree), and the grammar also seems to change between versions. However, I can't find the GIMPLE grammar for recent versions, specifically the one used in GCC-5.4, and I don't understand the different formats.
Questions about the grammar:
- Where can I find the GIMPLE grammar used in GCC-5.4 and more recent versions?
- How is it written? (In BNF, EBNF, ...)
- How does GCC implement this grammar to generate, parse and understand the GIMPLE files it produces, and later transform them to RTL?
- Is it possible for me to write a small subset of the GIMPLE grammar in Xtext, based on examples of the *.004t.gimple files that I generate?
Questions about the formats:
- What's the difference between the three GIMPLE formats? (I can't seem to find detailed documentation about each one in the wiki.)
- Which format is used in the raw files *.c.004t.gimple and *.c.011t.cfg? (High, Low, ...)
- Which one better represents the control flow of the original source code, without optimizations?
Thank you,