
I'm working on a compiler and I would like to improve its performance. I found that about 50% of the time is spent parsing the source files. Since the source files are quite small and I do quite a lot of transformations afterwards, it seems to me that there is room for improvement.

My parser is a Boost Spirit parser with a lexer (using lexer::pos_iterator), and I have a medium-sized grammar. I'm parsing the source into an AST.

My problem is that I have no idea what takes the most time during parsing: copies of AST nodes, the lexer, the parser rules, or memory.

I don't think it is an I/O problem, since I'm working on an SSD and I read the whole file at the beginning and then work only on the in-memory copy.

I tried using profilers, but the methods that take the most time are Boost methods with names hundreds of characters long, and I don't know exactly what they do...

So, is there a preferred way to benchmark a Boost Spirit parser and its grammar? Or are there some rules that can be used to check the efficiency at specific points?

Thanks

Sources for those interested:

Baptiste Wicht
  • Here is a write-up by ApochiQ, who uses Boost.Spirit as the parser for the Epoch language. He considerably improved the performance of his parser between the 10 and 11 releases and wrote up what he focused on [here](http://code.google.com/p/scribblings-by-apoch/wiki/OptimizingBoostSpirit). – Matthieu M. Jun 04 '13 at 13:20
  • Benchmarking ANYTHING typically involves running something through the "code under test", and then analysing the results. If you have a complex system, it often helps to produce a "null-driver" or "null-interface", so that you can for example feed in a source file and parse it, without acting on the parsed results. – Mats Petersson Jun 04 '13 at 13:28
  • @MatthieuM. Yes, I know this article. I already followed several pieces of advice from this great article a long time ago. But I don't know which advice to follow next. – Baptiste Wicht Jun 04 '13 at 15:08
  • @MatsPetersson I already have a test case where the source files are mostly only parsed. I can benchmark only this part, but that doesn't give me much information. – Baptiste Wicht Jun 04 '13 at 15:10
  • Do you mind sharing the code under test? I'm interested myself – sehe Jun 04 '13 at 15:16
  • Have you run a profiler on your code? – Mats Petersson Jun 04 '13 at 15:19
  • Yes @MatsPetersson, as I said in the question, I ran a profiler on the code. The problem is that the functions that come up as hotspots have names hundreds of chars long and I have no idea what they do exactly... This does not give me enough information... For not-sehe: I've added links to the source, but it is quite big. – Baptiste Wicht Jun 04 '13 at 20:54
  • @BaptisteWicht I gave things a look. I have explicitly not looked at attribute propagation anywhere because (a) it is vast and (b) it looks like you already optimized that quite a bit. Nice work, by the way. The code looks really neat and tidy. – sehe Jun 05 '13 at 03:34

1 Answer


I have given things a quick scan.

My profiler quickly told me that constructing the grammar and (especially) the lexer object took quite some resources.

Indeed, just changing a single line in SpiritParser.cpp saved 40% of execution time[1] (~28s down to ~17s):

    lexer::Lexer lexer;

into

    static const lexer::Lexer lexer;
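
For context, a minimal self-contained sketch of that reuse pattern (the grammar and all names below are illustrative, not taken from the eddic sources): a stateless Qi grammar built once as a function-local static const and reused by every subsequent parse.

    #include <boost/spirit/include/qi.hpp>
    #include <string>
    #include <vector>

    namespace qi = boost::spirit::qi;

    // Illustrative only: a stateless grammar that can safely be constructed once.
    template <typename It>
    struct IntListGrammar : qi::grammar<It, std::vector<int>(), qi::space_type>
    {
        IntListGrammar() : IntListGrammar::base_type(start)
        {
            start = *qi::int_;
        }

        qi::rule<It, std::vector<int>(), qi::space_type> start;
    };

    bool parse_ints(std::string const& input, std::vector<int>& out)
    {
        using It = std::string::const_iterator;
        // Built on the first call only; later parses reuse the same object
        // instead of reconstructing the grammar every time.
        static const IntListGrammar<It> g;
        It f = input.begin(), l = input.end();
        return qi::phrase_parse(f, l, g, qi::space, out) && f == l;
    }

The one-line change above applies this to the (expensive to construct) lexer object; the steps below do the same for the grammar.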

Now,

  • making the grammar static involves making it stateless (a stand-alone sketch of the pattern follows this list). I made this happen by

    • moving position_begin into qi::_a (using qi::locals) and
    • passing it in as an inherited attribute at the appropriate times

      • the EDDIGrammar and ValueGrammar grammars, e.g.

        start %= qi::eps [ qi::_a = qi::_r1 ] >> program;
        
      • as well as the individual rules from ValueGrammar that are being used externally.

    This had a number of suboptimal side effects:

    • rule debugging is commented out because the lexer::pos_iterator_type has no default output streaming overload
    • the qi::position(position_begin) expression has been 'faked' with a rather elaborate replacement:

      auto local_pos = qi::lazy (
              boost::phoenix::construct<qi::position>(qi::_a)
          );
      

      That doesn't seem ideal. (Ideally, one would like to replace qi::position by a modified custom parser directive that knows how to get begin_position from qi::locals (?) so there would be no need to invoke a parser expression lazily?)
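
To make the inherited-attribute / qi::locals mechanics concrete outside of the eddic code base, here is a small stand-alone sketch (plain std::string iterators instead of lexer::pos_iterator_type; all names are illustrative):

    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/include/phoenix.hpp>
    #include <iostream>
    #include <string>

    namespace qi  = boost::spirit::qi;
    namespace phx = boost::phoenix;

    using It = std::string::const_iterator;

    // Illustrative only: the start rule takes the "begin" iterator as an inherited
    // attribute (qi::_r1) and copies it into a local (qi::_a), so the grammar
    // object carries no per-parse state and can be `static const`.
    struct StatelessSketch : qi::grammar<It, int(It), qi::locals<It>, qi::space_type>
    {
        StatelessSketch() : StatelessSketch::base_type(start)
        {
            value = qi::int_;
            start %= qi::eps [ qi::_a = qi::_r1 ] >> value;
        }

        qi::rule<It, int(It), qi::locals<It>, qi::space_type> start;
        qi::rule<It, int(), qi::space_type>                   value;
    };

    int main()
    {
        std::string const input = "42";
        It first = input.begin(), last = input.end();

        static const StatelessSketch g;
        int out = 0;
        // The begin iterator is passed in at parse time, mirroring
        // grammar(boost::phoenix::cref(position_begin)) below.
        bool ok = qi::phrase_parse(first, last, g(phx::cref(first)), qi::space, out);
        std::cout << std::boolalpha << ok << " " << out << "\n";
    }

The per-parse state travels through the inherited attribute rather than living inside the grammar object, which is what allows the grammar to be shared.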

Anyways, implementing these further changes[2] shaved off another ~15% of execution time:

    static const lexer::Lexer lexer;
    static const parser::EddiGrammar grammar(lexer);

    try {
        bool r = spirit::lex::tokenize_and_parse(
                position_begin, position_end,
                lexer,
                grammar(boost::phoenix::cref(position_begin)),
                program);
Loose ideas:

  • Have you considered generating a static lexer ("Generating the Static Analyzer" in the Boost.Spirit.Lex documentation)?
  • Have you considered using expectation points to potentially reduce the amount of backtracking (note: I didn't measure anything in that area)? There is a short illustration after this list.
  • Have you considered alternatives for Position::file and Position::theLine? Copying the strings seems heavier than necessary. I'd prefer to store const char*. You could also look at Boost.Flyweight.
  • Is the pre-skip really required inside your qi::position directive?
  • (Somewhat non-serious: have you considered porting to Spirit X3? It seems to promise potential benefits in the form of move semantics.)
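
On the expectation-point suggestion, a minimal stand-alone illustration (not taken from the eddic grammar): operator> commits the parser once its left-hand side has matched, so a subsequent failure surfaces as qi::expectation_failure instead of triggering backtracking into sibling alternatives.

    #include <boost/spirit/include/qi.hpp>
    #include <iostream>
    #include <string>

    namespace qi = boost::spirit::qi;
    using It = std::string::const_iterator;

    int main()
    {
        std::string const input = "if (x]";   // deliberately malformed
        It first = input.begin(), last = input.end();

        // Once "if" has matched, operator> commits to this branch: a failure past
        // an expectation point throws qi::expectation_failure instead of silently
        // backtracking and re-trying other alternatives.
        qi::rule<It, qi::space_type> if_stmt;
        if_stmt = qi::lit("if") > '(' > +qi::alpha > ')';

        try {
            qi::phrase_parse(first, last, if_stmt, qi::space);
        } catch (qi::expectation_failure<It> const& e) {
            std::cout << "expectation failed before: '"
                      << std::string(e.first, e.last) << "'\n";
        }
    }

Whether turning >> into > actually helps here would need measuring, as noted above; it mainly pays off where the grammar is deterministic after a keyword or punctuator.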

Hope this helps.


[1] When parsing all test cases in test/cases/*.eddi 100 times like so (github):

    // Note: `argv` here stands for a container of file names (e.g. a
    // std::vector<std::string> built from the program arguments); a raw char**
    // cannot be used directly in a range-based for loop.
    for (int i = 0; i < 100; i++) {
        for (auto& fname : argv) {
            eddic::ast::SourceFile program;
            std::cout << fname << ": " << std::boolalpha
                      << parser.parse(fname, program, nullptr) << "\n";
        }
    }

Timed with a simple

    time ./test ../../test/cases/*.eddi | md5sum

With the md5sum acting as a sanity check.

[2] I created a pull request with the proof-of-concept refactorings here: https://github.com/wichtounet/eddic/pull/52

sehe
  • And you call that a quick scan oO Thanks a lot, it is a great answer. I'll try your patch tonight. For your loose ideas: I will try the static lexer (I didn't know about that before), and I didn't know either that expectation points can reduce backtracking, so I will try it ;) For Position, yes, I will avoid storing std::string, it is indeed heavy. I will check the pre-skip of qi::position. For Spirit X3, yes, I have considered testing it. Now that I have time again, I will probably try it after the other changes. Do you think it is mature enough for my parser? – Baptiste Wicht Jun 05 '13 at 05:57
  • @BaptisteWicht Honestly, I wouldn't say X3 is ready for your project. However, you could give it a preview. It sure could lead you to "just wait for V3" instead of trying to optimize the heck out of the V2 implementation :) [yeah, the quick was mainly referring to the complete blind-siding of attribute propagation :< ] – sehe Jun 05 '13 at 11:47
  • I'm gonna try it (though it probably won't be for a few weeks). Anyway, I merged and tested your PR and it works great. I was not able to get as much as you did on my computer, but I still got a 33% improvement. I'm working on your other points as well. – Baptiste Wicht Jun 05 '13 at 20:50
  • Oh aha. I didn't think the second part was 'merge-worthy' :| I'd just have taken the `static const` lexer line for a quick win. I do suspect the whole 'position recording' thing could be done in a less complicated way (using a phoenix-adapted functor that's passed in at parse time?). Anyways, I'm glad it worked :/ I had only verified that the parser still returned 'true'. – sehe Jun 05 '13 at 20:57