
I'm trying to modify C++ code. I get a piece of code and a list of line numbers, and I need to insert code at those line numbers. For example:

1 void foo(){
2 int a = 5;
3 int b = 10;
4 }

And the line numbers: 2, 3. Now I want to automatically insert code after those lines:

1 void foo(){
2 int a = 5;
3 newcode();
4 int b = 10;
5 newcode();
6 }

In another thread, people said ANTLR is a good way to do this, so I tried the ANTLR runtime API. It's easy to generate a parse tree, and I also found ways to modify it. But now I don't know how to get the source code back from the parse tree. I don't really need the source code; it would also be enough to compile the parse tree to an executable program. How can I do this?

Is there maybe an easier way to solve my problem? Maybe just read the code, count the \n characters, and insert my code after the 2nd and 3rd \n?

Edit: For my bachelor thesis, I get a piece of parallel code and need to force it to execute a given interleaving. My job is therefore to write a tool that automatically inserts instructions like "EnterCriticalSection(...)" and "LeaveCriticalSection(...)" at given lines in the code. Now I have a second task: rename the main function and insert my own main function. I don't think this will work with counting lines.

Michael
  • This sounds like an [XY problem](https://meta.stackexchange.com/q/66377). Why do you need to automatically modify C++ code this way? – eesiraed Mar 26 '20 at 18:39
  • For source-to-source transformations you might want to look at [clang's libTooling](https://clang.llvm.org/docs/LibTooling.html). – G.M. Mar 26 '20 at 18:39
  • IMO it would be nice to answer the question that was asked, which is a totally reasonable question, instead of suggesting the questioner should be doing something else. I'd like to know how to do it too. I consider it very likely that it _can't_ be done, as ANTLR is a parser (generator), not a transducer, so if it can't be done that would be good to know too. – davidbak Mar 26 '20 at 18:41
  • BTW [this answer](https://stackoverflow.com/a/20381870/751579) by ANTLR's author will point you in the right direction (though it isn't a cookbook). – davidbak Mar 26 '20 at 18:45
  • If you want to modify/insert at specific line numbers, why do you need to parse the code into a tree at all? Just use good old sed/awk to insert a piece of text at the given line numbers. – Dr Phil Mar 26 '20 at 19:24

1 Answer


A possible solution is to use the generated parse tree for token positions: each TerminalNode has a Token instance attached that records where it is located in the original source code. With that at hand, you can copy the unmodified text from the original source stream up to the first insertion point and then insert your own text there. After that, copy the next unmodified part and insert your next modification. Repeat this in a loop until you reach EOF.

This approach doesn't take care of the final formatting, but I think that's probably not relevant: your task sounds like you are instrumenting code for some measurements.
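The copy-and-insert loop described above could be sketched roughly as follows. This is only a minimal illustration and does not depend on ANTLR: the `Insertion` struct and the `applyInsertions` function are hypothetical names made up for this sketch, and it assumes you have already turned token positions into character offsets in the original source (for example from `Token::getStopIndex() + 1` to insert right after a token).

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper type for this sketch: `text` is inserted before the
// character at `offset` in the ORIGINAL source. With ANTLR, the offset could
// be derived from a token, e.g. token->getStopIndex() + 1.
struct Insertion {
  std::size_t offset;
  std::string text;
};

// Copies the unmodified parts of `source` and splices in each insertion.
// All offsets refer to the original text, so they never need adjusting.
std::string applyInsertions(const std::string &source, std::vector<Insertion> insertions) {
  // Process insertions in ascending offset order. stable_sort keeps the
  // given order for insertions that share the same offset.
  std::stable_sort(insertions.begin(), insertions.end(),
                   [](const Insertion &a, const Insertion &b) { return a.offset < b.offset; });

  std::string result;
  std::size_t copied = 0; // Number of original characters copied so far.
  for (const Insertion &ins : insertions) {
    // Clamp offsets beyond the end so they behave like "append at EOF".
    std::size_t offset = std::min(ins.offset, source.size());
    result.append(source, copied, offset - copied); // Unmodified part.
    result += ins.text;                             // Your own text.
    copied = offset;
  }
  result.append(source, copied, std::string::npos); // Remainder up to EOF.
  return result;
}
```

For the example in the question, the offsets just after the line breaks of lines 2 and 3 would be passed in, each with the text `newcode();` followed by a line break. Because every offset refers to the original text, the insertions can be collected in any order while walking the parse tree.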

Here's code I use to retrieve the original source code given two parse tree nodes:

std::string MySQLRecognizerCommon::sourceTextForRange(tree::ParseTree *start, tree::ParseTree *stop, bool keepQuotes) {
  Token *startToken = antlrcpp::is<tree::TerminalNode *>(start) ? dynamic_cast<tree::TerminalNode *>(start)->getSymbol()
                                                                : dynamic_cast<ParserRuleContext *>(start)->start;
  Token *stopToken = antlrcpp::is<tree::TerminalNode *>(stop) ? dynamic_cast<tree::TerminalNode *>(stop)->getSymbol()
                                                              : dynamic_cast<ParserRuleContext *>(stop)->stop;
  return sourceTextForRange(startToken, stopToken, keepQuotes);
}

//----------------------------------------------------------------------------------------------------------------------

std::string MySQLRecognizerCommon::sourceTextForRange(Token *start, Token *stop, bool keepQuotes) {
  CharStream *cs = start->getTokenSource()->getInputStream();
  size_t stopIndex = stop != nullptr ? stop->getStopIndex() : std::numeric_limits<size_t>::max();
  std::string result = cs->getText(misc::Interval(start->getStartIndex(), stopIndex));
  if (keepQuotes || result.size() < 2)
    return result;

  char quoteChar = result[0];
  if ((quoteChar == '"' || quoteChar == '`' || quoteChar == '\'') && quoteChar == result.back()) {
    if (quoteChar == '"' || quoteChar == '\'') {
      // Replace any double occurrence of the quote char by a single one.
      replaceStringInplace(result, std::string(2, quoteChar), std::string(1, quoteChar));
    }

    return result.substr(1, result.size() - 2);
  }

  return result;
}

This code has been taken from the MySQL Workbench parser module.

Mike Lischke