
In ClaiR it is not (yet) possible to write changes made to the AST back to a file. For this reason, I create a list `lrel[int, int, str] changes = [];` in which each entry holds the start position and end position of the substring to remove, and the string that should replace it.

When I have the full list of changes I want to make to a source file, I sort the changes, read the file with `fb = chars(readFile(f));`,

apply the changes with

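import List;   // delete, insertAt, size
import String; // chars

// Applies the changes to charList. The changes must be sorted by start position
// and must not overlap; endIndex is exclusive.
public list[int] changeCharList(list[int] charList, lrel[int, int, str] changesList) {
    // Net number of characters added (negative: removed) by the changes applied so far.
    int offset = 0;
    for (t <- [0 .. size(changesList)]) {
        tuple[int startIndex, int endIndex, str changeWithString] change = changesList[t];
        int startIndexWithOffset = change.startIndex + offset;
        int endIndexWithOffset = change.endIndex + offset;
        list[int] changeWithChars = chars(change.changeWithString);
        // Remove the old substring: after each delete the next character moves to the same index.
        for (i <- [startIndexWithOffset .. endIndexWithOffset]) {
            charList = delete(charList, startIndexWithOffset);
        }
        // Insert the replacement, one character at a time.
        for (i <- [0 .. size(changeWithChars)]) {
            charList = insertAt(charList, startIndexWithOffset + i, changeWithChars[i]);
        }
        // Everything after this change shifts by the difference in length.
        offset += size(changeWithChars) - (change.endIndex - change.startIndex);
    }
    return charList;
}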

and write the result back to the file with `writeFileBytes(f, fb);`.
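For comparison, here is a minimal sketch of the same replacement step done back to front, which avoids the running offset altogether: a change near the end of the file does not move anything in front of it. The function name `applyChangesBackToFront` is hypothetical and not part of ClaiR; the sketch assumes the ranges do not overlap and that the end index is exclusive, as in the loop above.

```rascal
import List;   // sort, reverse, take, drop
import String; // chars, stringChars

// Hypothetical helper, not part of ClaiR.
// Assumes non-overlapping ranges with an exclusive end index.
public list[int] applyChangesBackToFront(list[int] charList, lrel[int, int, str] changesList) {
    // Walk the changes from the highest start position down to the lowest,
    // so the positions of the changes still to be applied stay valid.
    for (<int startIndex, int endIndex, str replacement> <- reverse(sort(changesList))) {
        charList = take(startIndex, charList)   // everything before the change
                 + chars(replacement)           // the new text
                 + drop(endIndex, charList);    // everything after the change
    }
    return charList;
}
```

With these assumptions, `stringChars(applyChangesBackToFront(chars("int x = 1;"), [<4, 5, "count">]))` should give `"int count = 1;"`. This does not address the macro offset problem described below; it only simplifies the bookkeeping.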

This approach works for source files without expanded macros, but it does not work for source files with expanded macros. In the latter case, the offsets used in the AST do not match the offsets in the file as read with `readFile`.

As a workaround, I can comment out the macros before running Rascal and uncomment them afterwards. I do not like this.

Is there a way to recalculate the offsets so that the AST offsets match the offsets in the file as read?

Uwe Keim
Matty
  • This is a beautiful question, with a long and complex answer. The problem has been studied in general in the literature, but there is no off-the-shelf solution for Clair at this moment. I recommend talking to the author of Clair, Rodin Aarssen, directly to see what your options are. – Jurgen Vinju Jan 31 '20 at 10:34
  • I contacted Rodin – Matty Jan 31 '20 at 15:08
  • Is it possible to easily detect such a situation e.g. by comparing the largest position in the AST to the last character in a file read? – Matty Jan 31 '20 at 15:09
  • That's an idea; I don't know. I'm also very interested to see what you two come up with. – Jurgen Vinju Jan 31 '20 at 16:07
  • Okay. I want to change thousands of files, of which a small percentage will have this problem. And I want the script to be executed by a colleague. I do not want to frustrate him too much :) So I am thinking of the following approach. – Matty Feb 01 '20 at 08:17
  • Assumption: include files are not found by ClaiR. I do not need them, and this guarantees that there will be no macro expansion because of an include file. Step 1: check if there is macro expansion (by comparing the size of the largest loc in the AST with the size of a "normal" file read). If there is no macro expansion, (2) perform the refactoring, else (3) pre-process the file. Step 3: do a "normal" file read/write to comment out the macros. Continue with (2) perform the refactoring. Then (4) undo the changes made by step (3) as a post-processing step. Do you have feedback or suggestions for improving this approach? – Matty Feb 01 '20 at 08:23
  • It's an idea; I'd probably start by testing the detection method separately. Doesn't an unresolved include also lead to a file size change? – Jurgen Vinju Feb 01 '20 at 08:48
  • Is there a way to iterate through all the nodes of an AST? – Matty Feb 01 '20 at 09:30
  • Sure; `/node subtree <- wholeTree` – Jurgen Vinju Feb 01 '20 at 09:53
  • Watch out for additional parse errors too, right? Due to non-expansion. – Jurgen Vinju Feb 01 '20 at 09:54
  • I tried `/node n <- ast; println(n);` but this does not print anything. I tried `visit (ast) { case n:\node(): { println(n); } }` and this does not print anything either. If I do /node in a case statement, I get parse errors. How can I visit all nodes? – Matty Feb 03 '20 at 09:40
  • Hi! I'll send you some slides too, but for all non-unitary patterns (the ones that can match in multiple ways) you can iterate over all matches using a for loop, find the first match using an if, and iterate over all matches using a comprehension. The / pattern matches its nested pattern everywhere in a value, from left to right, bottom to top. So `for (/node s := v) println(s);` prints all sub-values that are nodes, including the root. And `[s | /int s <- v]` collects all integers nested in v in a list. – Jurgen Vinju Feb 03 '20 at 09:53
  • I did not (yet) receive slides. The following works: `for (/node s := ast) println(s);`. However, I want the URI of a node. So I tried `for (/node s := ast) println(s.src);`. But this crashes with `NoSuchField("src")`. I guess there are nodes in the AST without a src. How can I check if a node has a src field before printing? – Matty Feb 05 '20 at 07:13
  • I forgot to send the slides. You could use the ? expression operator, like so: `s.src ? |unknown://|`. It means: take the left expression if it is defined, otherwise the right expression. – Jurgen Vinju Feb 05 '20 at 07:18
  • `for (/node s := ast) { if (s.src?) { println(s.src); } }` works. Thanks! – Matty Feb 05 '20 at 07:42
  • Nice. While you're at it, learning more Rascal idiom: `for (/node s := ast, s.src?) println(s.src);` or `for (/node s := ast) println(s.src?|unknown://|)` or `[ s.src | /node s := ast, s.src?]` – Jurgen Vinju Feb 05 '20 at 09:01
  • I tried `for (/node s := ast) println(s.src?|unknown://|);` first, but it results in `MalFormedURI("unknown://")`. The other options do work. – Matty Feb 05 '20 at 09:26
  • Ah right. It needs one more slash. – Jurgen Vinju Feb 05 '20 at 09:27
  • Yes, now it works :-) – Matty Feb 05 '20 at 09:29
  • A new field is added to AST nodes which can be used to check if an expression is part of a macro expansion. With this we have a detection method. Thanks for the support! – Matty Feb 09 '20 at 17:05
  • Nice solution people. Have fun! – Jurgen Vinju Feb 09 '20 at 17:41
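
The detection idea raised in the comments above (comparing the largest position in the AST with the size of the file as read) could be sketched roughly as follows, using the `/node` and `s.src?` idioms from the thread. The function names are hypothetical. The sketch assumes that `src` values are locations carrying offset and length, that all of them point into the file `f`, and that offsets are counted in the same unit as `size` of the string returned by `readFile`. Whether this heuristic reliably detects macro expansion is untested here; the thread ended up using a new AST field instead.

```rascal
import IO;     // readFile
import String; // size

// Hypothetical helper: collect every src location found on a node of the AST.
// Nodes without a src field are skipped by the `s.src?` guard.
public list[loc] sourceLocations(value ast)
    = [l | /node s := ast, s.src?, loc l := s.src];

// Hypothetical check: does any location end past the text that readFile returns?
// Assumes every location has offset and length information.
public bool hasLocationBeyondFile(value ast, loc f) {
    int fileSize = size(readFile(f));
    return any(loc l <- sourceLocations(ast), l.offset + l.length > fileSize);
}
```

As noted in the comments, an unresolved include or an extra parse error may also affect the positions, so a check like this is best tested separately before it is relied on.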
