CFG / PEG used for Code completion?

Question

I'm wondering if it's possible to use a CFG or PEG grammar as a basis for code completion directly without modification. I have heard that code completion is in IDE's is sometimes manipulated and massaged or even hard coded so that it performs better.

I want to code complete on a small DSL so I fully understand that a grammar cannot help a code completion system with knowledge of library functions etc.

As far as I'm aware the parser itself needs to at least provide a system for querying what it expects next.

In particular I'm interested in a javascript code completion solution using peg.js or jison

It is possible indeed to use PEG (with Packrat) for code completion - I did that with both Emacs and Visual Studio. The trick is to store a list of failed "tokens" tried by a parser a the rightmost failure position - they can be used then for a completion. If an identifier is expected, parser can also give a hint. — SK-logic, Aug 16 '11 at 14:21

score 14 · Accepted Answer · edited May 19 '16 at 03:52

It is fairly easy to build Javascript editor with code completion from PEG grammar. I would describe how to do it with PEG.js. You need to extend your grammar with some lax parsing rules that allow to provide suggestions when previous statements are broken. These lax rules need to be handled conditionally or you will need two separate grammars - one for parsing source and second for code completion. You can maintain one grammar by using Javascript predicates (available in PEG.js). It looks like &{return laxParsing} and it causes that whole rule to be processed when laxParsing flag is true. You can switch between lax and strict parsing easily by setting parser's internal flag.

To provide suggestions to user easily you must modify slightly generated PEG.js parser (version 0.5) to receive in the parsing error structure position (beside the column and line) and list of expectations (beside the error message). You can copy prepared fragment from https://gist.github.com/1281239.

When you have parser then you can attach it in editor on for example CTRL+SPACE keypress. When these are pressed in text source you need to put a special unparseable sign in place of cursor (to cause a parsing error) and launch parser in lax mode. Then you receive an error with list of suggestions.

Some of suggestions are not only syntax but also they define references (e.g. entities, variables). You can trigger searching these when a particular expectation is found (e.g. VariableName). You can provide the list by parsing the same source in a different lax parsing mode (filtering only variable names).

For a working example and source to this approach you can check on https://github.com/mstefaniuk/Concrete-Freetext.

score 5 · Answer 2 · answered Nov 13 '14 at 21:51

PEG.js gives you quite a bit of context when it generates a SyntaxError. For example, if you have a grammar for SQL and feed it something like:

FOO > 10 A

Then PEG.js will return this:

{
  "message": "Expected \"AND\", \"ORDER BY\" or end of input but \"A\" found.",
  "expected": [
    {
      "type": "literal",
      "value": "AND",
      "description": "\"AND\""
    },
    {
      "type": "literal",
      "value": "ORDER BY",
      "description": "\"ORDER BY\""
    },
    {
      "type": "end",
      "description": "end of input"
    }
  ],
  "found": "A",
  "offset": 9,
  "line": 1,
  "column": 10,
  "name": "SyntaxError"
}

What it's saying is that it parsed characters 0–9 of the string ("FOO > 10 ") but then encountered an unexpected token at character 10. And it gives you a list of the next tokens it was expecting: FOO > 10 AND, FOO > 10 ORDER BY, FOO > 10. If you tack these onto the valid portion of the query, you'll get a good set of possible completions:

function getCompletions(pegParse, text) {
  var parsedText = pegParse(text);
  var completions = [];
  if (parsedText.expected) {
    var start = text.substr(0, parsedText.offset);
    parsedText.expected.forEach(function(expected) {
      if (expected.type != 'literal') return;
      var completion = start + expected.value;
      if (completion.substr(0, text.length) == text) {
        completions.push(completion);
      }
    });
  }
  return completions;
}

This is quite simplistic -- a real autocomplete would match more than just literals and would need some way to take advantage of context not available to the grammar, e.g. the list of arguments to a function that the user is calling.

hey how did you get the structured error message? I don't see any public API other than `.parse` — Ray Wu, Dec 04 '17 at 18:18
nvm I found I could get that by `try { p.parse(...); } catch (e) { console.log(e); }` — Ray Wu, Dec 04 '17 at 23:47

CFG / PEG used for Code completion?

2 Answers2