Syntax highlighting for first word in line (grammar package for Atom)

Question

I'm working on improving language-stata, the existing Stata grammar package for Atom. Stata code follows a pattern: the first word in a line is a command, and a comma separates the options from the objects of the command. For example, to run a linear regression of y on x without a constant, you run:

regress y x, noconstant

A triple slash means that the command continues in the following line. Thus the previous code is equivalent to:

regress y /// COMMENTS
x, /// MORE COMMENTS
noconstant

I think the grammar should highlight the first word of every line, unless the previous line contains a triple slash. In the two examples above, it should highlight the command regress, but it should not highlight the words x or noconstant in the second example. I imagine something like:

Start capturing at the beginning of a line;
Highlight the first word;
Continue capturing as long as lines contain a triple slash;
Stop when I find the end of a line without a triple slash.
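
The steps above can be sketched outside the grammar to pin down the intended behaviour. This is only a rough Python illustration, not grammar code: the names `command_words` and `FIRST_WORD` are made up for the example, and it assumes that any `///` in a line means the command continues (it ignores cases such as `///` inside a string).

```python
import re

# First word of a line, allowing leading whitespace (same idea as '^\\s*(\\w+)').
FIRST_WORD = re.compile(r"^\s*(\w+)")


def command_words(source):
    """Return (line_number, word) for every line that starts a new command."""
    commands = []
    continued = False  # True when the previous line contained a triple slash
    for number, line in enumerate(source.splitlines(), start=1):
        if not continued:
            match = FIRST_WORD.match(line)
            if match:
                commands.append((number, match.group(1)))
        # Assumption: a triple slash anywhere in the line continues the command.
        continued = "///" in line
    return commands


sample = """regress y /// COMMENTS
x, /// MORE COMMENTS
noconstant
regress y x, noconstant"""

print(command_words(sample))  # [(1, 'regress'), (4, 'regress')]
```

Only the word that opens each command is reported; the first words of the continuation lines are skipped.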

I've tried a few things. For instance:

{
    name: 'comment.line.stata'
    match: '///.*'
}
{
    begin: '^\\s*(\\w+)'
    end: '(?<!///)$'
    beginCaptures:
        "1":
            name: 'support.function.stata'
}

This code highlights the first word of every line, whether or not a triple slash preceded it. On the other hand,

{
    name: 'comment.line.stata'
    match: '///.*'
}
{
    begin: '^\\s*(\\w+)'
    while: '///'
    beginCaptures:
        "1":
            name: 'support.function.stata'
}

highlights the first word of the document and nothing else.

Does anyone have an idea of how to solve this? Thanks!

But it does not recognize any pattern if I use `regress y x, noconstant`. — Luca B., Apr 09 '16 at 03:21
I tried to clarify and simplify the problem a bit. I removed the part about the comma. Does it seem clearer? — Luca B., Apr 09 '16 at 13:29
What do you want to highlight in the first example: `regress y x, noconstant`? Please try [this pattern](https://regex101.com/r/vL4oW6/5), but it is not so smart:( — Quinn, Apr 09 '16 at 18:43
Your code inspired me to find a solution! I posted it below. I just had to generalize your code a bit. — Luca B., Apr 09 '16 at 20:22
I'm not familiar with Stata. Nice to see you got an answer finally. :) — Quinn, Apr 09 '16 at 20:31

score 0 · Answer 1 · edited May 23 '17 at 12:15

Building on comments by ccf and this discussion, I came up with something that seems to work:

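# Trailing comment after a triple slash.
{
    name: 'comment.line.stata'
    match: '///.*'
}
# Multi-line command: opens on a line that starts with a command and contains ///.
{
    begin: '^\\s*([a-zA-Z_]+)(.*)(///.*)$'
    beginCaptures:
        "1":
            name: 'support.function.stata'
    # Closes on the first following line that contains no triple slash.
    end: '^([^/]+|/($|[^/]|/($|[^/])))*$'
}
# Single-line command: highlight the first word of the line.
{
    captures:
        "1":
            name: 'support.function.stata'
    match: '^\\s*([a-zA-Z_]+)'
}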
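
A rough way to sanity-check these rules outside Atom is to replay the begin, end, and single-line patterns line by line in Python. This is only a sketch under some assumptions: `highlighted_words`, `BEGIN`, `END`, and `SINGLE` are names made up for the example, the loop is a crude stand-in for how a TextMate-style tokenizer applies begin/end rules, and Python's `re` is not the regex engine Atom uses. The sample lines are kept short because the nested quantifier in the end pattern can backtrack heavily in Python on long lines that contain `///`.

```python
import re

# Patterns copied from the grammar, with the CSON escaping ('\\s') undone.
BEGIN = re.compile(r"^\s*([a-zA-Z_]+)(.*)(///.*)$")
END = re.compile(r"^([^/]+|/($|[^/]|/($|[^/])))*$")
SINGLE = re.compile(r"^\s*([a-zA-Z_]+)")


def highlighted_words(source):
    """Return the words that would be scoped as support.function.stata."""
    words = []
    inside = False  # True while the multi-line (begin/end) rule is open
    for line in source.splitlines():
        if inside:
            # The open rule closes on the first line without a triple slash.
            if END.match(line):
                inside = False
            continue
        match = BEGIN.match(line)
        if match:
            words.append(match.group(1))
            inside = True
            continue
        match = SINGLE.match(line)
        if match:
            words.append(match.group(1))
    return words


sample = """regress y /// COMMENTS
x, /// MORE COMMENTS
noconstant
regress y x, noconstant"""

print(highlighted_words(sample))  # ['regress', 'regress']
```

In this replay, the first words of the two continuation lines are not reported, and the single-line command is still picked up by the last rule.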
