The final working solution that takes into account line and column ranges:

(csharp
 "^ *\\(?:[0-9]+>\\)*\\(\\(?:[a-zA-Z]:\\)?[^:(\t\n]+\\)(\\([0-9]+\\),\\([0-9]+\\),\\([0-9]+\\),\\([0-9]+\\)) *: \\(error\\|warning\\) *CS[0-9]+:"
 1 (2 . 4) (3 . 5))

Both answers below were incredibly helpful; I understand the system a lot better now.


Summary: my regexps work to match the output strings, but don't work in the compilation-error-regexp-alist-alist to match errors in my compilation output.

I'm finding the compilation mode regexps a bit confusing. I've written a regexp that I know works on my error string, using re-builder and the original regexps in compile.el. Here is the error string:

40>f:\Projects\dev\source\Helper.cs(37,22,37,45): error CS1061: 'foo.bar' does not contain a definition for 'function' and no extension method 'method' accepting a first argument of type 'foo.bar' could be found (are you missing a using directive or an assembly reference?)

And here's my regexp:

(pushnew '(csharp
 "^ *\\(?:[0-9]+>\\)*\\(\\(?:[a-zA-Z]:\\)?[^:(\t\n]+\\)(\\([0-9]+\\),\\([0-9]+\\),[0-9]+,[0-9]+) *\: \\(?:error *CS[0-9]+:\\)"
 2 3)
     compilation-error-regexp-alist-alist)

Obviously, I'm just trying to get to the first line/column pair that's output. (I'm surprised that the compiler is outputting 4 numbers instead of two, but whatever.)

If we look at the edg-1 regexp in compile.el:

    (edg-1
 "^\\([^ \n]+\\)(\\([0-9]+\\)): \\(?:error\\|warnin\\(g\\)\\|remar\\(k\\)\\)"
 1 2 nil (3 . 4))

So I guess where I'm confused is how the arguments are passed. In edg-1, where are 3 and 4 coming from? I guess they don't correspond to the capture groups? If I run the edg-1 regexp through re-builder on a well-formed error message and enter subexpression mode, 0 matches the whole matching string, 1 matches the file name and path, and 2 matches the line number. From looking at the documentation (when I do M-x describe-variable), it appears as though it just cares about which position the subexpressions occupy in the main expression. Either way, I'm clearly misunderstanding something.

I've also tried modifying the official csharp.el regexp to handle the extra two numbers, but with no luck.

(Edit, fixed the example slightly, updated the csharp regexp)

RealityMonster
  • Some weird quoting rules there. How is the interpolation done? –  Jan 07 '15 at 23:45
  • Glad you got it working. The trailing `\\)` in your final regex is probably not needed; it will give an unbalanced capture group error. –  Jan 08 '15 at 19:30
  • @sln You're almost certainly right and I have no idea why the system didn't complain about it. – RealityMonster Jan 08 '15 at 19:54

2 Answers

2

Found some info on this.

This page has a simplified explanation:
http://praveen.kumar.in/2011/03/09/making-gnu-emacs-detect-custom-error-messages-a-maven-example/

Quote from page -

"Each elt has the form (REGEXP FILE [LINE COLUMN TYPE HYPERLINK
HIGHLIGHT...]).  If REGEXP matches, the FILE'th subexpression
gives the file name, and the LINE'th subexpression gives the line
number.  The COLUMN'th subexpression gives the column number on
that line"

So it looks like the format is something like this:

(REGEXP FILE [LINE COLUMN TYPE HYPERLINK HIGHLIGHT...])

Looking at the regex again, it looks like a modified BRE.

 ^                   # Beginning of line
 \( [^ \n]+ \)       # Group 1

 (                   # Literal '('
 \( [0-9]+ \)        # Group 2
 )                   # Literal ')'

 : [ ] 

 \(?:
      error
   \|
      warnin\(g\)    # Group 3
   \|
      remar\(k\)     # Group 4
 \)

Here is the edg-1

(edg-1
 "^\\([^ \n]+\\)(\\([0-9]+\\)): \\(?:error\\|warnin\\(g\\)\\|remar\\(k\\)\\)"
 1 2 nil (3 . 4))

Where

"^\\([^ \n]+\\)(\\([0-9]+\\)): \\(?:error\\|warnin\\(g\\)\\|remar\\(k\\)\\)"
REGEXP ^^^^^^^^

 1     2    nil    (3 . 4)
 ^     ^     ^      ^^^^^
FILE LINE  COLUMN   TYPE

"TYPE is 2 or nil for a real error or 1 for warning or 0 for info.
TYPE can also be of the form (WARNING . INFO).  In that case this
will be equivalent to 1 if the WARNING'th subexpression matched
or else equivalent to 0 if the INFO'th subexpression matched."

So, TYPE is of this form (WARNING . INFO)

In the regex,
if capture group 3 matched (i.e. warnin\(g\)), the message is treated as a warning.
If capture group 4 matched (i.e. remar\(k\)), it is treated as info.
If neither matched (the plain error alternative), it is treated as a real error.
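A quick way to see which of these groups actually participates is to run the edg-1 regexp by hand. This is only a minimal sketch: the helper name `my-edg-1-groups` and the two sample messages are made up for illustration and are not part of compile.el or any real compiler output.

```elisp
;; Sketch: evaluate in M-x ielm or with C-x C-e.
;; `my-edg-1-groups' is a hypothetical helper, not part of compile.el.
(defun my-edg-1-groups (line)
  "Return the edg-1 subexpression matches for LINE, or nil if no match."
  (let ((re "^\\([^ \n]+\\)(\\([0-9]+\\)): \\(?:error\\|warnin\\(g\\)\\|remar\\(k\\)\\)"))
    (when (string-match re line)
      (list :file    (match-string 1 line)
            :line    (match-string 2 line)
            :warning (match-string 3 line)     ; non-nil only for "warning"
            :info    (match-string 4 line))))) ; non-nil only for "remark"

;; Hypothetical sample messages:
(my-edg-1-groups "foo.c(12): warning: unused variable")
;; => (:file "foo.c" :line "12" :warning "g" :info nil)
(my-edg-1-groups "foo.c(12): error: expected a ';'")
;; => (:file "foo.c" :line "12" :warning nil :info nil)
```

Groups 3 and 4 carry no useful text of their own; they exist only so that TYPE can test whether they matched.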

csharp element info

Looking at your csharp element

"^ *\\(?:[0-9]+>\\)?\\(\\(?:[a-zA-Z]:\\)?[^:(\t\n]+\\)(\\([0-9]+\\),\\([0-9]+\\),[0-9]+,[0-9]+) *\: \\(?:error *CS[0-9]+:\\)"
2 3 4

And your regex (below) actually doesn't have capture group 4 in it.
So, your FILE LINE COLUMN of 2 3 4
probably should be 1 2 3

Here is your regex as its engine sees it -

 ^ 
 [ ]* 
 \(?:
      [0-9]+ > 
 \)?
 \(                            # Group 1
      \(?:
            [a-zA-Z] : 
      \)?
      [^:(\t\n]+ 
 \)
 (                             # Literal '('
      \( [0-9]+ \)                # Group 2
      ,
      \( [0-9]+ \)                # Group 3
      ,
      [0-9]+
      ,
      [0-9]+ 
 )                             # Literal ')'
 [ ]* \: [ ] 
 \(?:
      error [ ]* CS [0-9]+ :
 \)
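The group numbering in the breakdown above can be checked outside compilation mode with `string-match`. The following is a sketch, not a definitive test: the error text after `CS1061:` is abbreviated to `...` here, and the backslashes in the Windows path are doubled only because it is written as a Lisp string.

```elisp
;; Sketch: check the csharp regexp against the sample error line by hand.
(let ((re (concat "^ *\\(?:[0-9]+>\\)?"
                  "\\(\\(?:[a-zA-Z]:\\)?[^:(\t\n]+\\)"         ; group 1: file
                  "(\\([0-9]+\\),\\([0-9]+\\),[0-9]+,[0-9]+)"  ; groups 2, 3: line, column
                  " *: \\(?:error *CS[0-9]+:\\)"))
      ;; Sample line from the question, message text abbreviated.
      (line "40>f:\\Projects\\dev\\source\\Helper.cs(37,22,37,45): error CS1061: ..."))
  (when (string-match re line)
    (list (match-string 1 line)     ; file name
          (match-string 2 line)     ; line
          (match-string 3 line))))  ; column
```

If the form returns the file path followed by "37" and "22", then FILE LINE COLUMN should be given as 1 2 3, because the shy group `\(?: ... \)` around the leading `40>` is not numbered.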
  • Interesting, this makes some sense. I'm not super awesome with regexps, so understanding the breakdown of how they're intended to be read is some help. I'll try doing the same analysis on my csharp regex. Thanks! – RealityMonster Jan 08 '15 at 15:24
  • @RealityMonster - Updated with new info. I was way off before. I think I found a solution for you pertaining to the _TYPE_ field. All these do is parse warning/error compiler messages. It makes sense they would try to pack user control and formatting options in the commands. Sort of like a printf(). –  Jan 08 '15 at 18:36
  • I've been reading over the document too, and I came to the same conclusions. It still doesn't work, though. I tried to simplify things and just feed a filename and a line number into the system, and ignore things like columns and type. Based on that, my csharp regexp should pass in 2 for the filename (because \(?:[0-9]+>\) captures the first spot) and 3 for the line number, but it still doesn't match. (I've also tried 1 for filename and 2 for linenumber in case there are skipping rules that I need to account for. No luck there either.) – RealityMonster Jan 08 '15 at 18:46
  • And, to be clear, re-builder shows that my csharp regexp matches the sample bad line, and the different groups. I wish there were an easier way to show which groups the compiler was matching. – RealityMonster Jan 08 '15 at 18:53
  • @RealityMonster -Your `csharp` element is wrong, I'm going to post some info on it. –  Jan 08 '15 at 18:54
  • @RealityMonster - Also it appears the file,line,column,type,.. fields are optional except for the `file`. So if you want to flesh out fields downstream of `file` but don't have the info, use `nil` which seems to be a universal _nop_ for all their formats. –  Jan 08 '15 at 19:08
1

My crystal ball came up with a weird explanation: compilation-error-regexp-alist-alist is just a collection of matching rules, but it doesn't say which rules to use. So you need to add csharp to compilation-error-regexp-alist if you want to use your new rule.

As for the meaning of (3 . 4), see C-h v compilation-error-regexp-alist.
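As a rough sketch of the two steps described above, something like the following could go in an init file. It assumes the rule is named `csharp` and uses the regexp from the top of the question, written with nothing after `CS[0-9]+:`. If another package already defines a `csharp` entry, this adds a second one in front of it rather than replacing it.

```elisp
(require 'compile)

;; Step 1: define the rule (name -> regexp plus FILE LINE COLUMN indexes).
;; LINE is (2 . 4) and COLUMN is (3 . 5) to cover the start/end ranges.
(add-to-list 'compilation-error-regexp-alist-alist
             '(csharp
               "^ *\\(?:[0-9]+>\\)*\\(\\(?:[a-zA-Z]:\\)?[^:(\t\n]+\\)(\\([0-9]+\\),\\([0-9]+\\),\\([0-9]+\\),\\([0-9]+\\)) *: \\(error\\|warning\\) *CS[0-9]+:"
               1 (2 . 4) (3 . 5)))

;; Step 2: activate the rule; without this the entry above is never consulted.
(add-to-list 'compilation-error-regexp-alist 'csharp)
```

Buffers that are already in compilation mode may need to be recompiled (or the mode re-enabled) before the new rule takes effect.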

Stefan
  • I went to double check what you said, and I THOUGHT I'd added the csharp regexp to the list but I hadn't. But my regexp wasn't quite right to begin with. So between you and sln, my problem is solved. Thanks! – RealityMonster Jan 08 '15 at 18:59