
I'm parsing a long string and want to use a regular expression for part of the parsing.

For simplicity, let's say my regex is <[a-z]*> and I'd like to run it when I get to the first <.

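private static readonly Regex regex = new Regex("<[a-z]*>");

public int FindEnd(string longStr, int index) {
    // Precondition: longStr[index] == '<'

    // The search starts at index but may find a match further along.
    var match = regex.Match(longStr, index);
    if (!match.Success || match.Index != index) {
        throw new Exception("Mismatch");
    }
    return index + match.Length;
}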

I'd like to constrain the regex somehow so that it doesn't scan the entire string, but only tries to match at the given starting point - is this possible? I tried ^<[a-z]*> but that didn't work - it wouldn't match anything unless index pointed to the start of the string.

Note: I'm not trying to parse HTML with a regex.

configurator

2 Answers

I think you're looking for \G<[a-z]*>
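
To illustrate, here is a minimal sketch of the question's FindEnd rewritten around \G. The class name TagScanner, the field name TagAtStart, and the choice of FormatException are assumptions made for this example; they are not from the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagScanner
{
    // \G anchors the match at the position where the search starts,
    // which is the startat argument of Regex.Match(String, Int32).
    private static readonly Regex TagAtStart = new Regex(@"\G<[a-z]*>");

    // Returns the index just past the tag that begins at index.
    public static int FindEnd(string longStr, int index)
    {
        Match match = TagAtStart.Match(longStr, index);
        if (!match.Success)
        {
            // Hypothetical error handling; the question threw a plain Exception.
            throw new FormatException("No tag starts at index " + index);
        }
        // With \G, a successful match always begins at index,
        // so the separate match.Index check is no longer needed.
        return index + match.Length;
    }
}
```

Because the match can only begin at index, the engine should give up as soon as the text there does not fit the pattern instead of scanning the rest of the string. For example, TagScanner.FindEnd("ab<cd>ef", 2) returns 6, while calling it with index 0 throws.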

Joel Rondeau

It's a pity Regex.Match(String, Int32) doesn't treat the index as "^".

What about kludging it with something like this:

re = new Regex( "^.{" + index.ToString() + "}<[a-z]*>", RegexOptions.Singleline );  // Singleline so '.' also skips over newlines

...that is to say, constrain the offset of the start of the match within the regex itself.
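
A rough sketch of how that kludge could slot into FindEnd, for comparison. The class name PrefixSkipScanner and the FormatException are made up for this example, and RegexOptions.Singleline is assumed so that the skipped prefix may contain newlines.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PrefixSkipScanner
{
    // Returns the index just past the tag that begins at index.
    public static int FindEnd(string longStr, int index)
    {
        // Skip exactly index characters from the start, then require the tag.
        // Singleline lets '.' consume newlines inside the skipped prefix.
        var re = new Regex("^.{" + index + "}<[a-z]*>", RegexOptions.Singleline);

        Match match = re.Match(longStr);
        if (!match.Success)
        {
            throw new FormatException("No tag starts at index " + index);
        }
        // The match starts at 0 and spans the prefix plus the tag,
        // so its length is already the end index of the tag.
        return match.Length;
    }
}
```

Note that this builds a new Regex for every distinct index and makes the engine walk over the prefix each time, so it is likely to be slower on long strings.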

UPDATE: Oh, never mind. MSDN's description of "\G" mentions a "previous match", but with Regex.Match(String, Int32) it anchors the match at the given start index, which is exactly what's needed here. Much better solution than mine.