I ran into this problem in what should be a fairly trivial task. HTML text must not contain the characters '<', '>' and '&'. The third one is a puzzle for me. I want to use a regular expression to find all '&' characters, but this character can also appear as the start of an entity name, e.g. &amp;, which is allowed. So my requirement for the regex is to find every '&' that is not part of a sequence of the form &[a-z]+;. I am no regex master, so the best solution I have come up with is this Regex:
Regex _allAmps = new Regex("((&[a-z]*;))|[&]", RegexOptions.Compiled | RegexOptions.IgnoreCase);
...
// Cast<Match>() and Where() need "using System.Linq;"
List<Match> invalidChars = new List<Match>();
invalidChars.AddRange(_allAmps.Matches(htmlText).Cast<Match>().Where(m => m.Value.Length == 1));
But this is a workaround. The regex matches every single '&' as well as every entity name, and I then keep only the single-character matches. Is there a way to write a regular expression that matches only the stray '&' characters directly? I tried a negative lookahead, but that way the regex matched all '&' characters.
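For comparison, a negative lookahead placed directly after the '&' might be sketched as below. This is only a sketch under stated assumptions: entity names are taken to be ASCII letters followed by ';', so numeric references such as &#38; would still be reported, and the names AmpersandCheck and FindBareAmpersands are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class AmpersandCheck
{
    // '&' that is NOT immediately followed by one or more letters and a ';'.
    // Assumption: only named entities like &amp; count as valid.
    private static readonly Regex BareAmp =
        new Regex(@"&(?![a-z]+;)", RegexOptions.Compiled | RegexOptions.IgnoreCase);

    // Hypothetical helper: returns the index of every '&' that does not start a named entity.
    public static int[] FindBareAmpersands(string htmlText)
    {
        return BareAmp.Matches(htmlText)
                      .Cast<Match>()
                      .Select(m => m.Index)
                      .ToArray();
    }

    static void Main()
    {
        // The first '&' is bare, "&amp;" is a valid entity, "&copy" lacks its ';'.
        int[] hits = FindBareAmpersands("Tom & Jerry &amp; friends &copy 2010");
        Console.WriteLine(string.Join(", ", hits)); // prints "4, 26"
    }
}
```

With this pattern every match is already an invalid '&', so no filtering by match length is needed afterwards.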