0

I have a string that contains multiple occurrences of a tagged substring. Each of these substrings has the following format: <c=@someText>Content<c>

Example:

This combination of plain text and <c=@flavor> colored text<c> is valid. <c=@warning>Multiple tags are also valid.<c>

I want to extract each of these substrings with a regex. However, when I use the regex <c=@.+?(?=>)>.*<c>, it matches everything from the first <c... to the last <c>. What I want is each substring as a separate match. How can I do this? And if it can't be done with a regex, what would be the best way to achieve my goal?

Ruhrpottpatriot

2 Answers

1

You can use named capture groups, along with lookaheads and lookbehinds, to grab the 'type' and 'text':

using System;
using System.Text.RegularExpressions;

var pattern = @"(?<=<c=@)(?<type>[^>]+)>(?<text>.+?)(?=<c>)";
var str = @"This combination of plain text and <c=@flavor> colored text<c> is valid. <c=@warning>Multiple tags are also valid.<c>";

// Each match exposes the tag name and its content through the named groups.
foreach (Match match in Regex.Matches(str, pattern))
{
    Console.WriteLine(match.Groups["type"].Value);
    Console.WriteLine(match.Groups["text"].Value);

    Console.WriteLine();
}

output:

flavor
 colored text

warning
Multiple tags are also valid.

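Pattern breakdown:

(?<=<c=@) : Lookbehind; the match must be preceded by <c=@, which is not part of the match itself

(?<type>[^>]+)> : Grab everything up to the next >, call it type

(?<text>.+?) : Lazily grab as few characters as possible until the lookahead succeeds, call it text

(?=<c>) : Lookahead; stop when the next characters are <c>, without consuming them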
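
To see why the pattern from the question swallows everything, here is a minimal sketch that runs it next to a variant whose only change is a lazy .*? in place of the greedy .* (the variable names greedy and lazy are made up for illustration; it assumes C# 9 or later for top-level statements):

```csharp
using System;
using System.Text.RegularExpressions;

var input = "This combination of plain text and <c=@flavor> colored text<c> is valid. <c=@warning>Multiple tags are also valid.<c>";

// Greedy: .* first runs to the end of the input and backtracks to the LAST <c>,
// so the whole tagged region comes back as a single match.
var greedy = Regex.Matches(input, @"<c=@.+?(?=>)>.*<c>");

// Lazy: .*? expands one character at a time and stops at the FIRST <c>
// that follows each opening tag.
var lazy = Regex.Matches(input, @"<c=@.+?(?=>)>.*?<c>");

Console.WriteLine(greedy.Count); // 1
Console.WriteLine(lazy.Count);   // 2

foreach (Match m in lazy)
    Console.WriteLine(m.Value);
```

Note that . does not match a newline by default, so content that spans several lines would additionally need RegexOptions.Singleline.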

Jonesopolis

1

using System.Linq;
using System.Text.RegularExpressions;

string input = @"This combination of plain text and <c=@flavor> colored text<c> is valid. <c=@warning>Multiple tags are also valid.<c>";

// Group 1 is the tag name, group 2 is the tagged content; both quantifiers are lazy.
var matches = Regex.Matches(input, @"<c=@(.+?)>(.+?)<c>")
                .Cast<Match>()
                .Select(m => new
                {
                    Name = m.Groups[1].Value,
                    Value = m.Groups[2].Value
                })
                .ToList();
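
As a possible variation on the code above (a sketch, not part of the original answer): a negated character class [^>]+ keeps the name from running past the end of the opening tag, .*? also accepts empty content, and Trim() drops the leading space seen in " colored text". The names tags, name and value are hypothetical, and the tuple syntax assumes C# 9 or later with top-level statements.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var input = "This combination of plain text and <c=@flavor> colored text<c> is valid. <c=@warning>Multiple tags are also valid.<c>";

// [^>]+ stops at the first '>', so the name can never span more than one tag.
// .*? is lazy, so the content ends at the first <c> after the opening tag.
var tags = Regex.Matches(input, @"<c=@([^>]+)>(.*?)<c>")
                .Cast<Match>()
                .Select(m => (Name: m.Groups[1].Value, Value: m.Groups[2].Value.Trim()))
                .ToList();

foreach (var (name, value) in tags)
    Console.WriteLine($"{name}: {value}");
// flavor: colored text
// warning: Multiple tags are also valid.
```

Whether trimming is wanted depends on whether the whitespace inside a tag is meaningful for your use case.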
EZI
  • This is exactly what I'm looking for. While @Jonesy also has good code, yours is clearer and doesn't need string keys to select the groups. – Ruhrpottpatriot Nov 13 '14 at 20:42