
I have a string of this form:

<div class="c1">text1</div></br>
<div class="c2">text2</div></br>
<div class="c3">text3</div></br>

I want to build a NameValueCollection from it using C# and a regular expression, like this:

 { ("c1","text1"),("c2","text2"),("c3","text3") }.

Right now I can only get the inner text, like this:

 Match match = Regex.Match(inputString, "[^<>]+(?=[<])");

Can someone help me get both the class and the inner text?

Thanks

Glory Raj
Bes-m M-bes
  • A great library for dealing with HTML within C# is http://htmlagilitypack.codeplex.com/ if you are able to leverage something like that. – Aaron McIver Dec 22 '11 at 22:40
  • http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – adt Dec 22 '11 at 22:42

1 Answer

I agree about the HTML Agility Pack, but this answers your question as asked. The pattern is commented, and the matches are put into a dictionary for easy extraction. HTH

using System;
using System.Linq;
using System.Text.RegularExpressions;

string data = @"
<div class=""c1"">text1</div></br>
<div class=""c2"">text2</div></br>
<div class=""c3"">text3</div></br>
";

string pattern = @"
(?:class\=\x22)  # Match, but don't capture, class= and the opening quote
(?<Key>\w+)      # Capture the class name into the Key named group
(?:\x22>)        # Match, but don't capture, the closing quote and >
(?<Value>[^<]+)  # Capture the inner text into the Value named group
";

// IgnorePatternWhitespace lets us comment the pattern; it only changes how
// the pattern is parsed, not how the input is matched.
// Note: ToDictionary throws if the same class name appears twice.
Regex.Matches(data, pattern, RegexOptions.IgnorePatternWhitespace)
     .OfType<Match>()
     .ToDictionary(mt => mt.Groups["Key"].Value, mt => mt.Groups["Value"].Value)
     .ToList()
     .ForEach(kvp => Console.WriteLine("Key {0} Value {1}", kvp.Key, kvp.Value));

/* Output
Key c1 Value text1
Key c2 Value text2
Key c3 Value text3
*/
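
Since the question asks for a NameValueCollection rather than a dictionary, here is a minimal sketch of the same regex approach filling one directly. The class name DivParser and the method name ToNameValueCollection are made up for illustration, and the sketch assumes the input always looks like the sample (double-quoted class attribute, a single class name, no nested tags). For anything less regular, an HTML parser is the safer choice.

```csharp
using System;
using System.Collections.Specialized;
using System.Text.RegularExpressions;

public static class DivParser
{
    // Hypothetical helper, not part of the original answer.
    // Assumes markup shaped like: <div class="c1">text1</div>
    public static NameValueCollection ToNameValueCollection(string html)
    {
        const string pattern = @"
            class=\x22        # literal class= followed by the opening quote
            (?<Key>\w+)       # class name
            \x22>             # closing quote and >
            (?<Value>[^<]+)   # inner text up to the next tag
        ";

        var result = new NameValueCollection();
        foreach (Match m in Regex.Matches(html, pattern, RegexOptions.IgnorePatternWhitespace))
        {
            // Add keeps repeated class names instead of throwing.
            result.Add(m.Groups["Key"].Value, m.Groups["Value"].Value);
        }
        return result;
    }

    public static void Main()
    {
        string data = @"
<div class=""c1"">text1</div></br>
<div class=""c2"">text2</div></br>
<div class=""c3"">text3</div></br>
";
        NameValueCollection pairs = ToNameValueCollection(data);
        foreach (string key in pairs.AllKeys)
        {
            Console.WriteLine("{0} = {1}", key, pairs[key]);
        }
    }
}
```

Unlike ToDictionary, NameValueCollection.Add does not throw when a class name repeats; the values for that key are kept together, which may or may not be what you want.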
ΩmegaMan