I wrote a method that takes a delegate to process some HTML. It only ever handles the first match: the loop never advances and keeps processing that first match forever. Can anyone tell me why? When I move the match while loop outside the delegate, everything works.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.Web;
namespace migration
{
    class Program
    {
        static void Main(string[] args)
        {
            string input = "<a class=\"normal\" href=\"http://msdn.microsoft.com/\" title=\"Home\">Home</a><a class=\"active\" href=\"http://msdn.microsoft.com/library/default.aspx\" title=\"Library\">Library</a><a class=\"normal\" href=\"http://msdn.microsoft.com/bb188199\" title=\"Learn\">Learn</a><a class=\"normal\" href=\"http://code.msdn.microsoft.com/\" title=\"Samples\">Samples</a><a class=\"normal\" href=\"http://msdn.microsoft.com/aa570309\" title=\"Downloads\">Downloads</a><a class=\"normal\" href=\"http://msdn.microsoft.com/hh361695\" title=\"Support\">Support</a><a class=\"normal\" href=\"http://msdn.microsoft.com/aa497440\" title=\"Community\">Community</a><a class=\"normal\" href=\"http://social.msdn.microsoft.com/forums/en-us/categories\" title=\"Forums\">Forums</a>";


            HTMLStringWalkThrough(input, "<a.+?</a>", "", PrintTest);


        }
        public static string HTMLStringWalkThrough(string HTMLString, string pattern, string replacement, HTMLStringProcessDelegate p)
        {
            StringBuilder sb = new StringBuilder();
            Match m = Regex.Match(HTMLString, pattern);

            while (m.Success)
            {
                string temp = m.Value;
                p(temp, replacement);
                m.NextMatch();
            }
            return sb.ToString();
        }
        public delegate void HTMLStringProcessDelegate(string input, string replacement);
        static void PrintTest(string tag, string replacement)
        {
            Console.WriteLine(tag);
        }
    }
}
//output:
//<a class=\"normal\" href=\"http://msdn.microsoft.com/\" title=\"Home\">Home</a>
//<a class=\"normal\" href=\"http://msdn.microsoft.com/\" title=\"Home\">Home</a>
//<a class=\"normal\" href=\"http://msdn.microsoft.com/\" title=\"Home\">Home</a>
//<a class=\"normal\" href=\"http://msdn.microsoft.com/\" title=\"Home\">Home</a>
//<a class=\"normal\" href=\"http://msdn.microsoft.com/\" title=\"Home\">Home</a>
//<a class=\"normal\" href=\"http://msdn.microsoft.com/\" title=\"Home\">Home</a>
//<a class=\"normal\" href=\"http://msdn.microsoft.com/\" title=\"Home\">Home</a>
//.........
Alan Moore
Hoy Cheung
[obligatory warning not to use regex to parse HTML content](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – Servy Apr 29 '13 at 15:35

3 Answers

You need to assign the result of Match.NextMatch() back to your variable. It returns the next match and doesn't change the current Match:

 m = m.NextMatch();
Reed Copsey

NextMatch returns the next match but you don't use its return value at all. Change this and your code should work:

m = m.NextMatch();

Please see the documentation, specifically the Note in the Remarks section:

This method does not modify the current instance. Instead, it returns a new Match object that contains information about the next match.
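
To see that behavior in isolation, here is a minimal sketch. The class name `NextMatchDemo`, the helper `FirstTwo`, and the sample input are made up for illustration; only `Regex.Match`, `Match.NextMatch` and `Match.Value` come from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

static class NextMatchDemo
{
    // Hypothetical helper: returns the value still held by the original Match
    // after a discarded NextMatch() call, plus the value of the Match that
    // NextMatch() actually returned.
    public static Tuple<string, string> FirstTwo(string input, string pattern)
    {
        Match first = Regex.Match(input, pattern);

        // Discarding the return value leaves 'first' exactly where it was.
        first.NextMatch();
        string afterDiscardedCall = first.Value;

        // Keeping the return value is what advances through the input.
        Match second = first.NextMatch();

        return Tuple.Create(afterDiscardedCall, second.Value);
    }

    static void Main()
    {
        Tuple<string, string> result = FirstTwo("a1b2c3", @"\d");
        Console.WriteLine(result.Item1); // prints 1
        Console.WriteLine(result.Item2); // prints 2
    }
}
```

In the question's loop the return value is thrown away on every iteration, so `m.Success` stays true and `m.Value` never changes.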

Daniel Hilgarth

Try changing the loop to:

        while (m.Success)
        {
            string temp = m.Value;
            p(temp, replacement);
            m = m.NextMatch();
        }
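
If you would rather not manage the `Match` variable by hand, one alternative is to iterate over `Regex.Matches`, which leaves nothing to forget to reassign. This is only a sketch: `MatchWalker`, `ForEachMatch`, and the shortened sample HTML are hypothetical names and data, and `Action<string>` stands in for the question's custom delegate.

```csharp
using System;
using System.Text.RegularExpressions;

static class MatchWalker
{
    // Hypothetical helper: calls 'handler' once per match, in order,
    // and returns how many matches were processed.
    public static int ForEachMatch(string input, string pattern, Action<string> handler)
    {
        int count = 0;
        foreach (Match m in Regex.Matches(input, pattern))
        {
            handler(m.Value);
            count++;
        }
        return count;
    }
}

class Program
{
    static void Main()
    {
        // Shortened sample input, not the full string from the question.
        string html = "<a href=\"/one\">One</a><a href=\"/two\">Two</a>";

        // Prints each anchor tag on its own line.
        int n = MatchWalker.ForEachMatch(html, "<a.+?</a>", Console.WriteLine);

        Console.WriteLine(n); // prints 2
    }
}
```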
Justin Harvey