Current code:

public static void WhoIsOnline(string worldName, WhoIsOnlineReceived callback)
{
    string url = "http://www.tibia.com/community/?subtopic=worlds&world=" + worldName;
    HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(url);

    request.BeginGetResponse(delegate(IAsyncResult ar)
    {
        string html = GetHTML(ar);

        MatchCollection matches = Regex.Matches(html, @"<TD WIDTH=70%><[^<]*>([^<]*)</A></TD><TD WIDTH=10%>([^<]*)</TD><TD WIDTH=20%>([^<]*)</TD></TR>");
        List<CharOnline> chars = new List<CharOnline>(matches.Count);
        CharOnline co;

        for (int i = 0; i < matches.Count; i++)
        {
            co = new CharOnline();
            co.Name = Prepare(matches[i].Groups[1].Value);
            co.Level = int.Parse(matches[i].Groups[2].Value);
            co.Vocation = Prepare(matches[i].Groups[3].Value);
            chars.Add(co);
        }

        callback(chars);
    }, request);
}

I was using this to scrape the online list, but they have changed their layout and I'm not sure how to change the regex to get the same information.

This is the page I am trying to scrape:

http://www.tibia.com/community/?subtopic=worlds&world=Libera

Alex Mack
  • Why are you using regex to parse HTML? Take a look at the [HTML Agility Pack](http://htmlagilitypack.codeplex.com/), it does what you need, and more, in a robust way. – Tomalak Oct 23 '11 at 11:01
  • I'm trying to retrieve the player name, vocation, and level. Will the Agility Pack make this easier? – Alex Mack Oct 23 '11 at 11:04
  • @Alex Yes. And fail-proof. And more maintainable (especially since regex does not seem to be your strong point). Probably even in fewer lines of code. See [this question](http://stackoverflow.com/questions/846994/how-to-use-html-agility-pack) for an overview of how the Agility Pack works. – Tomalak Oct 23 '11 at 11:06
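
To make the Agility Pack suggestion in the comments concrete, here is a minimal sketch of how the same three fields might be extracted with it. It assumes the HtmlAgilityPack package is referenced, and it assumes (without having verified the live page) that each character link contains "subtopic=characters" in its href and sits in a table row whose next two cells hold the level and the vocation. The CharOnline class below is a stand-in for the one in the question, and OnlineListParser and the sample row are hypothetical names and data made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // third-party package, not part of the .NET base class library

// Stand-in for the CharOnline type used in the question.
class CharOnline
{
    public string Name;
    public int Level;
    public string Vocation;
}

static class OnlineListParser
{
    public static List<CharOnline> Parse(string html)
    {
        var result = new List<CharOnline>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var links = doc.DocumentNode.SelectNodes("//a[contains(@href, 'subtopic=characters')]");
        if (links == null) return result;

        foreach (HtmlNode link in links)
        {
            // Assumption: the link lives in the first cell of a row,
            // followed by a level cell and a vocation cell.
            HtmlNode row = link.Ancestors("tr").FirstOrDefault();
            if (row == null) continue;

            List<HtmlNode> cells = row.Elements("td").ToList();
            int level;
            if (cells.Count < 3 || !int.TryParse(cells[1].InnerText.Trim(), out level)) continue;

            result.Add(new CharOnline
            {
                Name = HtmlEntity.DeEntitize(link.InnerText).Trim(),
                Level = level,
                Vocation = HtmlEntity.DeEntitize(cells[2].InnerText).Trim()
            });
        }
        return result;
    }

    static void Main()
    {
        // Hypothetical row for illustration only; the real markup may differ.
        string html = "<table><tr><td><a href=\"?subtopic=characters&name=Some+Player\">Some Player</a></td>" +
                      "<td>120</td><td>Elite Knight</td></tr></table>";

        foreach (CharOnline c in Parse(html))
            Console.WriteLine("{0} | {1} | {2}", c.Name, c.Level, c.Vocation);
    }
}
```

Because the lookup keys on the link target and the row structure rather than on exact attributes such as WIDTH=70%, a cosmetic layout change is less likely to break it than the original pattern.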

1 Answer

As the others said, proper HTML parsing is much more robust and definitely a better way to go.

However, this should work:

MatchCollection matches = Regex.Matches(html, @"<a href="".*?subtopic=characters&name=.*?"".*?>(.*?)</a>.*?<td.*?>(\d+)</td><td.*?>(.*?)</td>");
Sebastian Paaske Tørholm
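
As a quick way to try the pattern from the answer without hitting the site, the following sketch runs it against a single hard-coded row. The row is hypothetical, written only to match the shape the pattern expects (a character link followed by a level cell and a vocation cell), so the real page markup may differ. OnlineListRegexDemo is likewise a made-up name.

```csharp
using System;
using System.Text.RegularExpressions;

class OnlineListRegexDemo
{
    static void Main()
    {
        // Hypothetical row for illustration only; not copied from the real page.
        string html =
            "<tr><td style=\"width:70%\"><a href=\"http://www.tibia.com/community/?subtopic=characters&name=Some+Player\">Some Player</a></td>" +
            "<td style=\"width:10%\">120</td><td style=\"width:20%\">Elite Knight</td></tr>";

        // Same pattern as in the answer: group 1 = name, group 2 = level, group 3 = vocation.
        string pattern = @"<a href="".*?subtopic=characters&name=.*?"".*?>(.*?)</a>.*?<td.*?>(\d+)</td><td.*?>(.*?)</td>";

        foreach (Match m in Regex.Matches(html, pattern))
        {
            int level;
            // TryParse skips a row instead of throwing if the level cell is not numeric.
            if (!int.TryParse(m.Groups[2].Value, out level)) continue;

            Console.WriteLine("{0} | {1} | {2}", m.Groups[1].Value, level, m.Groups[3].Value);
        }
    }
}
```

For the sample row this prints "Some Player | 120 | Elite Knight". Two caveats: by default `.` does not match a newline, so if a row on the real page spans several lines the call would need RegexOptions.Singleline; and the captured values are raw HTML text, so they would still need the cleanup that Prepare performs in the question's code.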