
I've been reading the MSDN documentation on regular expressions in .NET, but I'm having trouble figuring out the right pattern. I need to extract ' width="200" height="200" />' from an HTML file, and the match has to include the quotes. What's the correct pattern to use?

1 Answer


Given a specific HTML page, you can craft a regex that pulls the attributes from that page. But if you have only one specific page, you can just hard-code its attribute values. You probably want to be able to pull the attribute values from any page, right? You can't do that with regular expressions, because attribute order, quoting, and whitespace vary from page to page. Really, you can't, and trying to do so will lead you into an infinite loop of failure.
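To illustrate the single-page case only, here is a minimal sketch of a pattern for the exact text in the question. The class and method names are made up for the example, and it assumes the attributes always appear in this order, with double quotes, inside a self-closing tag. Anything else will not match.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative, not from the question.
public static class FixedAttributeMatch
{
    // Verbatim string: "" is an escaped double quote.
    // \s+ and \s* tolerate whitespace differences; the digits are
    // captured rather than hard-coded to 200.
    private static readonly Regex SizePattern =
        new Regex(@"\s+width=""(\d+)""\s+height=""(\d+)""\s*/>");

    // Returns the matched text, quotes included, or null when absent.
    public static string FindSizeAttributes(string html)
    {
        Match m = SizePattern.Match(html);
        return m.Success ? m.Value : null;
    }
}
```

`m.Groups[1].Value` and `m.Groups[2].Value` would give the two numbers on their own if you only need the values.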

Use the HTML Agility Pack; it's designed to do exactly what you asked, even with ill-formed real-world HTML.
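A rough sketch of what that could look like, assuming the HtmlAgilityPack NuGet package is referenced and that the attributes sit on an `<img>` element. The question never says which tag it is, so the `//img` XPath and the helper names are guesses for illustration.

```csharp
using System;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack"

// Hypothetical helper; the names are illustrative, not from the question.
public static class ImgAttributes
{
    // Returns the width and height of the first <img> element,
    // or null when the document contains no <img>.
    public static Tuple<string, string> ReadFirstImgSize(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Assumption: the tag is <img>. SelectSingleNode returns null
        // when nothing matches the XPath expression.
        HtmlNode img = doc.DocumentNode.SelectSingleNode("//img");
        if (img == null) return null;

        // The second argument is the default used when the attribute is missing.
        return Tuple.Create(
            img.GetAttributeValue("width", ""),
            img.GetAttributeValue("height", ""));
    }
}
```

Because the parser reads attributes by name, this keeps working if the attribute order, quoting style, or spacing changes.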

Dour High Arch
  • The HTML file is always the same and always has the same attributes. I just need to match ' width="200" height="200" ' in the file. I looked into the HTML Agility Pack. Thanks for the heads up! – user2510712 Jul 09 '13 at 00:50
  • If they are always the same, then hard-code `200` in your application. If that won't work then regexes will not work and you will have to use the Agility Pack. – Dour High Arch Jul 09 '13 at 00:54
  • Alright, thanks. I'm gonna have to approach this differently... – user2510712 Jul 09 '13 at 01:01