I often need to extract the value of an element from an HTML page. Something like this:

<!-- many html here -->
<input type="hidden" name="id" value="ExtractMe!">
<!-- many html here -->

How can I extract the value easily?

Xaqron

4 Answers

Have a look at the Html Agility Pack; it makes this type of task very easy and regex-free.
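
As a rough sketch of what that might look like for the hidden input in the question: the snippet below assumes the HtmlAgilityPack NuGet package is referenced, and the class name `HapExample` and helper `ExtractValue` are made up for illustration.

```csharp
using System;
using HtmlAgilityPack; // third-party NuGet package, not part of .NET itself

public static class HapExample
{
    // Hypothetical helper: returns the value attribute of the first
    // <input> whose name attribute equals inputName, or null if none exists.
    public static string ExtractValue(string html, string inputName)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // tolerant parser, copes with unclosed tags

        // XPath query for the first matching <input> element.
        HtmlNode node = doc.DocumentNode.SelectSingleNode(
            "//input[@name='" + inputName + "']");

        // An empty string is returned if the input has no value attribute.
        return node == null ? null : node.GetAttributeValue("value", "");
    }

    public static void Main()
    {
        string html = "<p>many html here</p>" +
                      "<input type=\"hidden\" name=\"id\" value=\"ExtractMe!\">" +
                      "<p>many html here</p>";
        Console.WriteLine(ExtractValue(html, "id"));
    }
}
```

Because the parser is forgiving, the unclosed `<input>` tag from the question should not need any clean-up first.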

Chris S

If you need to parse HTML within your C# application, consider using HtmlAgilityPack, available here: http://htmlagilitypack.codeplex.com/

DanielB

If you just want to pluck values out, you're probably best off parsing this as XML. You have a choice of the standard XML classes or LINQ to XML.

Have a look here or here for some examples.
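
A minimal LINQ to XML sketch, under one big assumption: the page has to be well-formed XML (XHTML), so the `<input>` must be self-closed. The unclosed tag from the question would make `XDocument.Parse` throw. The names `XmlExample` and `ExtractValue` are only illustrative.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XmlExample
{
    // Hypothetical helper: returns the value attribute of the first
    // <input> element with the given name, or null if there is none.
    // Assumes well-formed markup with no default XML namespace.
    public static string ExtractValue(string xhtml, string inputName)
    {
        XDocument doc = XDocument.Parse(xhtml);
        return doc.Descendants("input")
                  .Where(e => (string)e.Attribute("name") == inputName)
                  .Select(e => (string)e.Attribute("value"))
                  .FirstOrDefault();
    }

    public static void Main()
    {
        // Note the self-closing "/>"; plain HTML often omits it.
        string xhtml = "<form><input type=\"hidden\" name=\"id\" value=\"ExtractMe!\" /></form>";
        Console.WriteLine(ExtractValue(xhtml, "id")); // prints "ExtractMe!"
    }
}
```

If the document declares the XHTML namespace, the element name would need to be qualified with it (an `XNamespace` plus `"input"`), which this sketch leaves out.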

Antony Woods

Why don't you use regular expressions? This is the MSDN regular expression documentation; in there, look for the section Extracting a Single Match or the First Match.
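
For the exact tag in the question, a single-match sketch could look like the following. The pattern is my own assumption, not something from the MSDN page: it expects double-quoted attributes and `name` appearing before `value`, so it will miss tags written any other way. `RegexExample` and `ExtractValue` are illustrative names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexExample
{
    // Assumed pattern: an <input> tag with name="id" followed, somewhere
    // later in the same tag, by value="..."; group 1 captures the value.
    private const string Pattern =
        @"<input[^>]*\bname=""id""[^>]*\bvalue=""([^""]*)""";

    // Hypothetical helper: returns the first captured value, or null.
    public static string ExtractValue(string html)
    {
        Match m = Regex.Match(html, Pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        string html = "<input type=\"hidden\" name=\"id\" value=\"ExtractMe!\">";
        Console.WriteLine(ExtractValue(html)); // prints "ExtractMe!"
    }
}
```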

PedroC88
  • Regex eats CPU and implementation is not easy. – Xaqron May 10 '11 at 14:45
  • "Regex eats CPU" - so are you planning on running this 100,000s of times? Is performance a factor? – Lee Gunn May 10 '11 at 14:53
  • Using Regex in .NET is fairly easy, since the classes are already there. Writing the right pattern is perhaps trickier, but there are tools (and Stack Overflow) to help you with that. As for performance, @Lee Gunn asked the right questions. – PedroC88 May 10 '11 at 14:58