
I have a string:

<a href = "http://www.zigwheels.com/reviews/long-term-reviews/fiat-linea/8804-100-1/1">
  <img src="http://static.zigwheels.com/media/content/2011/Jul/fiatlinealt_1_560x420.jpg" />
</a> 
<p>
  To sum it up in a nutshell, the Fiat Linea is a spacious family car that 
  rewards you with its space and fuel efficiency, while maintaining 
  decent levels of performance as well
</p>

I need only the text inside the <p> tag. How can I extract it using plain VB in a VB.NET Windows application?

Kjartan
Jackson Lopes

2 Answers


It depends on the input data, but for simple cases like this one you could use a regular expression that matches the text between the tags.

Imports System.Text.RegularExpressions

Dim input As String = ... ' Your string
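' Singleline lets "." match the line breaks inside <p>; the lazy ".*?" stops at the first </p>
Dim match As Match = Regex.Match(input, "<p>(?<content>.*?)</p>", RegexOptions.Singleline)
If match.Success Then
    Dim content As String = match.Groups("content").Value.Trim() ' The text between <p> and </p>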
End If

This is, of course, not a solution for parsing HTML; for that you need an HTML parser. It works only for very simple strings like the one you provided. If the input is more complex (for example, nested or multiple <p> tags, or attributes on the tag) or you need more elaborate matching, use a different solution.
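As a minimal, self-contained sketch of the regex approach: the module and function names (`ParagraphDemo`, `ExtractParagraphText`) are hypothetical, and the sample string is a shortened stand-in for the one in the question. `RegexOptions.Singleline`, `RegexOptions.IgnoreCase` and the lazy `.*?` are assumptions added here so that the pattern can span the line breaks inside the <p> element and stop at the first </p>.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ParagraphDemo
    ' Hypothetical helper: returns the trimmed text between the first <p>
    ' and the next </p>, or Nothing when no such pair is found.
    Function ExtractParagraphText(ByVal html As String) As String
        ' Singleline lets "." match newlines; ".*?" stops at the first </p>.
        Dim options As RegexOptions = RegexOptions.Singleline Or RegexOptions.IgnoreCase
        Dim m As Match = Regex.Match(html, "<p>(?<content>.*?)</p>", options)
        If Not m.Success Then Return Nothing
        Return m.Groups("content").Value.Trim()
    End Function

    Sub Main()
        ' Shortened stand-in for the string in the question.
        Dim nl As String = Environment.NewLine
        Dim input As String = "<a href=""#""><img src=""x.jpg"" /></a>" & nl &
                              "<p>" & nl &
                              "  Spacious family car" & nl &
                              "</p>"
        Console.WriteLine(ExtractParagraphText(input)) ' prints "Spacious family car"
    End Sub
End Module
```

Trim() only strips whitespace at the two ends, so line breaks inside a longer paragraph are kept as they are. A <p> tag that carries attributes would not match this pattern, which is one reason to prefer a parser for anything less regular.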

  • One thing: I don't understand why someone would parse HTML with a regex when a proper HTML parser is available. – Pradip Apr 09 '13 at 06:14
  • @PradipKT: It's not parsing, it's matching. If all you need is to match the content between two strings, a regular expression will do. If you need anything more than that, you should use a parser. –  Apr 09 '13 at 06:33

You can use the HTML Agility Pack. Here is an example in C#:

using System.Linq;
using HtmlAgilityPack;

HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml("Get the entire string here");
// InnerText of every <p> element in the document
var xyz = from x in htmlDoc.DocumentNode.DescendantNodes()
          where x.Name == "p"
          select x.InnerText;

This gives you the text of each <p> element. You can find more help at the following link:

http://htmlagilitypack.codeplex.com/

EDIT: VB.NET

Imports HtmlAgilityPack

Dim htmlDoc As New HtmlDocument()
htmlDoc.LoadHtml("Get the entire string here")
Dim xyz = From x In htmlDoc.DocumentNode.DescendantNodes() Where x.Name = "p" Select x.InnerText
Pradip