Could someone help me with a regex to detect a pattern in C#? The input string has the form:

<p><someURL></p>

I want to check whether the input contains the content <someURL> (including the angle brackets), so I need a regex that detects it.

heijp06
  • Please elaborate on your question a little more. What is your requirement, and what code have you tried so far? – Karan May 05 '15 at 11:38
  • I want to check whether the input string has the content <someURL>. Somehow that part got left out of the question! –  May 05 '15 at 11:44
  • Rule 1: don't use RegEx to parse HTML. Rule 2: if you still want to parse HTML with RegEx, see rule 1. [RegEx can only match regular languages, and HTML is not a regular language](http://stackoverflow.com/a/590789/930393) – freefaller May 05 '15 at 11:46

2 Answers

You can obtain the <someURL> part between the <p> and </p> tags by using:

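// Requires: using System.Text.RegularExpressions;
// Split on opening and closing <p> tags; element [1] is the text between them.
var rxx = new Regex(@"</?p\b[^<]*>");
var result = rxx.Split("<p><someURL></p>")[1];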

Output: <someURL>

Note that if the input contains tags other than <p>, you will need to adjust the </?p\b[^<]*> regex. If the input contains several <p> elements, use Matches instead:

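// Requires: using System.Linq; (for Cast<T> and ToList)
// The lookbehind and lookahead keep the <p> tags out of each match.
rxx = new Regex(@"(?<=<p\b[^<]*>).*?(?=</p>)");
var matches = rxx.Matches("<p><someURL></p><p><anotherURL></p>").Cast<Match>().ToList();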

Output: two matches, <someURL> and <anotherURL>.
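
Since the question only asks whether a <p> element wraps a bracketed token, the same lookaround idea can be turned into a yes/no check. The following is a minimal sketch rather than a definitive implementation: the class name BracketedContent and its method names are made up for illustration, and it assumes the token contains no nested tags or whitespace.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class BracketedContent
{
    // Lookbehind for an opening <p ...> tag, lookahead for </p>.
    // Assumption: the body is a single <...> token without '<', '>' or whitespace inside.
    private static readonly Regex Rx =
        new Regex(@"(?<=<p\b[^<]*>)<[^<>\s]+>(?=</p>)");

    // True if at least one <p> element contains exactly one bracketed token.
    public static bool HasBracketedContent(string input)
    {
        return Rx.IsMatch(input);
    }

    // All bracketed tokens found directly inside <p>...</p>, brackets included.
    public static List<string> FindAll(string input)
    {
        return Rx.Matches(input).Cast<Match>().Select(m => m.Value).ToList();
    }
}
```

With this sketch, BracketedContent.HasBracketedContent("<p><someURL></p>") returns true, while "<p>plain text</p>" returns false. Using IsMatch avoids building a list when only the presence of the pattern matters.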

If you have to deal with entire HTML/XML/SGML and other .*ML documents, a real parser such as HtmlAgilityPack is the better way to go.

Wiktor Stribiżew

Regex pattern:

\<(.*?)>

This captures the text between each pair of angle brackets in group 1. You can then iterate over the matches in a foreach loop and pick out the text of the elements you want.

Sample - http://regexr.com/3aufj
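
The foreach loop mentioned above might look roughly like this. It is a sketch under stated assumptions: the class AngleBracketGroups, the method InnerTexts, and the rule of skipping "p" and "/p" are hypothetical choices for illustration, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AngleBracketGroups
{
    // Returns the text inside every <...> pair, skipping the <p> and </p> tags themselves.
    public static List<string> InnerTexts(string input)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, @"\<(.*?)>"))
        {
            // Group 1 holds the text between '<' and '>'.
            string inner = m.Groups[1].Value;

            // Hypothetical filter: treat "p" and "/p" as markup, everything else as content.
            if (inner != "p" && inner != "/p")
            {
                results.Add(inner);
            }
        }
        return results;
    }
}
```

For the input "<p><someURL></p>" this returns a list with the single entry someURL. Real markup with attributes or other tags would need a broader filter, which is one reason a parser is usually the safer route.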

OR

Use https://htmlagilitypack.codeplex.com/ to parse the HTML string into an object and navigate its structure on the server side.

Karan