1

I have the following HTML as input:

<p>Hello</p>
<p>How are you?</p>
<div>Hello again</div>

How can I output only "Hello" from this (that is, only the content of the first p tag)? And how can I access only the content of the second p tag?

So the output should be:

string p1 = "Hello"
string p2 = "How are you?"

My code so far, which fails with errors:

using System.Text.RegularExpressions;
string p1 = Regex.Match("<p>(.*?)</p>"[0], myString);
string p2 = Regex.Match("<p>(.*?)</p>"[1], myString);
Lee Cheung
  • I looked at it, but it doesn't tell me how to access a specific tag. – Lee Cheung May 08 '19 at 23:03
  • Why would you regex an HTML doc? You have plenty of tools for this even using the base [HtmlDocument](https://learn.microsoft.com/en-us/dotnet/api/system.windows.forms.htmldocument) class (with [GetElementsByTagName](https://learn.microsoft.com/en-us/dotnet/api/system.windows.forms.htmldocument.getelementsbytagname), for example). Or get [HtmlAgilityPack](https://html-agility-pack.net/). – Jimi May 08 '19 at 23:04
  • I work as a chef at a small restaurant. I am not a programmer; I am just trying to solve this small problem, and I don't know how else to do it. – Lee Cheung May 08 '19 at 23:05
  • You swapped the regex and the input string. Use `Regex.Match(myString, "(?s)<p>(.*?)</p>").Groups[1].Value`. To really parse HTML, you will have to learn some programming, or you'll fail in the long run. – Wiktor Stribiżew May 08 '19 at 23:09
  • Listen to what people are telling you. RegEx is _not_ a good tool to parse HTML. There are far better libraries, such as HtmlAgilityPack, that will make this job much easier, and the code will work much better. These should be fairly easy to learn even with basic coding skills. – Mike Christensen May 09 '19 at 01:11

2 Answers

0

You could add an id attribute (for example id="element1") to each element and then select the element by that id, like so:

JavaScript:

let p1 = document.getElementById("element1").textContent;

HTML:

<p id="element1"> </p>
Christopher
0

I think you might be looking for something like this:

Regex r = new Regex("<p>(.*?)<\\/p>");
string p1 = r.Matches(myString)[0].Groups[1].Value;
string p2 = r.Matches(myString)[1].Groups[1].Value;

The output looks like this:

Hello
How are you?

Keep in mind, though, that this isn't the most robust method: indexing [0] and [1] throws an exception if the input contains fewer than two matches. Iterating through the results avoids that:

foreach (Match m in r.Matches(myString))
{
    Console.WriteLine(m.Groups[1].Value);
}
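
The loop above can be wrapped in a small helper that collects every paragraph first, so the count can be checked before indexing. This is only a minimal sketch: the names ParagraphExtractor and GetParagraphs are hypothetical, and it assumes plain <p> tags without attributes or nesting. For anything more complex, an HTML parser such as the HtmlAgilityPack mentioned in the comments is the safer choice.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ParagraphExtractor
{
    // Hypothetical helper: returns the inner text of every <p>...</p> pair, in document order.
    // Assumes bare <p> tags with no attributes and no nested <p> elements.
    public static List<string> GetParagraphs(string html)
    {
        var results = new List<string>();
        // Singleline lets '.' match newlines, so paragraphs spanning several lines are captured too.
        foreach (Match m in Regex.Matches(html, "<p>(.*?)</p>", RegexOptions.Singleline))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }
}

static class Program
{
    static void Main()
    {
        string myString = "<p>Hello</p>\n<p>How are you?</p>\n<div>Hello again</div>";
        List<string> paragraphs = ParagraphExtractor.GetParagraphs(myString);

        // Check the count before indexing so that a missing paragraph does not throw.
        string p1 = paragraphs.Count > 0 ? paragraphs[0] : null;
        string p2 = paragraphs.Count > 1 ? paragraphs[1] : null;

        Console.WriteLine(p1); // Hello
        Console.WriteLine(p2); // How are you?
    }
}
```

Collecting the matches into a list once also avoids running the regex over the input a second time for the second paragraph.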
Jack Casey