
I have a value I am pulling from a database:

 <iframe width="420" height="315" src="//www.youtube.com/embed/8GRDA1gG8R8" frameborder="0" allowfullscreen></iframe>

I am trying to get the src as a value using regex.

Regex.Match(details.Tables["MarketingDetails"].Rows[0]["MarketingVideo"].ToString(), "\\\"([^\\\"]*)\\\"").Groups[2].Value

That is how I am currently writing it.

How would I write this to pull the correct value of src?

Corey Toolis
  • Why exactly do you want to use `Regex` here? Since it has an XML structure, why not pass it to an `XDocument` instance? – Dion V. Sep 24 '14 at 19:17
  • [Obligatory "don't parse html with regex" link](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags). Really, use an html parser instead. – gunr2171 Sep 24 '14 at 19:18
  • Of course, this is HTML, stupid me. I've been working with XML a lot lately! Better to use an HTML parser indeed. – Dion V. Sep 24 '14 at 19:27
  • possible duplicate of [What is the best way to parse html in C#?](http://stackoverflow.com/questions/56107/what-is-the-best-way-to-parse-html-in-c) – Zack Sep 24 '14 at 19:37

2 Answers


You could do it like this:

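// Keep the quotes outside the capture group so that group 1 holds only the attribute value.
Match match = Regex.Match(@"<iframe width=""420"" height=""315"" src=""//www.youtube.com/embed/8GRDA1gG8R8"" frameborder=""0"" allowfullscreen></iframe>", @"src=""([^""]*)""");

Console.WriteLine(match.Groups[1].Value); // prints //www.youtube.com/embed/8GRDA1gG8R8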

However, as others have already commented on your question, it's better practice to use an actual HTML parser.

Aydin

Don't use regex to parse XML or HTML. It's not worth it. I'll let you read this post; it somewhat exaggerates the point, but the main thing to keep in mind is that you can get into a lot of trouble with regex and HTML.

So, instead you should use an actual HTML/XML parser! For starters, use XElement, a class built into the .NET Framework (in the System.Xml.Linq namespace).

string input = "<iframe width=\"420\" height=\"315\" src=\"//www.youtube.com/embed/8GRDA1gG8R8\" frameborder=\"0\" allowfullscreen=''></iframe>";

XElement html = XElement.Parse(input);
string src = html.Attribute("src").Value;

This will make src have the value //www.youtube.com/embed/8GRDA1gG8R8. You can then split that up to get whatever you need from it.
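As a rough sketch of the "split that up" step, the helper below pulls the last path segment (the video ID in this example) out of the src value. The class and method names are made up for illustration, and it assumes the ID is always the final path segment, as it is in the embed URL above.

```csharp
using System;

public static class VideoSrc
{
    // Hypothetical helper: returns the last path segment of an embed URL.
    // Assumes the video ID is the final segment, as in "//www.youtube.com/embed/8GRDA1gG8R8".
    public static string ExtractVideoId(string src)
    {
        if (string.IsNullOrEmpty(src)) return null;

        // A protocol-relative URL ("//host/path") has no scheme and may be
        // interpreted as a UNC-style path, so prepend a scheme before parsing.
        string absolute = src.StartsWith("//", StringComparison.Ordinal) ? "https:" + src : src;

        Uri uri;
        if (!Uri.TryCreate(absolute, UriKind.Absolute, out uri)) return null;

        // For "/embed/8GRDA1gG8R8" the segments are "/", "embed/", "8GRDA1gG8R8".
        string[] segments = uri.Segments;
        return segments[segments.Length - 1].TrimEnd('/');
    }
}
```

Calling VideoSrc.ExtractVideoId(src) with the src value from the example should give 8GRDA1gG8R8. Going through Uri rather than string.Split means a query string on the URL is not mixed into the result.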

I should also note that your input is not valid XML: allowfullscreen does not have a value attached, which is why I added =''.

If you need to handle more complex input, or input like yours that is not valid XML, use an HTML parser (XElement is meant for XML). Here is the same example using the Html Agility Pack:

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(input);

string src = doc.DocumentNode
    .Element("iframe")
    .Attributes["src"]
    .Value;

This parser is more forgiving of invalid, incorrect, or just irregular input. It will parse your original input just fine, without the added =''.
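If the chained lookup throws a null reference, one possible cause is that Element("iframe") only looks at direct children of the document node, or that the src attribute is missing. The following is a minimal defensive sketch, assuming the Html Agility Pack package is referenced; the IframeSrc class and GetIframeSrc method are hypothetical names, not part of the library.

```csharp
using HtmlAgilityPack;

public static class IframeSrc
{
    // Hypothetical helper: returns the src of the first iframe found, or null if there is none.
    public static string GetIframeSrc(string html)
    {
        if (string.IsNullOrWhiteSpace(html)) return null;

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // "//iframe" matches an iframe at any depth; SelectSingleNode returns null when nothing matches.
        HtmlNode iframe = doc.DocumentNode.SelectSingleNode("//iframe");
        if (iframe == null) return null;

        // GetAttributeValue returns the supplied default when the attribute is absent.
        string src = iframe.GetAttributeValue("src", string.Empty);
        return src.Length == 0 ? null : src;
    }
}
```

Each step is checked separately here, so stepping through it in a debugger shows whether the markup, the iframe node, or the src attribute is the missing piece.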

gunr2171
  • http://msdn.microsoft.com/en-us/library/ie/dn312070(v=vs.85).aspx According to Microsoft, allowfullscreen is true when it exists and false when it doesn't; there is no value to assign. – Zack Sep 24 '14 at 19:31
  • @Zack, updated to include the html parser, which will parse the original input correctly. – gunr2171 Sep 24 '14 at 19:32
  • `var videoCode = details.Tables["MarketingDetails"].Rows[0]["MarketingVideo"].ToString().Replace("\"","'"); doc.LoadHtml(videoCode); video.Attributes["src"] = doc.DocumentNode.Element("iframe").Attributes["src"].Value;` @gunr2171 this code will not work; I keep getting an object reference error. Is there something I am missing? – Corey Toolis Sep 25 '14 at 15:37
  • @CoreyToolis, I don't know, you have not given me enough information to reproduce your issue. Try debugging to see which value is null. – gunr2171 Sep 25 '14 at 15:52
  • @gunr2171 videoCode is not null; it gives the iframe value I have listed above. The null value is `doc.DocumentNode.Element("iframe").Attributes["src"].Value`. – Corey Toolis Sep 25 '14 at 15:54