I am working on an old website, and I need to fix my YouTube links. For example, I have a string variable with the following content:

<h1>title</h1>
<p>
some text here
.......
<iframe src="http://www.youtube.com/embed/suEGD8aaSzI?list&playauto=1" width="560" height="315" frameborder="0" scrolling="auto"></iframe>
.......
</p>
</p>

I am trying to capture the part "suEGD8aaSzI?list&playauto=1" so that I can do the following:

lblContent.Text = Regex.Replace(ArticleContent, @"myRegularExpressionHere", "https://www.youtube.com/embed/$1", RegexOptions.IgnoreCase);

So far, this is the best I could find:

https?:\/\/(?:[0-9A-Z-]+\.)?(?:youtu\.be\/|youtube\.com\/(?:embed\/|v\/|watch\?v\=))([\w-]{10,12})(?:[\&\?\#].*?)*?(?:[\&\?\#]t=([\dhm]+s))?(?=")

But it is not enough, because I only get "suEGD8aaSzI" as the captured group; the rest of the query string, "?list&playauto=1", is not included.

Any help would be much appreciated.

serg90
  • Try splitting the URL string on (/); the last element will give you the complete required string. But this will work only if you do not have any slashes in your query string. – Manish Nov 24 '16 at 11:21
  • Basically, the query starts at the **?** sign, so you can do **meUrl = meUrl.Substring(meUrl.LastIndexOf("?") + 1)** to get only the query string. Then you can parse the query string with **var query = meUrl.Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries)**. Now the variable **query** should contain something like **{ "q1=123", "q2=something" }**. – mrogal.ski Nov 24 '16 at 11:26
  • The problem is that I have the whole HTML content inside this string variable, so I still need to extract these YouTube links from it. – serg90 Nov 24 '16 at 11:46
  • @serg90 That is actually not a problem. You just have to use some regex like `(?'iframe' – mrogal.ski Nov 24 '16 at 12:04

3 Answers


I think this would work if you want a regex:

^.+/([^/]+)$

It basically says take everything after the last '/' character.
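
As a quick illustration of what this pattern captures, here is a minimal sketch. It assumes the input string is only the URL, not the surrounding HTML, because `^` and `$` anchor to the whole input. The class and method names are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

static class LastSegment
{
    // Returns everything after the last '/' in the input, or null if there is no match.
    public static string Extract(string url)
    {
        Match match = Regex.Match(url, @"^.+/([^/]+)$");
        return match.Success ? match.Groups[1].Value : null;
    }
}

class Program
{
    static void Main()
    {
        // Assumption: a bare URL taken from the question, without any HTML around it.
        string url = "http://www.youtube.com/embed/suEGD8aaSzI?list&playauto=1";
        Console.WriteLine(LastSegment.Extract(url)); // prints suEGD8aaSzI?list&playauto=1
    }
}
```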

MrApnea
  • Doesn't work for me. Also, it must stop at the " symbol so that the other HTML tags and attributes are not corrupted. – serg90 Nov 24 '16 at 11:44
  • Sorry. Somehow I missed that it was inside HTML. I think the answer you are looking for is here: http://stackoverflow.com/questions/3717115/regular-expression-for-youtube-links – MrApnea Nov 25 '16 at 06:07

You can use this method:

// In a verbatim string a double quote is written as "" (not \").
const string PATTERN = @"(?'iframe'<iframe .+?(?'link'youtube\.com/embed/.+?)"")";

Match match = new Regex(PATTERN, RegexOptions.Multiline).Match(meUrl);
if(match.Success){
    string link = match.Groups["link"].Value;
    // link is now youtube.com/embed/suEGD8aaSzI?list&playauto=1
    string query = link.Substring(link.LastIndexOf("?") + 1);
    // query is now list&playauto=1
    string[] splittedQuery = query.Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries);
    // splittedQuery is now { "list", "playauto=1" }
    Dictionary<string, string> fullQueryWithValues = new Dictionary<string,string>();
    foreach(string queryFromSplit in splittedQuery){
        // Split each entry into at most two parts: the key and the optional value.
        string[] parts = queryFromSplit.Split(new[] { '=' }, 2);
        fullQueryWithValues[parts[0]] = parts.Length > 1 ? parts[1] : string.Empty;
    }
}

Online regex check tool

This was written from memory, so it may have some issues. I will rewrite it when I get back home :)
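
If the goal is only to rewrite the links to the https form shown in the question, a single Regex.Replace call may be enough. The following is a minimal sketch rather than a tested drop-in: it assumes every link uses the /embed/ form and sits inside a double-quoted attribute, and the names YouTubeLinkFixer and ForceHttps are invented for the example. The capture group takes everything up to the closing quote, so the query string is kept.

```csharp
using System;
using System.Text.RegularExpressions;

static class YouTubeLinkFixer
{
    // Group 1 captures the video id plus any query string and stops at the closing quote.
    private static readonly Regex EmbedUrl = new Regex(
        @"https?://(?:www\.)?youtube\.com/embed/([^""]*)",
        RegexOptions.IgnoreCase);

    public static string ForceHttps(string html)
    {
        return EmbedUrl.Replace(html, "https://www.youtube.com/embed/$1");
    }
}

class Program
{
    static void Main()
    {
        // Shortened version of the iframe from the question.
        string html = "<iframe src=\"http://www.youtube.com/embed/suEGD8aaSzI?list&playauto=1\" width=\"560\"></iframe>";
        Console.WriteLine(YouTubeLinkFixer.ForceHttps(html));
    }
}
```

Because the character class `[^""]*` cannot cross a double quote, the remaining attributes of the tag are left untouched.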

mrogal.ski

May I suggest that it might help to break the problem down into smaller steps. For example, if you used an HTML parser you would be able to navigate the content without needing to worry about un-escaping values that are only escaped because they are in an XML-like format. Then you could pass the "src" attributes (and whatever else might have the links) into the constructor of System.Uri, and pick out whichever bits of that URI you need. And something like System.Web.HttpUtility.ParseQueryString would help you process the arguments.

All of that being said, if you just want something rough-and-ready, based on the example you've given, I'd suggest this, which is based around looking for a quoted string after "src=" (i.e. I'm assuming that the URI does not contain a double-quote, which I am fully aware is not a reasonable assumption).

Regex pattern = new Regex(@"\ssrc\s*=\s*""([^""]+)""", RegexOptions.IgnoreCase);
Match match = pattern.Match(example);
string value = match.Result("$1");

Then you can put the value in the Uri constructor, and parse as I described above.
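
Put together, the steps could look roughly like the sketch below. It assumes a reference to System.Web (or the System.Web.HttpUtility assembly on newer runtimes) and an absolute URL in the src attribute. The SrcParser class and its method are made up for illustration.

```csharp
using System;
using System.Collections.Specialized;
using System.Text.RegularExpressions;
using System.Web;

static class SrcParser
{
    // Returns the first double-quoted src value as a Uri, or null if none is found.
    public static Uri ExtractSrc(string html)
    {
        Match match = Regex.Match(html, @"\ssrc\s*=\s*""([^""]+)""", RegexOptions.IgnoreCase);
        return match.Success ? new Uri(match.Groups[1].Value) : null;
    }
}

class Program
{
    static void Main()
    {
        // Shortened version of the iframe from the question.
        string example = "<iframe src=\"http://www.youtube.com/embed/suEGD8aaSzI?list&playauto=1\" width=\"560\"></iframe>";

        Uri uri = SrcParser.ExtractSrc(example);
        if (uri == null) return;

        // The last path segment is the video id; Query holds the rest, including the '?'.
        string videoId = uri.Segments[uri.Segments.Length - 1];
        NameValueCollection args = HttpUtility.ParseQueryString(uri.Query);

        Console.WriteLine(videoId);          // suEGD8aaSzI
        Console.WriteLine(uri.Query);        // ?list&playauto=1
        Console.WriteLine(args["playauto"]); // 1
    }
}
```

How a value-less parameter such as `list` is reported by ParseQueryString is worth checking before relying on it.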

Richardissimo