
OK, if I have this URL:

 <iframe width="510" height="400" src="http://xhamster.com/xembed.php?video=XXXXXX" frameborder="0" scrolling="no"></iframe>

I can get the video id with

preg_match('/video=([a-zA-Z0-9]+)/', $url, $url_data);

How do I do the same with this URL:

<iframe src="http://flashservice.xvideos.com/embedframe/XXXXX" frameborder=0 width=510 height=400 scrolling=no></iframe>

XXXXX is the id.

I'm really not sure what I'm doing with regular expressions.

user794846

2 Answers

preg_match('/src=".*\/([a-zA-Z0-9]+)"/', $url, $url_data);

Or, since src could be in caps, add the case-insensitive flag:

preg_match('/src=".*\/([a-zA-Z0-9]+)"/i', $url, $url_data);

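Another improvement is to avoid an overly greedy match: if another attribute later in the tag contains the "/" character, the greedy .* can run past the end of src. Making the quantifier lazy stops at the first place the rest of the pattern fits: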

preg_match('/src=".*?\/([a-zA-Z0-9]+)"/i', $url, $url_data);
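As a quick check, here is a minimal sketch that runs the lazy pattern against the iframe from the question. The second pattern, with `[^"]*`, is my own addition rather than something required above: it cannot cross the closing quote of src at all. The variable names `$id`, `$strict_data` and `$strict_id` are made up for the example.

```php
<?php
// The embed code from the question, stored in $url as in the snippets above.
$url = '<iframe src="http://flashservice.xvideos.com/embedframe/XXXXX" frameborder=0 width=510 height=400 scrolling=no></iframe>';

// Lazy version from above: captures the last path segment before the closing quote.
if (preg_match('/src=".*?\/([a-zA-Z0-9]+)"/i', $url, $url_data)) {
    $id = $url_data[1]; // index 0 is the whole match, index 1 the first capture group
}

// Stricter variant (an assumption, not required): [^"]* stays inside the src value.
if (preg_match('/src="[^"]*\/([a-zA-Z0-9]+)"/i', $url, $strict_data)) {
    $strict_id = $strict_data[1];
}
```

Both versions capture the same id for this input; they only differ when something after the src attribute also contains a slash followed by letters or digits and a quote.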
John McMahon

Take a look at the string you are trying to capture and notice the difference. The first has ?video=. The second one has a different structure. Try something like this:

preg_match('/embedframe\/([a-zA-Z0-9]+)/', $url, $url_data);
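To actually use the result, read the first capture group from `$url_data` and check the return value of `preg_match` first. A minimal sketch, assuming `$url` holds the iframe from the question (`$video_id` is just a name picked for the example):

```php
<?php
$url = '<iframe src="http://flashservice.xvideos.com/embedframe/XXXXX" frameborder=0 width=510 height=400 scrolling=no></iframe>';

// preg_match returns 1 on a match, 0 on no match, and false on error.
if (preg_match('/embedframe\/([a-zA-Z0-9]+)/', $url, $url_data) === 1) {
    $video_id = $url_data[1]; // first capture group: the part after "embedframe/"
} else {
    $video_id = null; // the input did not have the expected structure
}
```

Checking for `=== 1` means a string without "embedframe/" leaves you with `null` instead of an undefined index on `$url_data[1]`.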
nomistic
  • This worked. I knew it had a different structure; it's just understanding how to capture it that I was struggling with. I guess I'd better go study. – user794846 Apr 21 '15 at 00:14
  • One of the major challenges with using regex is that there are tons of ways it can break if the input string changes only slightly. If you know the url will always contain the string "embedframe/" before your id, this solution is great. Just be aware of the assumptions your regex is making about the input. – John McMahon Apr 21 '15 at 00:19
  • @JohnMcMahon This is absolutely the case. Since this code is looking for items specifically within an iframe, it might be good to start with that (as `src` can also reference other media types). My guess, looking at the purpose of what he is doing, is that he only needs the data from a few specific sites. – nomistic Apr 21 '15 at 00:56
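
If only a few specific sites are involved, the points from the comments could be combined along these lines. This is a rough sketch, not a definitive implementation: the function name `extract_video_id` and the array keys are hypothetical, the patterns are just the two from this thread anchored to an iframe's src, and the `?string` return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: tries one pattern per known embed format and
// returns the captured id, or null if no known format matches.
function extract_video_id(string $embed): ?string
{
    $patterns = [
        // ...xembed.php?video=ID (format from the question's first iframe)
        'query'      => '/<iframe[^>]*\ssrc="[^"]*[?&]video=([a-zA-Z0-9]+)/i',
        // .../embedframe/ID (format from the question's second iframe)
        'embedframe' => '/<iframe[^>]*\ssrc="[^"]*\/embedframe\/([a-zA-Z0-9]+)/i',
    ];

    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $embed, $m) === 1) {
            return $m[1];
        }
    }

    return null; // unknown format: the assumptions above did not hold
}
```

Each pattern makes its assumptions explicit (must be an iframe, must be inside a quoted src, must have the site's own URL structure), so anything unexpected yields `null` rather than a wrong id.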