I'm having trouble with a regex I'm using to scrape YouTube playlist pages. It mostly works, but it's picking up a couple of strange results.
Expression:
(?<=v=)[a-zA-Z0-9-_]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+
Examples of the video IDs it should pick out:
yXBckFyiMyU,
opWYnUpNtG8,
YFbLRZCExBk,
I_GZahAl-PQ,
G6F_iP-F7Fw
from links like this:
https://www.youtube.com/watch?v=_ClmClS_Mqs&list=PL6422619E56951B73&index=5&feature=plpp_video
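To make the intended behaviour concrete, here is a minimal sketch of applying the expression to the sample link. It assumes Python's `re` module purely for illustration (the tool actually used for scraping isn't stated here), and `PATTERN`, `extract_ids` and `sample_link` are made-up names.

```python
import re

# The expression from above, unchanged. It has no capture groups,
# so findall() returns the full text of every match.
PATTERN = re.compile(
    r"(?<=v=)[a-zA-Z0-9-_]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+"
)


def extract_ids(text):
    """Return every substring of `text` that the expression matches."""
    return PATTERN.findall(text)


sample_link = (
    "https://www.youtube.com/watch?v=_ClmClS_Mqs"
    "&list=PL6422619E56951B73&index=5&feature=plpp_video"
)

print(extract_ids(sample_link))  # ['_ClmClS_Mqs']
```

On this link only the first alternative fires: the ID follows `v=` and is followed by `&`.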
For the most part this appears to work. However, it is also matching text in lines like these:
data-thumb="//i1.ytimg.com/vi/84GVRtJ1CvY/<FROM RIGHT ONWARDS IS WHAT IT MATCHES>default.jpg" ><span class="vertical-align"></span></span></span></span>
data-thumb="//i4.ytimg.com/vi/WNIPqafd4As/<FROM RIGHT ONWARDS IS WHAT IT MATCHES>default.jpg" alt="" class="thumb"></span></span></span><span class="clip"><span class="centering-offset"><span class="centering"><span class="ie7-vertical-align-hack">
Regex is rather daunting. Does anyone know what's wrong with the expression?