Not sure if I'm reading this right, but something like the below is comparable to what I think you're trying to do. The caveat is that regex and HTML generally don't mix, but for chunks of HTML text it should be fine.
When looking for a specific attribute/value pair within a tag, I tend to favor a lookahead that locates it anywhere within the tag and is safe enough that it won't overrun the tag's boundaries.
I used preg_match_all() as an example. Test case here: http://ideone.com/oerbc
(Fixed the relative backreference; it should be -2.)
<?php
$html = '
<a href="http://www.exemple.net/index.php?p[some stuff to find]/1/">
<a href=\'http://www.exemple.net/index.php?p[more stuff to find]/1/ \'>
';
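// Literal text that must appear at the start and end of the href value
$ref_txtstart = 'http://www.exemple.net/index.php?p';
$ref_txtend = '/1/';
// Extended (x) mode pattern; the lookahead finds href anywhere inside the tag
$regex =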
'~
<a
(?=\s)
(?= (?:[^>"\']|"[^"]*"|\'[^\']*\')*? (?<=\s) href \s*=
(?>
\s* ([\'"]) \s*
' . preg_quote($ref_txtstart) . '
(?<core>(?:(?!\g{-2}).)*)
' . preg_quote($ref_txtend) . '
\s* \g{-2}
)
)
\s+ (?:".*?"|\'.*?\'|[^>]*?)+
>~xs
';
// Print the assembled pattern, then each matching tag and its captured core
echo ("$regex\n");
preg_match_all( $regex, $html, $matches, PREG_SET_ORDER );
foreach ($matches as $val) {
    echo( "matched = $val[0]\ncore = $val[core]\n\n" );
}
?>
Output
~
<a
(?=\s)
(?= (?:[^>"']|"[^"]*"|'[^']*')*? (?<=\s) href \s*=
(?>
\s* (['"]) \s*
http\://www\.exemple\.net/index\.php\?p
(?<core>(?:(?!\g{-2}).)*)
/1/
\s* \g{-2}
)
)
\s+ (?:".*?"|'.*?'|[^>]*?)+
>~xs
matched = <a href="http://www.exemple.net/index.php?p[some stuff to find]/1/">
core = [some stuff to find]
matched = <a href='http://www.exemple.net/index.php?p[more stuff to find]/1/ '>
core = [more stuff to find]
Also, this can be extended to include unquoted values by using a branch reset and changing the named capture buffer to the fixed index of the capture buffer in question. So $val[core] becomes $val[2]. Example is here: http://ideone.com/IHHLg
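In case branch reset is unfamiliar, here is a minimal sketch of what it does to group numbering. The tiny patterns and sample strings are made up for illustration and are not part of the regex in this answer: with a plain alternation each alternative gets its own group number, while inside (?| ... ) every alternative restarts at the same number.

```php
<?php
// Hypothetical mini-patterns, only to show how group numbering changes.

// Plain alternation: a quoted value would land in group 1, an unquoted one in group 2.
preg_match('~=(?:"([^"]*)"|(\w+))~', '=abc', $plain);

// Branch reset (?| ... ): numbering restarts in each alternative,
// so both the quoted and the unquoted value land in group 1.
preg_match('~=(?|"([^"]*)"|(\w+))~', '=abc', $reset_unquoted);
preg_match('~=(?|"([^"]*)"|(\w+))~', '="abc"', $reset_quoted);

echo "plain:          group 2 = {$plain[2]}\n";
echo "reset unquoted: group 1 = {$reset_unquoted[1]}\n";
echo "reset quoted:   group 1 = {$reset_quoted[1]}\n";
```

That is why a single fixed index can serve both the quoted and the unquoted branch.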
Extended regex
$regex =
'~
<a
(?=\s)
(?= (?:[^>"\']|"[^"]*"|\'[^\']*\')*? (?<=\s) href \s*=
(?|
(?>
\s* ([\'"]) \s*
' . preg_quote($ref_txtstart) . ' ((?:(?!\g{-2}).)*) ' . preg_quote($ref_txtend) . '
\s* \g{-2}
)
|
(?>
(?!\s*[\'"]) \s* ()
' . preg_quote($ref_txtstart) . ' ([^\s>]*) ' . preg_quote($ref_txtend) . '
(?=\s|>)
)
)
)
\s+ (?:".*?"|\'.*?\'|[^>]*?)+
>~xs
';
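To try the extended pattern end to end, a self-contained sketch along these lines should work. The two sample tags and the $start/$end helper variables are my own assumptions for the demo, not the original test case, and passing '~' as the delimiter to preg_quote() is just an extra precaution that changes nothing for these particular strings.

```php
<?php
// Sample input (assumed for this demo): one quoted and one unquoted href.
$html = '
<a href="http://www.exemple.net/index.php?p[quoted]/1/">
<a href=http://www.exemple.net/index.php?p[unquoted]/1/>
';

$ref_txtstart = 'http://www.exemple.net/index.php?p';
$ref_txtend   = '/1/';

// Escape the literal parts; '~' is the delimiter used below.
$start = preg_quote($ref_txtstart, '~');
$end   = preg_quote($ref_txtend, '~');

$regex = '~
  <a
  (?=\s)
  (?= (?:[^>"\']|"[^"]*"|\'[^\']*\')*? (?<=\s) href \s*=
    (?|
      (?> \s* ([\'"]) \s* ' . $start . ' ((?:(?!\g{-2}).)*) ' . $end . ' \s* \g{-2} )
    |
      (?> (?!\s*[\'"]) \s* () ' . $start . ' ([^\s>]*) ' . $end . ' (?=\s|>) )
    )
  )
  \s+ (?:".*?"|\'.*?\'|[^>]*?)+
  >~xs';

preg_match_all($regex, $html, $matches, PREG_SET_ORDER);

foreach ($matches as $val) {
    // Group 1 is the quote character (empty for unquoted values),
    // group 2 is the text between the start and end markers.
    echo "matched = {$val[0]}\ncore = {$val[2]}\n\n";
}
```

Because of the branch reset, $val[2] holds the core text no matter which alternative matched.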