I'm trying to use preg_match on a link that is half in English, half in Arabic.
As an example, the link looks like:
"/<arabic>/123/<arabic>-<english>.html"
The basic preg_match('@<a href="/(.*?).html" >@', $page, $link);
returns everything, but the Arabic part of the URL comes back garbled, so it no longer identifies a page. For example, it returns "دانلود-رایÚ".
I've tried a few things I've seen, such as \p{Arabic},
but this returns nothing. Is there a way to capture these links?
I'm pretty stumped and can't figure out a way around this issue.
Edit to add the preg_match_all call and what I'm attempting to match:
preg_match_all('@<a href="/\p{Arabic}/(.*?)/\p{Arabic}-(.*?)" >@iu',$page,$link);
example text -
"a href="/دانلود-رایگان-کتاب/کتاب-های-خارجی/مطلب/2120-the-essential-financial.html"
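For reference, here is a minimal sketch of the kind of match I'm after. It rests on assumptions I haven't confirmed: that the page and the script file are both UTF-8, that every link ends in `" >` as in my pattern, and that the Arabic path segments contain only Arabic letters and hyphens. The $page string is a hypothetical stand-in for the fetched HTML, and $m and $pattern are names made up for this example.

```php
<?php
// Hypothetical sample input, copied from the example link above.
// Assumes this file and the fetched page are both UTF-8 encoded.
$page = '<a href="/دانلود-رایگان-کتاب/کتاب-های-خارجی/مطلب/2120-the-essential-financial.html" >';

// Option 1: take everything up to ".html" without caring about the script.
// [^"]+? cannot run past the closing quote of the href attribute.
preg_match_all('@<a href="/([^"]+?)\.html" >@u', $page, $link);

// Option 2: spell out the segments. \p{Arabic} matches one character only,
// so it is put in a character class with "-" and repeated with "+".
$pattern = '@<a href="/((?:[\p{Arabic}-]+/)+)(\d+)-([a-z-]+)\.html" >@iu';
preg_match($pattern, $page, $m);

echo $link[1][0], PHP_EOL; // full path without the ".html" suffix
echo $m[2], PHP_EOL;       // numeric id
echo $m[3], PHP_EOL;       // English slug
```

If the captured text still prints as garbage, the match itself may be fine and the output encoding of the page displaying it may be the thing to check, though that is a guess on my part.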