I need to write a regular expression that scans the HTML source (as a string) of a Wikipedia article for links to other Wikipedia articles.
The links usually look like this, for example:
<a href="/wiki/English Language" title="English Language">English</a>
<a href="/wiki/Spanish Language" title="Spanish Language">Spanish</a>
I tried the regular expression: "<a.*href=(\"|')(.+?)(\"|')*wiki.*>"
It works, but it also matches links to images, not just links to articles.
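
To make the goal concrete, here is a minimal sketch in Python of the kind of filtering I am after. It rests on an assumption I have not verified for every page: that article links have an href starting with /wiki/ and containing no colon, while image links point at namespaced pages such as /wiki/File:... The names ARTICLE_LINK, find_article_links and the sample string are made up for illustration.

```python
import re

# Assumption: article hrefs start with /wiki/ and contain no colon, whereas
# image (and other namespaced) links look like /wiki/File:Something.png.
# [^>]*? keeps the match inside a single <a ...> tag.
ARTICLE_LINK = re.compile(r'<a\s[^>]*?href=["\']/wiki/([^"\':]+)["\']', re.IGNORECASE)


def find_article_links(html):
    """Return the page names of /wiki/ links in html whose href has no colon."""
    return ARTICLE_LINK.findall(html)


# Hypothetical sample: two article links and one image link.
sample = (
    '<a href="/wiki/English Language" title="English Language">English</a>'
    '<a href="/wiki/File:Flag.png" class="image">flag</a>'
    '<a href="/wiki/Spanish Language" title="Spanish Language">Spanish</a>'
)

print(find_article_links(sample))  # ['English Language', 'Spanish Language']
```

Excluding the colon would also drop other namespaced pages (for example Category: or Help: links), which may or may not be wanted. A regular expression is also fragile against unusual markup, so an HTML parser might be the safer route if the pages are not uniform.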