Regex to get "Configuración de clientes" from this HTML

Question

I have this HTML, but with my current regex I can't capture only the last occurrence, the one containing the words "Configuración de clientes". How can I do that?

Regex ">(.*?)<\/a>

<p><em class="featured-box-primary fa fa-check"></em><a href="/Administrativo/Generales/Clientes/General">Configuración de clientes</a></p>

Possible duplicate of [RegEx match open tags except XHTML self-contained tags](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) — Corion, Nov 20 '18 at 14:30
I don't understand what you want to do. Do you want to catch all occurrences of links? — charles Lgn, Nov 20 '18 at 14:31
I want to catch only the part with "Configuración de clientes", but my regex captures more than that. — David López, Nov 20 '18 at 14:34

score 2 · Accepted Answer · answered Nov 20 '18 at 14:33

You should not parse complete HTML pages with regex: HTML is more than regex can handle. Use an XML/XHTML parser instead.
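
As a rough illustration of the parser route, here is a minimal sketch using Python's standard-library html.parser. The question does not say which language is in use, so Python is an assumption, and the LinkTextParser class name is made up for this example. It collects the text inside every <a> element of the snippet from the question.

```python
from html.parser import HTMLParser


class LinkTextParser(HTMLParser):
    """Hypothetical helper: gathers the text content of each <a> element."""

    def __init__(self):
        super().__init__()
        self.in_link = False   # True while we are between <a ...> and </a>
        self.link_texts = []   # one entry per <a> element seen

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_link = True
            self.link_texts.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False

    def handle_data(self, data):
        # Only keep text that appears inside a link.
        if self.in_link:
            self.link_texts[-1] += data


snippet = (
    '<p><em class="featured-box-primary fa fa-check"></em>'
    '<a href="/Administrativo/Generales/Clientes/General">'
    'Configuración de clientes</a></p>'
)

parser = LinkTextParser()
parser.feed(snippet)
parser.close()
print(parser.link_texts)  # ['Configuración de clientes']
```

Because the parser tracks tags itself, the empty <em> element before the link does not interfere with the result.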

For small snippets regex can work:

The . matches > as well, which is why your match is so large. Instead, you can use

">([^>]*?)<\/a>

which boils down to

"         literal
>         literal
(         start of grouping
 [^>]*?   as few characters as possible that are not a literal >
)         end of grouping
<\/a>     literal </a> (the / is escaped)
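
To see the difference on the snippet from the question, here is a small sketch in Python (the language is an assumption; the question does not name one). It runs the original pattern and the corrected one against the same string and prints what each captures.

```python
import re

snippet = (
    '<p><em class="featured-box-primary fa fa-check"></em>'
    '<a href="/Administrativo/Generales/Clientes/General">'
    'Configuración de clientes</a></p>'
)

# Original pattern: "." also matches ">", so the match starts at the first
# '">' (the end of the <em ...> tag) and runs all the way to </a>.
original = re.compile(r'">(.*?)<\/a>')

# Corrected pattern: "[^>]" refuses to cross a ">", so only the '">' that
# directly precedes the link text can start a successful match.
corrected = re.compile(r'">([^>]*?)<\/a>')

too_much = original.search(snippet).group(1)
just_text = corrected.search(snippet).group(1)

print(too_much)
# </em><a href="/Administrativo/Generales/Clientes/General">Configuración de clientes
print(just_text)
# Configuración de clientes
```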
