I want a regex that extracts the link text from the anchor tags below, so that the result is:

Actions and Sci-Fi

<a href="/?genre=Action">Actions</a> <a href="/?genre=Sci-Fi">Sci-Fi</a>
Muhammad Muazzam

1 Answer

Don't parse HTML with regex. If you insist, you could use the regex below and read the text inside the anchor tags from capture group 1.

<a\s[^<>]*>([^<>]*)<\/a>
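As a minimal sketch of how the pattern could be applied, here it is in Python (an assumption, since the question does not name a language). The sample string is copied from the question; the names `html`, `ANCHOR_TEXT`, and `names` are only illustrative.

```python
import re

# Sample input copied from the question.
html = '<a href="/?genre=Action">Actions</a> <a href="/?genre=Sci-Fi">Sci-Fi</a>'

# Same pattern as above; group 1 captures the text between <a ...> and </a>.
ANCHOR_TEXT = re.compile(r'<a\s[^<>]*>([^<>]*)<\/a>')

# With exactly one capture group, findall returns a list of the group 1 strings.
names = ANCHOR_TEXT.findall(html)

print(names)                # ['Actions', 'Sci-Fi']
print(" and ".join(names))  # Actions and Sci-Fi
```

Joining the captured strings with " and " gives the result asked for in the question.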

Explanation:

<a                       '<a'
\s                       whitespace (\n, \r, \t, \f, and " ")
[^<>]*                   any character except: '<', '>' (0 or more times)
>                        '>'
(                        group and capture to \1:
  [^<>]*                   any character except: '<', '>' (0 or more times)
)                        end of \1
<                        '<'
\/                       '/'
a>                       'a>'
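For comparison, and in line with the advice not to parse HTML with regex, here is a rough sketch of the same extraction using an HTML parser. It assumes Python and its standard-library `html.parser` module; the class name `AnchorTextCollector` is hypothetical and not part of any library.

```python
from html.parser import HTMLParser


class AnchorTextCollector(HTMLParser):
    """Collects the text found inside each <a>...</a> element."""

    def __init__(self):
        super().__init__()
        self.texts = []       # one entry per completed anchor
        self._inside = False  # True while between <a ...> and </a>
        self._parts = []      # text fragments of the current anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._inside = True
            self._parts = []

    def handle_data(self, data):
        if self._inside:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._inside:
            self.texts.append("".join(self._parts))
            self._inside = False


# Sample input copied from the question.
html = '<a href="/?genre=Action">Actions</a> <a href="/?genre=Sci-Fi">Sci-Fi</a>'

collector = AnchorTextCollector()
collector.feed(html)
collector.close()

print(collector.texts)  # ['Actions', 'Sci-Fi']
```

Unlike the regex, whose capture group excludes '<' and '>', this version also picks up link text that contains nested tags such as `<b>`.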
Avinash Raj