Since the OP specified the following in the comments below the question, regex may be used. Be careful, though: regex can easily break when parsing HTML.
I used BS4, but my boss asked me to use regex because BS4 is an overkill to extract a simple link
See regex in use here
<a\b(?=[^>]* class="[^"]*(?<=[" ])active[" ])(?=[^>]* href="([^"]*))
<a
Match this literally
\b
Assert position as a word boundary
(?=[^>]* class="[^"]*(?<=[" ])active[" ])
Positive lookahead ensuring the following is matched.
[^>]*
Match any character except > any number of times
class="
Match this literally
[^"]*
Match any character except " any number of times
(?<=[" ])
Positive lookbehind ensuring what precedes is a character in the set
active
Match this literally
[" ]
Match either character in the set
(?=[^>]* href="([^"]*))
Positive lookahead ensuring what follows matches
[^>]*
Match any character except > any number of times
href="
Match this literally
([^"]*)
Capture any character except " any number of times into capture group 1
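As a rough sketch of how the pattern might be applied, here it is with Python's re module (the language is an assumption based on the OP's mention of BS4). The helper name active_hrefs and the sample markup are made up for illustration. Because both lookaheads are zero-width, the overall match is only <a, and the href value has to be read from capture group 1.

```python
import re

# Pattern copied from the answer; the href value lands in capture group 1.
PATTERN = re.compile(
    r'<a\b(?=[^>]* class="[^"]*(?<=[" ])active[" ])(?=[^>]* href="([^"]*))'
)


def active_hrefs(html):
    """Return the href of every <a> tag whose class list contains 'active'.

    Hypothetical helper name, not part of the original answer.
    """
    return [m.group(1) for m in PATTERN.finditer(html)]


# Hypothetical markup, made up for this example.
html = (
    '<ul>'
    '<li><a href="/home" class="nav">Home</a></li>'
    '<li><a class="nav active" href="/docs">Docs</a></li>'
    '</ul>'
)

match = PATTERN.search(html)
print(match.group(0))      # the lookaheads consume nothing, so this is just "<a"
print(match.group(1))      # the captured href value
print(active_hrefs(html))  # hrefs of all active links
```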
Given the following samples, only the first 3 are matched:
<a class="active" href="something">
<a href="something" class="active">
<a href="something" class="another-class active some-other-class">
<a class="inactive" href="something">
<a not-class="active" href="something">
<a class="active" not-href="something">
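The sample list above can be checked with a short Python script (again assuming Python, since the OP mentioned BS4). The last case is an extra one that is not in the samples: single-quoted attributes are valid HTML, but the pattern only looks for double quotes, which illustrates the earlier warning about regex being fragile on HTML.

```python
import re

# Pattern copied from the answer.
PATTERN = re.compile(
    r'<a\b(?=[^>]* class="[^"]*(?<=[" ])active[" ])(?=[^>]* href="([^"]*))'
)

# The six samples from the answer, in the same order.
samples = [
    '<a class="active" href="something">',
    '<a href="something" class="active">',
    '<a href="something" class="another-class active some-other-class">',
    '<a class="inactive" href="something">',
    '<a not-class="active" href="something">',
    '<a class="active" not-href="something">',
]

results = []
for tag in samples:
    m = PATTERN.search(tag)
    # Store the captured href, or None when the tag is not matched.
    results.append(m.group(1) if m else None)
    print(tag, '->', results[-1])

# Extra case, not from the answer: single quotes are valid HTML,
# but the pattern requires double quotes, so this tag is missed.
single_quoted = "<a class='active' href='something'>"
single_quoted_match = PATTERN.search(single_quoted)
print(single_quoted, '->', single_quoted_match)
```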