I am trying to find all occurrences (zero or more) of anchor (<a>) HTML tags that carry specific attributes, capturing some attribute values and the link text as groups. The attributes can appear in any order.

This regex works fine when the attributes appear in a fixed order:

<a\s+.*attr1="myattr".*attr2="(.+)".*attr3="(.+)".*>(.+)</a>

I tried the following regex to handle any order, without success:

<a\s+.*?((attr1="myattr".*?attr2="(.+?)".*?attr3="(.+?)")|(attr1="myattr".*?attr3="(.+?)".*?attr2="(.+?)")|(attr2="(.+?)".*?attr3="(.+?)".*?attr1="myattr")|(attr2="(.+?)".*?attr1="myattr".*?attr3="(.+?)")|(attr3="(.+?)".*?attr2="(.+?)".*?attr1="myattr")|(attr3="(.+?)".*?attr1="myattr".*?attr2="(.+?)")).*?>(.+?)</a>

Input string with the attributes in different orders:

First <a attr1="myattr" attr2="value12" attr3="value13">text1</a>Second <a attr1="myattr" attr3="value13" attr2="value12">text2</a> Third <a attr2="value12" attr1="myattr" attr3="value13">text3</a>
Ambrish
  • What **language**? Don't use regex, use a parser for this. – hwnd Mar 29 '15 at 03:04
  • possible duplicate of [RegEx match open tags except XHTML self-contained tags](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – Nir Alfasi Mar 29 '15 at 03:13

1 Answer

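Use lookaheads instead of consuming the attributes in sequence (though in general you shouldn't use regex to parse HTML). Each lookahead is tested from the same position, just after <a, and [^>]* keeps it inside the opening tag, so the attribute order does not matter: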

/<a\s+(?=[^>]*\battr1\s*=\s*"myattr")(?=[^>]*\battr2\s*=\s*"([^"]+?)")(?=[^>]*\battr3\s*=\s*"([^"]+?)")[^>]*>(.+?)<\/a>/
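To illustrate the difference, here is a minimal sketch that runs the alternation regex from the question and the lookahead regex above on the question's input. The names `input`, `alternation`, `lookahead` and `texts` exist only for this illustration. In the alternation version nothing stops `.*?` from running past `>` into the next tag, so the first alternative can pick up attr3 from a later anchor. The lookahead version is confined to each opening tag by `[^>]*`.

```javascript
// Input string from the question: three anchors, attributes in different orders.
const input =
  'First <a attr1="myattr" attr2="value12" attr3="value13">text1</a>' +
  'Second <a attr1="myattr" attr3="value13" attr2="value12">text2</a>' +
  ' Third <a attr2="value12" attr1="myattr" attr3="value13">text3</a>';

// Alternation attempt from the question, copied as-is (only "/" is escaped).
const alternation = /<a\s+.*?((attr1="myattr".*?attr2="(.+?)".*?attr3="(.+?)")|(attr1="myattr".*?attr3="(.+?)".*?attr2="(.+?)")|(attr2="(.+?)".*?attr3="(.+?)".*?attr1="myattr")|(attr2="(.+?)".*?attr1="myattr".*?attr3="(.+?)")|(attr3="(.+?)".*?attr2="(.+?)".*?attr1="myattr")|(attr3="(.+?)".*?attr1="myattr".*?attr2="(.+?)")).*?>(.+?)<\/a>/g;

// Lookahead regex from this answer.
const lookahead = /<a\s+(?=[^>]*\battr1\s*=\s*"myattr")(?=[^>]*\battr2\s*=\s*"([^"]+?)")(?=[^>]*\battr3\s*=\s*"([^"]+?)")[^>]*>(.+?)<\/a>/g;

// In both patterns the link text is the last capture group.
const texts = (re) => [...input.matchAll(re)].map((m) => m[m.length - 1]);

// Alternation: the second match starts in anchor 2 and ends in anchor 3,
// so "text2" is never reported.
console.log(texts(alternation));

// Lookahead: one match per anchor.
console.log(texts(lookahead));
```

The alternation version has a second practical drawback: the captured values land in different group numbers depending on which alternative matched, whereas the lookahead version always reports attr2, attr3 and the text in groups 1, 2 and 3.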

const html = `
  <a attr1="myattr" attr2="smth" attr3="3">123</a>
  <a attr1="myattr" attr3="3" attr2="smth">132</a>
  <a attr2="smth" attr1="myattr" attr3="3">213</a>
  <a attr2="smth" attr3="3" attr1="myattr">231</a>
  <a attr3="3" attr1="myattr" attr2="smth">312</a>
  <a attr3="3" attr2="smth" attr1="myattr">321</a>
`;

// Same pattern as above, with the g flag so that every anchor is found.
const re = /<a\s+(?=[^>]*\battr1\s*=\s*"myattr")(?=[^>]*\battr2\s*=\s*"([^"]+?)")(?=[^>]*\battr3\s*=\s*"([^"]+?)")[^>]*>(.+?)<\/a>/g;

// matchAll yields one match per anchor; groups 1 to 3 are attr2, attr3 and the link text.
for (const [match, attr2, attr3, text] of html.matchAll(re)) {
  console.log(text, attr2, attr3, match);
}
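Because each attribute gets its own independent lookahead, the pattern grows by one lookahead per attribute rather than one alternative per ordering. A rough sketch of building it from a list of attributes follows. `buildAnchorRegex`, `escapeRegExp` and the `spec` object shape are hypothetical names made up for this example, and it assumes attribute names are valid capture-group names and that values are double-quoted.

```javascript
// Hypothetical helper: escape regex metacharacters in a literal attribute value.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical helper. `spec` maps each attribute name either to a required
// literal value, or to null when the value should be captured in a named group.
function buildAnchorRegex(spec) {
  const lookaheads = Object.entries(spec).map(([name, value]) => {
    const valuePattern =
      value === null ? `(?<${name}>[^"]+)` : escapeRegExp(value);
    // One lookahead per attribute, each confined to the opening tag by [^>]*.
    return `(?=[^>]*\\b${name}\\s*=\\s*"${valuePattern}")`;
  });
  return new RegExp(`<a\\s+${lookaheads.join('')}[^>]*>(?<text>.+?)</a>`, 'g');
}

// attr1 must equal "myattr"; attr2 and attr3 are captured by name.
const anchorRe = buildAnchorRegex({ attr1: 'myattr', attr2: null, attr3: null });

const sample =
  '<a attr3="3" attr2="smth" attr1="myattr">321</a>' +
  '<a attr1="other" attr2="x" attr3="y">skip</a>';

// Copy the named groups of every match into plain objects.
const results = [...sample.matchAll(anchorRe)].map((m) => ({ ...m.groups }));
console.log(results);
```

Named groups also remove the need to remember which group number belongs to which attribute. The usual caveat still applies: this only handles the simple markup shown here, and a real HTML parser is the safer choice for arbitrary input.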
Qwertiy