Below is my sample text.

<ul>
<li><a href="www.google.com" target="blank">Google</a></li>
<li><a href="www.yahoo.com" target="blank">Yahoo</a></li>
<li><a href="www.bing.com" target="blank">Bing</a></li>
</ul>

I would like to add an extra attribute to each anchor tag, with the hyperlink text as its value, like below.

<ul>
<li><a href="www.google.com" target="blank" aria-label="Google">Google</a></li>
<li><a href="www.yahoo.com" target="blank" aria-label="Yahoo">Yahoo</a></li>
<li><a href="www.bing.com" target="blank" aria-label="Bing">Bing</a></li>
</ul>

I want to do this using a Notepad++ regular expression. I appreciate your help!

Wiktor Stribiżew
Raj Kumar
  • Then there's the famous "don't parse html with regex" thread. http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Paulb May 09 '17 at 09:59
  • @Paulb, yep, but in notepad++ there is not much else available. – trincot May 09 '17 at 10:01
  • @trincot, you're right. I just cringe when I see people use the wrong tool. I suspect that OP wants a notepad++ regex because he is working on very large html file(s), and a large html file is a perfect breeding ground for regex not to work in. My life became much easier when I learned xslt for these situations. – Paulb May 09 '17 at 10:10
  • @Paulb: *"a large html file is a perfect breeding ground for regex not to work in"*: That's a wrong *(and common)* idea. When this happens, it means the pattern is badly designed. – Casimir et Hippolyte May 09 '17 at 11:03

3 Answers

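You can use this regular expression find/replace (with Search Mode set to "Regular expression"):

Find: >([^<>]+)</a>

Replace:  aria-label="$1"$0

Note that the replacement starts with a space, which separates the new attribute from the preceding one. $1 is the captured link text and $0 is the whole match, so the original >text</a> is put back right after the new attribute.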
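To check the pattern outside Notepad++, the same find/replace can be sketched in Python. This is only an illustration under assumptions: the function name `add_aria_label` is hypothetical, and Python's `re` module writes the whole match as `\g<0>` and the first group as `\1`, where Notepad++ uses `$0` and `$1`.

```python
import re

def add_aria_label(html: str) -> str:
    """Insert aria-label="<link text>" before the '>' that precedes the link text.

    Hypothetical helper mirroring the Notepad++ find/replace:
      Find:    >([^<>]+)</a>
      Replace: ' aria-label="$1"$0'  (note the leading space)
    """
    # \1 is the captured link text, \g<0> is the entire match (">text</a>").
    return re.sub(r'>([^<>]+)</a>', r' aria-label="\1"\g<0>', html)


sample = '<li><a href="www.google.com" target="blank">Google</a></li>'
print(add_aria_label(sample))
```

Like the original pattern, this only handles links whose content is plain text; a link containing nested tags (for example an image or a span) would not match `[^<>]+` and is left unchanged.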

Transforming Quotes

In the comments you asked to also replace each single quote with two single quotes, in both the link text and the new attribute. This cannot be done in the same replace operation, but you can run a separate one, which should be executed before the one above:

Find: '(?=[^<>]*</a>)

Replace: ''

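Once that is done, apply the first replace operation. Because the quotes in the link text are already doubled at that point, the aria-label value copied from it will contain the doubled quotes as well.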
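The order of the two operations can be illustrated with a small Python sketch. The function names here are hypothetical, and Python's `re` module stands in for the Notepad++ regex engine (with `\g<0>` and `\1` in place of `$0` and `$1`); it is meant as a way to verify the idea, not as the only way to do it.

```python
import re

def double_single_quotes(html: str) -> str:
    # Step 1: double every single quote that is followed, with no
    # '<' or '>' in between, by a closing </a>, i.e. quotes in link text.
    return re.sub(r"'(?=[^<>]*</a>)", "''", html)

def add_aria_label(html: str) -> str:
    # Step 2: copy the (already modified) link text into aria-label.
    return re.sub(r'>([^<>]+)</a>', r' aria-label="\1"\g<0>', html)

def transform(html: str) -> str:
    # Quotes first, then the attribute, so both places get the doubled quotes.
    return add_aria_label(double_single_quotes(html))


print(transform('<a href="www.google.com" target="blank">Google\'s</a>'))
```

Single quotes outside link text, such as in a surrounding paragraph, are not touched, because the lookahead requires a `</a>` before any other tag boundary.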

trincot
  • Add a leading space to the replace template – Alexander May 09 '17 at 10:20
  • I typed the space, but apparently the code formatting does not render it. Ah, I managed to add a non-breaking space instead. Anyway, you got it ;-) – trincot May 09 '17 at 11:02
  • Hi trincot, I need to replace single quotes ' with double quotes '' only in the hyperlink text. How can I update this one? – Raj Kumar May 10 '17 at 08:04
  • You'd need a second regex for that -- it is not possible to do both operations in one. So you want those quotes replaced only in the hyperlink text content, or also in the newly created `aria-label` attribute? – trincot May 10 '17 at 08:28
  • Yes, I want to replace in both places. `Google's` should be displayed as `Google''s`. – Raj Kumar May 10 '17 at 09:14
  • Well, that would be invalid HTML. I suppose you want the attribute value to be wrapped in single quotes then? Or else the attribute value should be HTML encoded: `Google&quot;s`. – trincot May 10 '17 at 09:25
  • It is just a sample. I need to replace a single quote with a double quote in the text, not HTML encoded. – Raj Kumar May 10 '17 at 09:31
  • Wait, now I see that you actually want two single quotes next to each other, not a double quote. Did I get that right? – trincot May 10 '17 at 09:50
  • Yeah, two single quotes, not a double quote. – Raj Kumar May 10 '17 at 11:37