I am using Notepad++ and need to find and remove an immediately repeated (duplicate) HTML tag, as shown below.

Actual

<a href="www.google.com"><a href="www.google.com">www.google.com</a></a>

Required

<a href="www.google.com">www.google.com</a>

I have a regex that finds duplicates that appear on separate lines, but in my case the duplicate occurs within a single line.

Please help.

Casimir et Hippolyte
Venkat

4 Answers

Find:

(<(\w+)(\s[^>]*)?>)\1(.*)(<\/\2>)\5

Replace:

\1\4\5

Tested in Sublime Text.
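
If you want to sanity-check this pattern outside an editor, here is a minimal Python sketch using the standard `re` module. The function name `collapse_duplicate_tag` is made up for illustration, and I am only assuming that Python's regex flavor behaves like the editor's for this simple case.

```python
import re

# Same find pattern as above: group 1 is the opening tag, group 2 the tag name,
# group 4 the content, group 5 the closing tag.
DUPLICATE_TAG = re.compile(r'(<(\w+)(\s[^>]*)?>)\1(.*)(<\/\2>)\5')


def collapse_duplicate_tag(line: str) -> str:
    """Keep one opening tag, the content, and one closing tag."""
    return DUPLICATE_TAG.sub(r'\1\4\5', line)


if __name__ == "__main__":
    sample = '<a href="www.google.com"><a href="www.google.com">www.google.com</a></a>'
    print(collapse_duplicate_tag(sample))
```

Lines without an immediately repeated tag are left unchanged, since the `\1` backreference has nothing to match.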

Albert Xing

For this kind of "double links" you can use this:

find: <(a [^>]+)>(<\1>.*?</a>)</a>
replace: \2

For all tags use:

find: <((\w+)[^>]*)>(<\1>.*?</\2>)</\2>
replace: \3

(Both patterns require a recent version of Notepad++.)
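
To try both patterns outside Notepad++, here is a small Python sketch. The helper names `unwrap_double_link` and `unwrap_double_tag` are hypothetical, and Python's `re` module is assumed to be close enough to Notepad++'s regex engine for these two patterns.

```python
import re

# Links only: group 1 is the tag body, group 2 is the inner (kept) element.
DOUBLE_LINK = re.compile(r'<(a [^>]+)>(<\1>.*?</a>)</a>')

# Any tag: group 1 is the tag body, group 2 the tag name, group 3 the inner element.
DOUBLE_TAG = re.compile(r'<((\w+)[^>]*)>(<\1>.*?</\2>)</\2>')


def unwrap_double_link(line: str) -> str:
    """Replace a doubled <a ...> element with its inner copy."""
    return DOUBLE_LINK.sub(r'\2', line)


def unwrap_double_tag(line: str) -> str:
    """Same idea for any tag name."""
    return DOUBLE_TAG.sub(r'\3', line)


if __name__ == "__main__":
    link = '<a href="www.google.com"><a href="www.google.com">www.google.com</a></a>'
    print(unwrap_double_link(link))
    print(unwrap_double_tag('<b><b>bold</b></b>'))
```

The lazy `.*?` keeps each match as short as possible, which matters when a line holds more than one doubled element.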

Casimir et Hippolyte

Search Pattern:

.*">(<.*>)<\/a>

Replace:

\1
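
One caveat worth checking before running this on real files: the pattern starts with `.*`, so any text on the line before the first tag becomes part of the match and is dropped by the replacement. It also only handles `</a>`. A minimal Python sketch of that behavior (the function name `strip_outer_anchor` is made up, and Python's `re` is assumed to match the editor here):

```python
import re

# Same search pattern as above; group 1 is the inner element that is kept.
OUTER_ANCHOR = re.compile(r'.*">(<.*>)<\/a>')


def strip_outer_anchor(line: str) -> str:
    """Replace the whole match with the inner element only."""
    return OUTER_ANCHOR.sub(r'\1', line)


if __name__ == "__main__":
    # Works when the doubled link is the only thing on the line.
    print(strip_outer_anchor('<a href="x"><a href="x">x</a></a>'))
    # The leading "See " is swallowed by the initial .* and lost.
    print(strip_outer_anchor('See <a href="x"><a href="x">x</a></a>'))
```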
Dick Faps

Try this pattern:

(<(\w+)(\s[^>]*)?>)(\s|\n|\t)*\1(.*)(<\/\2>)(\s|\n|\t)*\6

Demo: http://rubular.com/r/RT7ObfV0i8

replace \1 and \6

Civa
  • This doesn't keep the data between the tags in Notepad++. It does get rid of the duplicate tag though. Should say replace \1\5\6. – AbsoluteƵERØ May 06 '13 at 06:20
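
To illustrate the comment above, here is a short Python sketch comparing the two replacements for this pattern. In the pattern, group 5 is the text between the tags, so leaving it out of the replacement drops the link text. The function names are hypothetical, and Python's `re` is assumed to behave like Notepad++ for this case.

```python
import re

# Same pattern as in the answer: group 1 = opening tag, group 5 = content,
# group 6 = closing tag. Groups 4 and 7 allow whitespace between the duplicates.
SPACED_DUPLICATE = re.compile(
    r'(<(\w+)(\s[^>]*)?>)(\s|\n|\t)*\1(.*)(<\/\2>)(\s|\n|\t)*\6'
)


def replace_without_content(text: str) -> str:
    """Replacement \\1\\6: removes the duplicate tag but loses the content."""
    return SPACED_DUPLICATE.sub(r'\1\6', text)


def replace_with_content(text: str) -> str:
    """Replacement \\1\\5\\6: removes the duplicate tag and keeps the content."""
    return SPACED_DUPLICATE.sub(r'\1\5\6', text)


if __name__ == "__main__":
    sample = '<a href="www.google.com"> <a href="www.google.com">www.google.com</a> </a>'
    print(replace_without_content(sample))
    print(replace_with_content(sample))
```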