I want to convert SGML to XML with a regex. For example, convert:

<a><ab><abc>111<abc2>222</ab></a>

to:

<a><ab><abc>111</abc><abc2>222</abc2></ab></a>

I wrote the following code to do the conversion:

String a = "<a><ab><abc>111<abc2>222</ab></a>";
a = a.replaceAll("<([^<>]+?)>([^<>]+?)<(?!/\\$1>)", "<$1>$2</$1><");
System.out.println(a);

However, the result is not what I expected:

<a><ab><abc>111</abc><abc2>222</ab></a>

My question: is it possible to do this conversion with a regex? If so, what is the issue in my code?

Sid Zhang

1 Answer

Use the following regex:

<(([^<>]+?)>)([^<>]+?)(?=<(?!\1))

Then replace each match with:

<$1$3</$2>

https://regex101.com/r/cD1nC8/1
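As for what goes wrong in the question's attempt, two things stand out. First, the pattern ends with a literal `<`, so that character is consumed by the match and the next tag (`<abc2>`) no longer has its opening bracket available when the search resumes. A lookahead checks for the `<` without consuming it. Second, inside a Java pattern `\\$1` is an escaped dollar sign followed by `1`, not a backreference; backreferences in the pattern are written `\\1`. The sketch below isolates the first point; the class and method names are made up for illustration.

```java
public class ConsumedBracketDemo {
    // Same shape as the question's pattern: the trailing "<" is part of the match,
    // so the following tag has already lost its "<" when matching resumes.
    static String withConsumedBracket(String s) {
        return s.replaceAll("<([^<>]+?)>([^<>]+?)<", "<$1>$2</$1><");
    }

    // The trailing "<" is only asserted by a lookahead and stays available
    // for the next match.
    static String withLookahead(String s) {
        return s.replaceAll("<([^<>]+?)>([^<>]+?)(?=<)", "<$1>$2</$1>");
    }

    public static void main(String[] args) {
        String in = "<a><ab><abc>111<abc2>222</ab></a>";
        System.out.println(withConsumedBracket(in)); // <abc2> stays unclosed
        System.out.println(withLookahead(in));       // both leaf tags get closed
    }
}
```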

String s = "<a><ab><abc>111<abc2>222</ab></a>";
System.out.println(s.replaceAll("<(([^<>]+?)>)([^<>]+?)(?=<(?!\\1))", "<$1$3</$2>"));

Output:

<a><ab><abc>111</abc><abc2>222</abc2></ab></a>
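
One caveat, as far as I can tell from tracing the pattern: the lookahead `(?!\1)` only rejects a following opening tag with the same name, so an element that is already closed, such as `<b>x</b>`, would receive a second `</b>`. If the input can mix closed and unclosed elements, a variant that checks for the matching closing tag may be safer. The following is a minimal sketch, not a general SGML parser: `LeafTagCloser` and `closeLeafTags` are hypothetical names, and it assumes tags without attributes whose text content contains no `<` or `>`.

```java
public class LeafTagCloser {
    // Group 1: tag name (must not start with "/", so closing tags are skipped).
    // Group 2: text content.
    // Lookahead: the next "<" must not begin the matching closing tag.
    private static final String UNCLOSED_LEAF = "<([^<>/][^<>]*)>([^<>]+)(?=<(?!/\\1>))";

    static String closeLeafTags(String sgml) {
        return sgml.replaceAll(UNCLOSED_LEAF, "<$1>$2</$1>");
    }

    public static void main(String[] args) {
        System.out.println(closeLeafTags("<a><ab><abc>111<abc2>222</ab></a>"));
        System.out.println(closeLeafTags("<a><b>x</b><c>y</a>"));
    }
}
```

Anything beyond this simple shape (attributes, nested unclosed elements, empty elements) is probably better handled by a real SGML parser than by a regex.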
Avinash Raj