I have a string in which I need to replace semicolons with \n. The requirement is to replace only those semicolons that are outside HTML tags (<...>).

I have come very close with the following regex, after several rounds of fixes:

/((?:^|>)[^<>]*);([^<>]*(?:<|$))/g, '$1\n$2'

The above regex works well for an input string like this one - Value1;<p style="color:red; font-weight:400;">Value2</p>;<p style="color:red; font-weight:400;">Value3</p>;Value4

The output it gives is this (which is expected and correct) -
Value1
<p style="color:red; font-weight:400;">Value2</p>
<p style="color:red; font-weight:400;">Value3</p>
Value4

But it fails for an input string like - M1;M2;M3

The output this gives is -
M1;M2
M3

(the semicolon between M1 and M2 is not replaced),

whereas the expected output is -

M1
M2
M3

The string can also combine both forms - M1;M2;M3;Value1;<p style="color:red; font-weight:400;">Value2</p>;<p style="color:red; font-weight:400;">Value3</p>;Value4

The main goal is to replace every semicolon outside HTML tags <> with '\n' (a newline).

1 Answer

You can use this regex with JavaScript's .replace() function:

/(<[^<>]*>)|;/g

For the replacement, you can use this function:

(_, tag) => tag || '\n'

If (<[^<>]*>) matches, which means an HTML tag was found, the match goes into the tag parameter; otherwise a ; outside any tag was matched and tag is undefined.

So you can check whether tag is defined. If it is, replace the match with itself; otherwise replace it with \n.
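To make the two branches of the alternation visible, here is a minimal sketch that lists what the pattern matches in a short sample string. The helper name classifyMatches and the sample input are made up for illustration; matchAll needs a reasonably recent JavaScript engine (ES2020).

```javascript
// Hypothetical helper (not part of the solution itself): label each match
// as either a whole tag (capture group 1 is set) or a semicolon outside tags.
function classifyMatches(input) {
  const regex = /(<[^<>]*>)|;/g;
  return Array.from(input.matchAll(regex), (m) =>
    m[1] !== undefined
      ? { kind: 'tag', text: m[1] }
      : { kind: 'semicolon', text: m[0] }
  );
}

// The semicolon inside the style attribute never shows up as its own match,
// because the whole opening tag is consumed by the first alternative.
console.log(classifyMatches('A;<p style="a:b;">B</p>;C'));
```

Since every tag is consumed as one match, the regex engine never gets a chance to look at the semicolons inside it.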

const text = `Value1;<p style="color:red; font-weight:400;">Value2</p>;<p style="color:red; font-weight:400;">Value3</p>;Value4

M1;M2;M3`;

const regex = /(<[^<>]*>)|;/g;

const result = text.replace(regex, (_, tag) => tag || '\n');

console.log(result);
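
As a quick check against the other inputs from the question, the same replacement can be wrapped in a small function. This is only a sketch; the name replaceOuterSemicolons is hypothetical, and it assumes that no attribute value contains a literal < or >.

```javascript
// Hypothetical wrapper around the same regex and callback as above.
function replaceOuterSemicolons(input) {
  // Tags are matched first and put back unchanged; any other ; becomes \n.
  return input.replace(/(<[^<>]*>)|;/g, (_, tag) => tag || '\n');
}

// The input that failed with the original regex.
console.log(replaceOuterSemicolons('M1;M2;M3') === 'M1\nM2\nM3'); // true

// The combined input from the question.
const combined = replaceOuterSemicolons(
  'M1;M2;M3;Value1;<p style="color:red; font-weight:400;">Value2</p>;' +
  '<p style="color:red; font-weight:400;">Value3</p>;Value4'
);
console.log(combined.split('\n').length); // 7
```

The original regex fails on M1;M2;M3 because a single match spans the whole run of text between tags, and the greedy [^<>]* in the first group leaves only the last semicolon of that run to be replaced. Matching one semicolon (or one tag) at a time avoids that.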
Hao Wu