
I need a JavaScript RegEx to split a string by semicolon outside single or double quotes.

Currently I'm using the following regex, /(?!\B['"][^'"]*);(?![^'"]*['"]\B)/gm, which sadly doesn't cover every case.

What I need:

const string = `Lorem ipsum; "dolor sit; amet"; consectetur 'adipiscing; elit.' Fusce; sit amet ligula.; Phasellus in laoreet quam.`;

const resultArr = string.split(/THEREGEX/gm);

console.log(resultArr);
// ["Lorem ipsum", "\"dolor sit; amet\"", " consectetur 'adipiscing; elit.' Fusce", "sit amet ligula.", " Phasellus in laoreet quam."]
adiga
  • I suggest not using a regex and just doing it with a loop with a stack to keep track of quotes. – SuperStormer Apr 29 '21 at 17:21
  • I have reopened because you mentioned single or double quotes. Similar questions: [Splitting on comma outside quotes](https://stackoverflow.com/q/18893390) and [Split a string by commas but ignore commas within double-quotes using Javascript](https://stackoverflow.com/q/11456850) and [Split string by comma, but ignore commas inside quotes](https://stackoverflow.com/q/23582276) – adiga Apr 29 '21 at 17:29
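
The loop-based approach suggested in the first comment could look roughly like the sketch below. It is only an illustration: the function name splitOutsideQuotes and its separator parameter are hypothetical, and it assumes that quotes do not nest (so one variable is enough instead of a stack), that a backslash escapes the next character inside quotes, and that every ' or " outside quotes opens a quoted run (an apostrophe in a word such as "don't" would therefore be treated as an opening quote).

```javascript
// Hypothetical helper: the name and the `separator` parameter are assumptions,
// not something taken from the question or the answer.
function splitOutsideQuotes(input, separator = ';') {
  const parts = [];
  let current = '';
  let quote = null; // the quote character we are currently inside, or null

  for (let i = 0; i < input.length; i++) {
    const ch = input[i];

    if (quote !== null) {
      current += ch;
      if (ch === '\\' && i + 1 < input.length) {
        // Copy the escaped character verbatim so that \" does not close the quote.
        current += input[++i];
      } else if (ch === quote) {
        quote = null; // matching closing quote found
      }
    } else if (ch === '"' || ch === "'") {
      quote = ch; // entering a quoted run
      current += ch;
    } else if (ch === separator) {
      parts.push(current); // separator outside quotes ends the current part
      current = '';
    } else {
      current += ch;
    }
  }

  parts.push(current); // whatever follows the last separator
  return parts;
}

const sample = `Lorem ipsum; "dolor sit; amet"; consectetur 'adipiscing; elit.' Fusce; sit amet ligula.; Phasellus in laoreet quam.`;
console.log(splitOutsideQuotes(sample));
```

Unlike split(...).filter(Boolean), this sketch keeps empty parts (for example between two adjacent semicolons) and does not trim whitespace; add .map(p => p.trim()) if trimmed parts are wanted.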

1 Answer


You may use this regex:

((?:[^;'"]*(?:"(?:\\.|[^"])*"|'(?:\\.|[^'])*')[^;'"]*)+)|;


Code:

const s = `Lorem ipsum; "dolor sit; amet"; consectetur 'adipiscing; elit.' Fusce; sit amet ligula.; Phasellus in laoreet quam.`
const re = /((?:[^;'"]*(?:"(?:\\.|[^"])*"|'(?:\\.|[^'])*')[^;'"]*)+)|;/

console.log( s.split(re).filter(Boolean) )

RegEx Details:

  • (: Start capture group #1
    • (?:: Start outer non-capture group
      • [^;'"]*: Match 0 or more characters that are not ', " or ;
      • (?:: Start inner non-capture group
        • "(?:\\.|[^"])*": Match a double-quoted substring, skipping over backslash-escaped characters such as \"
        • |: OR
        • '(?:\\.|[^'])*': Match a single-quoted substring, skipping over backslash-escaped characters such as \'
      • ): End inner non-capture group
      • [^;'"]*: Match 0 or more characters that are not ', " or ;
    • )+: End outer non-capture group and repeat it 1 or more times
  • ): End capture group #1
  • |: OR
  • ;: Match a ;
  • .filter(Boolean): removes the empty strings and undefined entries that split produces (group #1 is undefined whenever the ; alternative matched)
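
As a small illustration of the escaped-quote handling described above, the regex can be wrapped in a helper that also trims each part. This is only a sketch: the name splitAndTrim and the trimming step are assumptions added here, not part of the original answer.

```javascript
const re = /((?:[^;'"]*(?:"(?:\\.|[^"])*"|'(?:\\.|[^'])*')[^;'"]*)+)|;/;

// Hypothetical wrapper: the name and the trimming step are assumptions.
function splitAndTrim(text) {
  return text
    .split(re)
    .filter(Boolean)               // drop '' and undefined entries
    .map((part) => part.trim());   // remove whitespace around each part
}

// String.raw keeps the backslash, so the input really contains \" inside the quotes.
const fields = splitAndTrim(String.raw`a; "b \" ; c"; d`);
console.log(fields); // three parts: a, "b \" ; c" and d
```

Note that filter(Boolean) also drops genuinely empty fields, so an input such as `a;;b` yields two parts rather than three.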
anubhava