I'd like to split a pipe-separated string into its individual fields, with support for escaped pipes, e.g.:

fielda|field b |field\|with\|pipe\|inside

would get me:

array("fielda", "field b ", "field|with|pipe|inside")

How would I reach that goal with a regular expression?

Alex Schenkel
    What language are you using? Chances are that the language already supports something like this for you with a CSV parsing function or module. – Andy Lester Jan 28 '15 at 21:09

4 Answers

Split on this pattern: (?<!\\)\|

See the demo. The negative lookbehind ensures that the | is not preceded by a \.

https://regex101.com/r/pM9yO9/15
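
As a rough illustration of the lookbehind split, here is a minimal JavaScript sketch. It assumes an engine with lookbehind support (ES2018 or later); the helper name splitEscapedPipes is hypothetical, and the unescaping step is an addition needed to reach the array from the question.

```javascript
// Hypothetical helper: split on pipes that are not preceded by a backslash,
// then turn each escaped pipe (\|) back into a plain pipe.
function splitEscapedPipes(input) {
  return input
    .split(/(?<!\\)\|/)
    .map((field) => field.replace(/\\\|/g, "|"));
}

const fields = splitEscapedPipes("fielda|field b |field\\|with\\|pipe\\|inside");
console.log(fields);
```

One limitation of this approach: the lookbehind only inspects a single preceding character, so a field that ends in a literal backslash directly before a delimiter would be read as containing an escaped pipe.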

vks

This should work too:

((?:[^\\|]+|\\\|?)+)

The regex captures everything up to an unescaped | (escaped \| sequences are kept in the match).
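
A small JavaScript sketch of how this pattern might be applied, assuming the global flag is used to collect every match. The names fieldRegex and matchFields are hypothetical.

```javascript
// The pattern from this answer, with the global flag to collect every field.
const fieldRegex = /((?:[^\\|]+|\\\|?)+)/g;

// Hypothetical helper: returns the raw fields, escape sequences still in place.
function matchFields(input) {
  return input.match(fieldRegex) || [];
}

console.log(matchFields("fielda|field b |field\\|with\\|pipe\\|inside"));
// Matching requires at least one character per field,
// so the empty middle field of "a||b" is not reported.
console.log(matchFields("a||b"));
```

Because this matches fields rather than splitting on delimiters, empty fields are skipped; if empty fields matter, a split-based approach may be the better fit.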

Enissay

Another way in PHP, using strtr to replace \| with a placeholder before splitting (this assumes the placeholder, # here, never occurs in the data):

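$str = 'field a|field b|field\|with\|pipe\|inside';
// Hide escaped pipes behind a placeholder so explode() ignores them
$str = strtr($str, array('\|' => '#'));
// Split on the remaining pipes, then turn each placeholder back into a pipe
$result = array_map(function ($i) {
    return strtr($i, '#', '|');
}, explode('|', $str));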
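
The placeholder idea only works if the placeholder never appears in the data. The following JavaScript sketch shows one way to guard against that; the helper name splitWithPlaceholder and the choice of the NUL character as placeholder are assumptions made for illustration.

```javascript
// Assumption: NUL is unlikely to occur in real field data.
const PLACEHOLDER = "\u0000";

// Hypothetical helper: same three steps as the PHP snippet, plus a safety check.
function splitWithPlaceholder(input) {
  if (input.includes(PLACEHOLDER)) {
    throw new Error("input already contains the placeholder character");
  }
  return input
    .split("\\|").join(PLACEHOLDER) // hide escaped pipes
    .split("|")                     // split on the remaining pipes
    .map((field) => field.split(PLACEHOLDER).join("|")); // restore pipes
}

console.log(splitWithPlaceholder("field a|field b|field\\|with\\|pipe\\|inside"));
```
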
Casimir et Hippolyte

You may match these items with

(?:[^\\|]|\\[\s\S])+

NOTE: To split on a character other than |, replace it in the negated character class, [^\\|]. You will have to escape ] (unless it is placed right after [^) and - (unless it is at the start or end of the class).
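
A possible way to build the pattern for an arbitrary delimiter is sketched below in JavaScript. The helper name fieldPattern is hypothetical, and the sketch assumes a single-character delimiter other than the backslash; it always escapes the characters that are special inside a character class, which is stricter than the positional rules above require.

```javascript
// Hypothetical helper: builds the field-matching regex for a given delimiter.
function fieldPattern(delimiter) {
  if (delimiter.length !== 1 || delimiter === "\\") {
    throw new Error("expected a single character other than a backslash");
  }
  // Escape characters that are special inside a character class: \ ] ^ -
  const escaped = delimiter.replace(/[\\\]^-]/g, "\\$&");
  return new RegExp(`(?:[^\\\\${escaped}]|\\\\[\\s\\S])+`, "g");
}

console.log("a;b\\;c;d".match(fieldPattern(";")));
console.log("x]y\\]z".match(fieldPattern("]")));
```
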

In many engines, [\s\S] can be replaced with a . if you pass the option that makes . match line breaks (often called dotall or single-line mode).
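
To illustrate the difference, here is a short JavaScript sketch using the s (dotAll) flag, which is available in ES2018 and later. The sample string, containing a backslash followed by a line break, is made up for the comparison.

```javascript
const withClass = /(?:[^\\|]|\\[\s\S])+/g;  // original pattern
const withDotAll = /(?:[^\\|]|\\.)+/gs;     // . matches line breaks too
const withoutDotAll = /(?:[^\\|]|\\.)+/g;   // . stops at line breaks

// Sample characters: a, backslash, newline, b, pipe, c
const sample = "a\\\nb|c";

console.log(sample.match(withClass));     // two fields
console.log(sample.match(withDotAll));    // same two fields
console.log(sample.match(withoutDotAll)); // the escaped newline splits the first field
```
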

JS demo:

console.log(
   "fielda|field b |field\\|with\\|pipe\\|inside".match(/(?:[^\\|]|\\[\s\S])+/g)
)
// =>  ["fielda", "field b ", "field\|with\|pipe\|inside"]
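
The matched fields still contain their backslashes. To get the unescaped values asked for in the question, one option is a second pass that drops each escaping backslash. The sketch below assumes a backslash escapes whatever character follows it, as the pattern implies; the helper name parseFields is hypothetical.

```javascript
// Hypothetical helper: match the fields, then remove the escaping backslashes.
function parseFields(input) {
  const raw = input.match(/(?:[^\\|]|\\[\s\S])+/g) || [];
  // Replace each backslash-plus-character pair with the character alone.
  return raw.map((field) => field.replace(/\\([\s\S])/g, "$1"));
}

console.log(parseFields("fielda|field b |field\\|with\\|pipe\\|inside"));
// An escaped backslash before a delimiter is handled as well:
console.log(parseFields("a\\\\|b"));
```
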
Wiktor Stribiżew