I want to split a string on a delimiter (e.g. |). My problem is that the string contains quoted fields, and the delimiter can also appear inside the quotes. How can I remove the delimiter when it appears inside a quoted field, instead of splitting on it?

The data looks like:

null|123456|xxx12345|123|-11234|123|2000-01-01|XXX|01|0.000000000000|0.000000000000|0.000000000000|"AAA |AAA Data Group (AAA Inc)"|null|2000-01-01|null|null|xx

val delimit = '|'
// A triple-quoted literal is needed because the data itself contains double quotes
val inputData = """null|123456|xxx12345|123|-11234|123|2000-01-01|XXX|01|0.000000000000|0.000000000000|0.000000000000|"AAA |AAA Data Group (AAA Inc)"|null|-|2000-01-01|-|null|null|xx"""
inputData.split(delimit).foreach(println)

I expected the result:

null
123456
xxx12345
123
-11234
123
2000-01-01
XXX
01
0.000000000000
0.000000000000
0.000000000000
"AAA AAA Data Group (AAA Inc)"
null
2000-01-01
null
null
xx

but the actual output is:

null
123456
xxx12345
123
-11234
123
2000-01-01
XXX
01
0.000000000000
0.000000000000
0.000000000000
"AAA 
AAA Data Group (AAA Inc)"
null
2000-01-01
null
null
xx
Krzysztof Atłasik

1 Answer

split accepts a regex, so you can use a lookahead that matches | only when it is followed by an even number of double quotes, i.e. only when it is not inside quotes:

val inputData = """null|123456|xxx12345|123|-11234|123|2000-01-01|XXX|01|0.000000000000|0.000000000000|0.000000000000|"AAA |AAA Data Group (AAA Inc)"|null|-|2000-01-01|-|null|null|xx"""

// Triple quotes keep the backslash and the embedded double quotes literal
inputData.split("""\|(?=([^"]*"[^"]*")*[^"]*$)""")

// Array(null, 123456, xxx12345, 123, -11234, 123, 2000-01-01, XXX, 01, 0.000000000000, 0.000000000000, 0.000000000000, "AAA |AAA Data Group (AAA Inc)", null, -, 2000-01-01, -, null, null, xx)
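To see what the lookahead does, here is a minimal sketch on a shorter input. The names outsideQuotes, sample and fields are made up for this illustration, and it assumes every quoted field is properly closed (an unbalanced quote would change which pipes match).

```scala
// Matches '|' only when an even number of double quotes follows it,
// i.e. when the pipe sits outside a quoted field.
val outsideQuotes = """\|(?=([^"]*"[^"]*")*[^"]*$)"""

// Hypothetical short sample: the middle field is quoted and contains a pipe.
val sample = """a|"b |c"|d"""

// The pipe inside the quotes is followed by one quote (odd), so it is not a split point.
val fields = sample.split(outsideQuotes)
fields.foreach(println)
```

The first and last pipes are followed by two and zero quotes respectively, so only those two are treated as delimiters.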

I've borrowed regex from this question.
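Note that this keeps the | inside the quoted field, while the question expects it to be removed. One possible way to get there, as a sketch: split first, then strip the delimiter from the fields that start with a quote. The names delimiter, pipeOutsideQuotes, line and cleaned are assumptions for this example, and line is a shortened version of the data from the question.

```scala
val delimiter = "|"
val pipeOutsideQuotes = """\|(?=([^"]*"[^"]*")*[^"]*$)"""

// Shortened version of the question's data (assumption: quotes are balanced).
val line = """null|123|"AAA |AAA Data Group (AAA Inc)"|null|xx"""

val cleaned = line
  .split(pipeOutsideQuotes)
  // Only quoted fields can still contain the delimiter; remove it there.
  .map(field => if (field.startsWith("\"")) field.replace(delimiter, "") else field)

cleaned.foreach(println)
```

If the data can end with empty fields, keep in mind that split drops trailing empty strings by default; passing a negative limit, as in split(pipeOutsideQuotes, -1), keeps them.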

Krzysztof Atłasik