how to properly split a tab delimited value into individual columns in ORACLE

Question

I'm working with this query:

SELECT
    REGEXP_SUBSTR ('test1   test2   test3', '[^\t]+', 1, 1) field1,
    REGEXP_SUBSTR ('test1   test2   test3', '[^\t]+', 1, 2) field2,
    REGEXP_SUBSTR ('test1   test2   test3', '[^\t]+', 1, 3) field3
FROM DUAL

I'm looking for each field to have each value. this is a tab-delimited string. But I'm getting random outputs to it.

Looking for:

field1    field2    field3
test1     test2     test3

Getting:

field1    field2    field3
es        es        es

what am I missing?

thanks

By the way and just FYI, the Posix pattern `'[^[:space:]]+'` works for both tabs and spaces(even carriage return, newline etc). — Kaushik Nayak, Mar 22 '19 at 17:26

score 4 · Accepted Answer · answered Mar 22 '19 at 17:17

[^\t]+ will match characters that are not a backslash and not a t character.

So

SELECT
    REGEXP_SUBSTR ('aaatbbb\ccc', '[^\t]+', 1, 1) field1,
    REGEXP_SUBSTR ('aaatbbb\ccc', '[^\t]+', 1, 2) field2,
    REGEXP_SUBSTR ('aaatbbb\ccc', '[^\t]+', 1, 3) field3
FROM DUAL

Will output:

FIELD1 | FIELD2 | FIELD3
:----- | :----- | :-----
aaa    | bbb    | ccc

Having split the string on \ and t

If you want to split it on a tab character then:

SELECT
    REGEXP_SUBSTR ('test1' || CHR(9) || 'test2' || CHR(9) || 'test3', '[^' || CHR(9) || ']+', 1, 1) field1,
    REGEXP_SUBSTR ('test1' || CHR(9) || 'test2' || CHR(9) || 'test3', '[^' || CHR(9) || ']+', 1, 2) field2,
    REGEXP_SUBSTR ('test1' || CHR(9) || 'test2' || CHR(9) || 'test3', '[^' || CHR(9) || ']+', 1, 3) field3
FROM DUAL

Which outputs:

FIELD1 | FIELD2 | FIELD3
:----- | :----- | :-----
test1  | test2  | test3

db<>fiddle here

Alex Poole · Answer 2 · 2019-03-22T17:37:15.780

Characters inside the square brackets are interpreted as literal characters, not escape characters - so you're matching anything except 't' or '\' rather than tabs.

You could embed an actual tab character using concatenation:

SELECT
    REGEXP_SUBSTR ('test1   test2   test3', '[^'||chr(9)||']+', 1, 1) field1,
    REGEXP_SUBSTR ('test1   test2   test3', '[^'||chr(9)||']+', 1, 2) field2,
    REGEXP_SUBSTR ('test1   test2   test3', '[^'||chr(9)||']+', 1, 3) field3
FROM DUAL;

FIELD FIELD FIELD
----- ----- -----
test1 test2 test3

You may have problems with adjacent tabs though:

SELECT
    REGEXP_SUBSTR ('test1       test3', '[^'||chr(9)||']+', 1, 1) field1,
    REGEXP_SUBSTR ('test1       test3', '[^'||chr(9)||']+', 1, 2) field2,
    REGEXP_SUBSTR ('test1       test3', '[^'||chr(9)||']+', 1, 3) field3
FROM DUAL;

FIELD FIELD FIELD3
----- ----- ------
test1 test3

A safer pattern is:

SELECT
    REGEXP_SUBSTR ('test1       test3', '(.*?)('||chr(9)||'|$)', 1, 1, null, 1) field1,
    REGEXP_SUBSTR ('test1       test3', '(.*?)('||chr(9)||'|$)', 1, 2, null, 1) field2,
    REGEXP_SUBSTR ('test1       test3', '(.*?)('||chr(9)||'|$)', 1, 3, null, 1) field3
FROM DUAL;

FIELD FIELD2 FIELD
----- ------ -----
test1        test3

db<>fiddle

Sorry Alex. Great answer but MT0 said it first. I'll give you a +1 tho. — arcee123, Mar 22 '19 at 17:19
I'm used to MTO beating me *8-) I've added a note about adjacent tabs (empty elements, I suppose), which may or may not be relevant to you. — Alex Poole, Mar 22 '19 at 17:26

how to properly split a tab delimited value into individual columns in ORACLE

2 Answers2