3

I'm using this following code in Oracle pl/sql (Version: Oracle Database 11g Release 11.2.0.1.0)

select regexp_substr('A~B~C','[^~]+',1,level) output
from dual
connect by level <= length(regexp_replace('A~B~C','[^~]+')) + 1

which gives the following results

row1: A
row2: B
row3: C

That's perfect, however should I want to give a null value, ie:

select regexp_substr('~B~C','[^~]+',1,level) output
from dual
connect by level <= length(regexp_replace('~B~C','[^~]+')) + 1

I expected and wanted the following:

row1: <null>
row2: B
row3: C

but got this output:

row1: B
row2: C
row3: null

Am I doing the pl/sql code wrong? How can I make it work right?

APC
  • 144,005
  • 19
  • 170
  • 281
Joshua
  • 173
  • 2
  • 3
  • 12

5 Answers5

4

you can combine INSTR und SUBSTR to achive the desiered result:

select  
  str, 
  replace(substr(str, 
                 case level 
                 when 1 then 0 
                 else instr( str, '~',1, level-1) 
                 end 
                  +1,
                 1
                ), '~')
from ( select 'A~B~C~D~E' as str from dual)
connect by level <= length(regexp_replace(str,'[^~]+')) + 1
;
schurik
  • 7,798
  • 2
  • 23
  • 29
  • This works well, I'm incorporating it in my pipeline function. Hopefully I'll get the results I expected. Cheers. – Joshua Feb 12 '13 at 13:37
  • Unfortunately it only works on single characters. Is there a way the code can be amended to cover words, 'How~do~I~deal~with~this'? – Joshua Feb 12 '13 at 17:58
2

You can use NULLS FIRST in the ORDER BY clause

select regexp_substr('~B~C','[^~]+',1,level) output
from dual
connect by level <= length(regexp_replace('~B~C','[^~]+')) + 1
ORDER BY regexp_substr('~B~C','[^~]+',1,level) NULLS FIRST;

To quote from Oracle documentation

If the null ordering is not specified then the handling of the null values is:

NULLS LAST if the sort is ASC

NULLS FIRST if the sort is DESC

If neither ascending nor descending order is specified, and the null ordering is also not specified, then both defaults are used and thus the order will be ascending with NULLS LAST.

Incognito
  • 2,964
  • 2
  • 27
  • 40
  • Close, but it doesn't work, because I don't want ordering. If you try using this string 'A~~C', maybe using connect isn't the right approach? – Joshua Feb 12 '13 at 10:37
  • Oh, so you want to preserve the occurrence order of the string in the input! Yes, in that case my solution will not work. – Incognito Feb 12 '13 at 10:40
  • I am not that even if you use something other than `CONNECT`, you will be able to preserve the order of your input string. Reason being, NULLs are handled in the way I mentioned in my answer above – Incognito Feb 12 '13 at 10:42
  • If I understand by your comments, NULLS will always play a part in ordering, then Connect By isn't a suitable approach? Maybe I should explore Arrays - I'm using pipeline function, so array may work? – Joshua Feb 12 '13 at 10:47
2

Because I used single characters as example in my question, but in practicality I'm using long strings in my project, for example 'How~do~I~do~this'

I came across this solution from OTN Oracle.com, thanks to chris227.

SELECT CAST(REGEXP_SUBSTR (str, '(.*?)(~|$)', 1, level, null, 1) AS CHAR(12))  output
FROM (select 'How~do~I~do~this' as str from dual)
CONNECT BY level <= regexp_count(str, '~') + 1;

This will work even with single characters.

Hope this will help others looking for similair solutions.

Joshua
  • 173
  • 2
  • 3
  • 12
0

I know that it is late now, but I think that what you (and I) needed was this:

select REPLACE(regexp_substr('A~~C','[^~]*(~)?',1,level),'~') output, level
from dual
connect by level <= length(regexp_replace('A~~C','[^~]+')) + 1
ORDER BY level;
OhBeWise
  • 5,350
  • 3
  • 32
  • 60
0

Since this problem was fresh on my mind and I happened to see this post, I respectfully submit my suggestion which builds on user3767503's answer. It reduces the number of function calls needed. It uses some 11g updates to regexp_substr() and makes use of regexp_count() which I believe was also introduced in 11g. It assumes the number of fields is the number of delimiters plus one.

select regexp_substr('AAA~X~C~~DD~~~E', '([^~]*)(~|$)', 1, level, null, 1) output, level
from dual
connect by level <= regexp_count('AAA~X~C~~DD~~~E','~') + 1 
ORDER BY level;

See this post for more info and detail on reading the regex pattern: Split comma seperated values to columns.

The bottom line is the commonly used regex pattern of '[^<delimiter>]+' to parse a string fails when there is a null in the list and should be avoided, IMHO.

Gary_W
  • 9,933
  • 1
  • 22
  • 40