0

I used to use below query to convert comma delimited string to rows -

select regexp_substr('A,B,C,D','[^,]+', 1, level) from dual
connect by regexp_substr('A,B,C,D', '[^,]+', 1, level) is not null;

But, now my delimiter is - '~^'

I am not able to use same query for this delimiter.

select regexp_substr('A~^B~^C~D^E','[^~^]+', 1, level) from dual
    connect by regexp_substr('A~^B~^C~D^E', '[^~^]+', 1, level) is not null;

I expect -

Column

A
B
C~D^E

Please help

OR May be is it possible to get nth element if delimiter provided is ~^

Best Regards

Anirudh D
  • 181
  • 3
  • 14
  • 33
  • Post some sample input data please – Matt Aug 14 '17 at 10:44
  • A~^B~^C~D^E your input? – Matt Aug 14 '17 at 10:44
  • yes...that would be my input – Anirudh D Aug 14 '17 at 10:46
  • 1
    Why are you doing this in the database? I could argue that your data was not sufficiently scrubbed to warrant being brought into Oracle. – Tim Biegeleisen Aug 14 '17 at 10:47
  • This is going to be my input to procedure....In fact actually, it would be multiple SMS delimited by ~^ . Thats why I am using this special delimiter as there is high chance that comma and other characters can be used in SMS body – Anirudh D Aug 14 '17 at 10:50
  • I strongly agree with Tim, such scrubbing operations can be done easily in java/python, c etc. Data in the tables need to be stored in a way it is readable. It is not a good design of the application. Better to handle it in the SMPP – Kaushik Nayak Aug 14 '17 at 11:57

3 Answers3

1

I can't do it with regexp functions - too hard! How about this bespoke function?

create or replace function test (p_str varchar2, p_delim varchar2)
  return SYS.KU$_VCNT -- handy table of VARCHAR2(4000) in SYS
is
  l_str long := p_str;
  l_ret SYS.KU$_VCNT := SYS.KU$_VCNT();
begin
  while instr (l_str, p_delim) > 0 loop
    l_ret.extend;
    l_ret(l_ret.count) := substr(l_str, 1, instr (l_str, p_delim)-1);
    l_str := substr (l_str, instr (l_str, p_delim)+2);
  end loop;
  if l_str is not null then
    l_ret.extend;
    l_ret(l_ret.count) := l_str;
  end if;
  return l_ret;
end;

Then:

select * from table (test('A~^B~^C~D^E','~^'));

A
B
C~D^E

nth element (e.g. 2nd):

select column_value from
(select column_value, rownum n from table (test('A~^B~^C~D^E','~^')))
where n = 2;
Tony Andrews
  • 129,880
  • 21
  • 220
  • 259
1

Riffing off this method for splitting a string while allowing for nulls:

select regexp_substr('A~^B~^C~D^E','(.*?)(~\^|$)', 1, level, null, 1) from dual
connect by level < regexp_count('A~^B~^C~D^E','(.*?)(~\^|$)');

REGEXP_SUBS
-----------
A
B
C~D^E

As in the linked answer it looks for any characters, non-greedily, followed by the combination of ~^ (with the caret escaped, so ~\^) or end-of-line. The regexp_substr() calls also uses the optional arguments to specify subexpr - so it only gets the first grouping (.*?) and not the delimiter itself, which is in the second grouping.

If you want a specific element then that's even closer to the linked post:

select regexp_substr('A~^B~^C~D^E','(.*?)(~\^|$)', 1, 3, null, 1) from dual;

REGEX
-----
C~D^E

Or as you're doing this in a procedure, use the connect-by query to populate a collection, and then pick out the element you need, if you'll be looking at more than one.

Alex Poole
  • 183,384
  • 11
  • 179
  • 318
  • Thanks a lot Alex. One more question- Is it possible to just get particular element. for example say 5th element – Anirudh D Aug 14 '17 at 11:58
  • I got it filtering on level column...Thanks Alex – Anirudh D Aug 14 '17 at 12:02
  • @AnirudhD - if you only need one then only pick that one out with the regex, extracting them all and then filtering to just get one seems a bit wasteful. But maybe you mean one at a time, in which case a collection might be appropriate; or you might actually want a cursor loop. – Alex Poole Aug 14 '17 at 12:09
  • Is it possible to achive in the same query by giving nth element? – Anirudh D Aug 14 '17 at 12:11
  • What do you mean? You could do `... from dual where level = 3 connect by ...` but what would be the point? – Alex Poole Aug 14 '17 at 12:14
  • sorry my bad!..I dint check your comment for 3rd position. Thanks a lot Alex! – Anirudh D Aug 14 '17 at 12:19
0

Hmmm . . . This comes very close:

select regexp_substr('A~^B~^C~D^E', '([^~]|[~][^^])+', 1, level) from dual
    connect by regexp_substr('A~^B~^C~D^E', '([^~]|[~][^^])+', 1, level) is not null

You can get rid of the ^ by doing:

select regexp_replace(regexp_substr('A~^B~^C~D^E', '([^~]|[~][^^])+', 1, level), '^\^', '') from dual
    connect by regexp_substr('A~^B~^C~D^E', '([^~]|[~][^^])+', 1, level) is not null
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786