
I want to be able to split CSV strings in Oracle 9i.

I've read the following article http://www.oappssurd.com/2009/03/string-split-in-oracle.html

But I didn't understand how to make this work. Here are some of my questions pertaining to it:

  1. Would this work in Oracle 9i, and if not, why not?
  2. Is there a better way of splitting CSV strings than the solution presented above?
  3. Do I need to create a new type? If so, do I need specific privileges for that?
  4. Can I declare the type within the function?

5 Answers


Joyce,

Here are three examples:

1) Using dbms_utility.comma_to_table. This is not a general-purpose routine, because the elements have to be valid identifiers. With some dirty tricks we can make it work more universally:

SQL> declare
  2    cn_non_occuring_prefix constant varchar2(4) := 'zzzz';
  3    mystring varchar2(2000):='a:sd:dfg:31456:dasd: :sdfsdf'; -- just an example
  4    l_tablen binary_integer;
  5    l_tab    dbms_utility.uncl_array;
  6  begin
  7    dbms_utility.comma_to_table
  8    ( list   => cn_non_occuring_prefix || replace(mystring,':',','||cn_non_occuring_prefix)
  9    , tablen => l_tablen
 10    , tab    => l_tab
 11    );
 12    for i in 1..l_tablen
 13    loop
 14      dbms_output.put_line(substr(l_tab(i),1+length(cn_non_occuring_prefix)));
 15    end loop;
 16  end;
 17  /
a
sd
dfg
31456
dasd

sdfsdf

PL/SQL procedure successfully completed.

2) Using SQL's connect by level. If you are on 10g or higher you can use the connect-by-level approach in combination with regular expressions, like this:

SQL> declare
  2    mystring varchar2(2000):='a:sd:dfg:31456:dasd: :sdfsdf'; -- just an example
  3  begin
  4    for r in
  5    ( select regexp_substr(mystring,'[^:]+',1,level) element
  6        from dual
  7     connect by level <= length(regexp_replace(mystring,'[^:]+')) + 1
  8    )
  9    loop
 10      dbms_output.put_line(r.element);
 11    end loop;
 12  end;
 13  /
a
sd
dfg
31456
dasd

sdfsdf

PL/SQL procedure successfully completed.

3) Again using SQL's connect by level, but now in combination with good old SUBSTR/INSTR in case you are on version 9, like you are:

SQL> declare
  2    mystring varchar2(2000):='a:sd:dfg:31456:dasd: :sdfsdf'; -- just an example
  3  begin
  4    for r in
  5    ( select substr
  6             ( str
  7             , instr(str,':',1,level) + 1
  8             , instr(str,':',1,level+1) - instr(str,':',1,level) - 1
  9             ) element
 10        from (select ':' || mystring || ':' str from dual)
 11     connect by level <= length(str) - length(replace(str,':')) - 1
 12    )
 13    loop
 14      dbms_output.put_line(r.element);
 15    end loop;
 16  end;
 17  /
a
sd
dfg
31456
dasd

sdfsdf

PL/SQL procedure successfully completed.

You can see some more techniques like these in this blog post: http://rwijk.blogspot.com/2007/11/interval-based-row-generation.html

Hope this helps.

Regards, Rob.


To address your comment:

An example of inserting the separated values into a normalized table.

First create the tables:

SQL> create table csv_table (col)
  2  as
  3  select 'a,sd,dfg,31456,dasd,,sdfsdf' from dual union all
  4  select 'a,bb,ccc,dddd' from dual union all
  5  select 'zz,yy,' from dual
  6  /

Table created.

SQL> create table normalized_table (value varchar2(10))
  2  /

Table created.

Because you seem interested in the dbms_utility.comma_to_table approach, I mention it here. However, I certainly do not recommend this variant, because of the identifier quirks and because of the slow row-by-row processing.

SQL> declare
  2    cn_non_occuring_prefix constant varchar2(4) := 'zzzz';
  3    l_tablen binary_integer;
  4    l_tab    dbms_utility.uncl_array;
  5  begin
  6    for r in (select col from csv_table)
  7    loop
  8      dbms_utility.comma_to_table
  9      ( list   => cn_non_occuring_prefix || replace(r.col,',',','||cn_non_occuring_prefix)
 10      , tablen => l_tablen
 11      , tab    => l_tab
 12      );
 13      forall i in 1..l_tablen
 14        insert into normalized_table (value)
 15        values (substr(l_tab(i),length(cn_non_occuring_prefix)+1))
 16      ;
 17    end loop;
 18  end;
 19  /

PL/SQL procedure successfully completed.

SQL> select * from normalized_table
  2  /

VALUE
----------
a
sd
dfg
31456
dasd

sdfsdf
a
bb
ccc
dddd
zz
yy


14 rows selected.

I do recommend this single SQL variant:

SQL> truncate table normalized_table
  2  /

Table truncated.

SQL> insert into normalized_table (value)
  2   select substr
  3          ( col
  4          , instr(col,',',1,l) + 1
  5          , instr(col,',',1,l+1) - instr(col,',',1,l) - 1
  6          )
  7     from ( select ',' || col || ',' col from csv_table )
  8        , ( select level l from dual connect by level <= 100 )
  9    where l <= length(col) - length(replace(col,',')) - 1
 10  /

14 rows created.

SQL> select * from normalized_table
  2  /

VALUE
----------
a
a
zz
sd
bb
yy
dfg
ccc

31456
dddd
dasd

sdfsdf

14 rows selected.

Regards, Rob.

Rob van Wijk
  • I think using dbms_utility.comma_to_table is good. This is slightly off topic. How would I make this run over a column which is full of these csv values and insert them all into a new table? Sorry, I'm very very new to Oracle. Appreciate it! Joyce – Joyce Jul 07 '09 at 16:25
  • I added a section to the answer to address your comment. – Rob van Wijk Jul 08 '09 at 09:00
  • Hi Rob, the solution #2 [Using SQL's connect by level] works great for me. But now I have a requirement of passing 2 CSV strings [equal in length] as input to a PL/SQL stored proc, and I need to insert the values from these two CSV strings into two different columns in the table. Could you please let me know how to go about it? – Jimmy Oct 26 '10 at 13:14
  • Number 2 is great – Spongebob Comrade Oct 31 '18 at 23:42

Here's a string tokenizer for Oracle that's a little more straightforward than that page, but I have no idea if it's as fast:

create or replace function splitter_count(str in varchar2, delim in char) return int as
  val int;
begin
  -- the padded copy grows by one character per delimiter,
  -- so the length difference is the number of delimiters in str
  val := length(replace(str, delim, delim || ' '));
  return val - length(str);
end;
/

create or replace type token_list is varray(100) of varchar2(200);
/

-- splits str on delim and returns the pieces as a token_list varray
CREATE or replace function tokenize (str varchar2, delim char) return token_list as
ret token_list;
target int;
i int;
this_delim int;
last_delim int;
BEGIN
  ret := token_list();
  i := 1;
  last_delim := 0;
  target := splitter_count(str, delim);
  while i <= target
  loop
    ret.extend();
    this_delim := instr(str, delim, 1, i);
    ret(i):= substr(str, last_delim + 1, this_delim - last_delim -1);
    i := i + 1;
    last_delim := this_delim;
  end loop;
  ret.extend();
  ret(i):= substr(str, last_delim + 1);
  return ret;
end;
/

You can use it like this:

select tokenize('hi you person', ' ') from dual;
VARCHAR(hi,you,person)
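
If you need the tokens as rows rather than as a collection (for example to insert them into another table), the varray can be unnested with TABLE(). A minimal sketch using the same example call; on very old versions you may need to wrap it as table(cast(tokenize(...) as token_list)):

select column_value as token
  from table(tokenize('hi you person', ' '));
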
Michael Sofaer
  • This is a very good method, still applicable even up to 11g at least. Such a method is fast because you do not restart your parsing every time from position zero, but continue parsing from where you left off. I have developed some code handling it a little differently, posted on [my blog on Parsing a string with a CSV into multiple columns](http://hiflitetm.wordpress.com/2013/11/04/parsing-a-string-with-a-csv-into-multiple-columns/). It is somewhat similar, but I went the route with a pipelined function, and an odd way of using a cartesian join. I do have results in individual columns. – YoYo Nov 24 '13 at 07:32

It sounds like you don't want to add schema objects (types, functions). One SQL-only way to parse the delimited text is to 'go crazy' with instr and substr calls:

DECLARE
  V_CSV_STRING  VARCHAR2(100);
  V_FIRST_NAME  VARCHAR2(100);
  V_LAST_NAME   VARCHAR2(100);
  V_MIDDLE_INIT VARCHAR2(100);
BEGIN
  -- Create a test delimited list of first_name, last_name, middle_init
  V_CSV_STRING := 'Brian,Hart,M';

  -- inside PL/SQL the query needs an INTO clause to receive the parsed values
  select substr( V_CSV_STRING||',', 1, instr(V_CSV_STRING,',')-1 ),
         substr( V_CSV_STRING||',,', instr( V_CSV_STRING||',,', ',') +1,
                 instr( V_CSV_STRING||',,', ',', 1, 2 )-instr(V_CSV_STRING||',,',',')-1 ),
         rtrim(substr( V_CSV_STRING||',,', instr( V_CSV_STRING||',,',',',1,2)+1),',')
    into V_FIRST_NAME, V_LAST_NAME, V_MIDDLE_INIT
    from dual;

  dbms_output.put_line(V_FIRST_NAME||' '||V_MIDDLE_INIT||' '||V_LAST_NAME);
END;
/
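
Since the point of this approach is that it needs no schema objects, the same expressions can also be run as plain SQL over a table. Here is a minimal sketch, assuming a hypothetical table people_csv with a column csv_col holding 'first,last,mi' values:

select substr( csv_col||',', 1, instr(csv_col,',')-1 ) first_name,
       substr( csv_col||',,', instr( csv_col||',,', ',') +1,
               instr( csv_col||',,', ',', 1, 2 ) - instr( csv_col||',,', ',') - 1 ) last_name,
       rtrim(substr( csv_col||',,', instr( csv_col||',,', ',', 1, 2)+1 ), ',') middle_init
  from people_csv;  -- hypothetical table and column names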

If you're looking to formalize a structure and add the appropriate application code (functions, views, types, etc.), I would take a look at Tom Kyte's writing on this subject.

Brian
  • Yes, I've read his article http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:2189860818012 – Joyce Jul 07 '09 at 12:49

You might want to be a bit clearer on what you want to do, then we can give you a specific answer. Showing some of your code is always helpful :)

If you are using parameters to split a string of CSV numbers (e.g. 1,2,3,4) and then use that in an IN statement, have a look at the function str2tbl() in Question 670922. With a few changes you could change it to a VARCHAR2 or whatever you need (see the sketch after the example below).

In the following you could set :sMyCatagories equal to '1,2,3,4'

create or replace type myTableType as table of number;
/

create or replace function str2tbl( p_str in varchar2 ) return myTableType
  as
     l_str   long default p_str || ',';
     l_n        number;
     l_data    myTableType := myTabletype();
  begin
      loop
          l_n := instr( l_str, ',' );
          exit when (nvl(l_n,0) = 0);
          l_data.extend;
          l_data( l_data.count ) := ltrim(rtrim(substr(l_str,1,l_n-1)));
          l_str := substr( l_str, l_n+1 );
      end loop;
      return l_data;
  end;
/

and using it in a select statement....

SELECT
  *
FROM
  atable a
WHERE
  a.category in (
        -- TABLE() unnests the collection returned by str2tbl
        select column_value
          from table( cast( str2tbl(:sMyCatagories) as myTableType ) )
  );
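
As mentioned above, a few changes give you a VARCHAR2 version. A minimal sketch, with hypothetical names myVc2TableType and str2vc2tbl:

create or replace type myVc2TableType as table of varchar2(4000);
/

create or replace function str2vc2tbl( p_str in varchar2 ) return myVc2TableType
as
   l_str   long default p_str || ',';
   l_n     number;
   l_data  myVc2TableType := myVc2TableType();
begin
    loop
        l_n := instr( l_str, ',' );
        exit when (nvl(l_n,0) = 0);
        l_data.extend;
        l_data( l_data.count ) := ltrim(rtrim(substr(l_str,1,l_n-1)));
        l_str := substr( l_str, l_n+1 );
    end loop;
    return l_data;
end;
/

It is used the same way, e.g. a.category in (select column_value from table(cast(str2vc2tbl(:sMyCatagories) as myVc2TableType))).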

This is really only useful if you are using parameters. If you are munging together SQL in your application, then just use a normal IN statement.

SELECT 
  *
FROM
  atable a 
WHERE 
  a.category in (1,2,3,4);
Mark Nold
  • Basically, I just wanted to unpivot CSV data, and wanted to see if I can do it without needing to create a new type, since I don't have the privilege. – Joyce Jul 07 '09 at 12:52

I used this in the end:

-- pipelined function: returns the comma-separated elements of p_list, one per row
create or replace function split
(
   p_list varchar2
) return sys.dbms_debug_vc2coll pipelined
is
   l_idx    pls_integer;
   l_list   varchar2(32767) := p_list;
   l_value  varchar2(32767);
begin
   loop
       l_idx := instr(l_list,',');
       if l_idx > 0 then
           pipe row(substr(l_list,1,l_idx-1));
           l_list := substr(l_list,l_idx+length(','));
       else
           pipe row(l_list);
           exit;
       end if;
   end loop;
   return;
end split;
/

declare
  cursor c is
    select occurrence_num, graphics
      from supp
     where graphics is not null
       and graphics not like ' %';
begin
  for r in c loop
    insert into image (photo_id, report_id, filename)
    select image_key_seq.nextval photo_id
         , r.occurrence_num      report_id
         , t.column_value        filename
      from table(split(cast(r.graphics as varchar2(1000)))) t
     where t.column_value is not null;
  end loop;
end;
/
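
For a quick standalone check of the split function, a query like this (just an example string) should return one row per element:

select column_value
  from table(split('a,sd,dfg,31456,dasd'));
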
Joyce