0

I've got a table with 2 fields (ID & Profile).

Table has lots of records, but is unique by ID.
Profile is 1000 characters long, and each 10 characters is a basically a Profile code.
Some ID's may have just 1 code, whilst others have many.

I don't know why the table was designed like this.

So a record might look something like:

ID              PROFILE
BOB             BM        BS        DM        FM        IC        IC6       IL        IM        JN101     MM        XC        XM        XR
BILL            ZZ        XY               

Is there a way to split the Profile field, making new rows

So it would look like:

ID              PROFILE
BOB             BM
BOB             BS
BOB             DM
...
BILL            ZZ
BILL            XY

I'd really like to be able to create a View somehow.

Mark Barinstein
  • 11,456
  • 2
  • 8
  • 16
Anson
  • 27
  • 1
  • 6
  • Platform and version of Db2 are important to note. With Db2 for IBM i v7.3 and higher, IBM has included a `SPLIT()` [function](https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_74/rzajq/rzajqudfsplit.htm) for you... – Charles Feb 24 '20 at 17:58
  • We're using Db2 for IBM i v7.3 - I'll look into SPLIT(). Thanks. – Anson Feb 24 '20 at 21:25

3 Answers3

4

Try this:

DB2 for LUW

/*
WITH TAB (ID, PROFILE) AS 
(
VALUES 
  ('BOB',  'BM        BS        DM        FM        IC        IC6       IL        IM        JN101     MM        XC        XM        XR')
, ('BILL', 'ZZ        XY')
)
*/
SELECT T.ID, V.TOK
FROM TAB T
, xmltable
(
'for $id in tokenize($s, " +") return <i>{string($id)}</i>' 
passing T.PROFILE as "s" 
columns 
  tok varchar(10) path '.'
) V;

dbfiddle example.

DB2 for LUW & IBM i

IBM i doesn't support (at least up to 7.4) FLWOR XQuery expressions, so you need to construct an XML document.

SELECT T.ID, V.TOK
FROM 
  TAB T
, XMLTABLE 
(
'$doc/d/i'
PASSING XMLPARSE(DOCUMENT '<d><i>' || regexp_replace(trim(T.PROFILE), '  +', '</i><i>') || '</i></d>') as "doc"
COLUMNS 
  TOK VARCHAR(10) PATH '.'
) V;
Community
  • 1
  • 1
Mark Barinstein
  • 11,456
  • 2
  • 8
  • 16
1

Tried using split(). It seems very slow. Below takes 14.817 sec.

WITH TAB (ID, PROFILE) AS 
(
VALUES 
  ('BOB',  cast('BM        BS        DM        FM        IC        IC6       IL        IM        JN101     MM        XC        XM        XR' as char(1000))), 
  ('BILL', cast('ZZ        XY' as char(1000)))
)
SELECT a.id, b.element as profile
from  tab a, 
      TABLE (systools.split(a.profile , ' ')) b
where b.element <> '';

Compared with Joachim's answer taking only 134 ms!

WITH TAB (ID, PROFILE) AS 
(
VALUES 
  ('BOB',  cast('BM        BS        DM        FM        IC        IC6       IL        IM        JN101     MM        XC        XM        XR' as char(1000))), 
  ('BILL', cast('ZZ        XY' as char(1000)))
)
SELECT a.id, b.item as PROFILE
  from tab a
      ,XMLTABLE('$doc/items/item'
          PASSING XMLPARSE(DOCUMENT CAST(   '<items><item>'
                                         || replace(a.profile , ' ' , '</item><item>')
                                         || '</item></items>' as CLOB
                                        )
                           ) as "doc"
          COLUMNS
          ITEM VARCHAR(255) PATH '.'
      ) b 
WHERE b.item <> '';
Anson
  • 27
  • 1
  • 6
0

There is the SYSTOOLS.SPLIT function but she's behaving very strangely... I don't know what's going on... Very slow and buggy with long string as parameter !?!

CHECK> SELECT * FROM TABLE (SYSTOOLS.SPLIT('4182511-S_4182701-X_4182702-X_4182703-X_4182796-S' , '_'))
[2020-10-22 17:04:59] 0 rows retrieved in 86 ms (execution: 51 ms, fetching: 35 ms)
[2020-10-22 17:04:59] [0F001][-423] [SQL0423] Pointeur SQLP_L2.VWORK incorrect.
[2020-10-22 17:04:59] [0F001][-423] [SQL0423] Pointeur SQLP_L2.VWORK incorrect.
[2020-10-22 17:30:29] 5 rows retrieved starting from 1 in 255 ms (execution: 41 ms, fetching: 214 ms)
CHECK> SELECT * FROM TABLE (SYSTOOLS.SPLIT('4182511-S_4182701-X_4182702-X_4182703-X_4182796-S' , '_'))
[2020-10-22 17:30:29] 5 rows retrieved starting from 1 in 251 ms (execution: 39 ms, fetching: 212 ms)
CHECK> SELECT * FROM TABLE (SYSTOOLS.SPLIT('4182511-S_4182701-X_4182702-X_4182703-X_4182796-S' , '_'))
[2020-10-22 17:30:30] 5 rows retrieved starting from 1 in 253 ms (execution: 39 ms, fetching: 214 ms)
CHECK> SELECT * FROM TABLE (SYSTOOLS.SPLIT('4182511-S_4182701-X_4182702-X_4182703-X_4182796-S' , '_'))

DB2 for IBM i

Here is my custom SPLIT function from this one

CREATE OR REPLACE FUNCTION UTILS.Split (String VARCHAR(32000), SEP VARCHAR(5))
RETURNS TABLE (ID INT, VALUE VARCHAR(256))
  LANGUAGE SQL
  DISALLOW PARALLEL
  DETERMINISTIC
  NOT FENCED
  RETURN

  WITH CTE (ID,StartString,StopString) AS
  (
    SELECT 1 AS ID, 1 AS StartString, LOCATE(SEP, String) AS StopString
    FROM SYSIBM.SYSDUMMY1
    WHERE LENGTH(SEP) > 0 AND LENGTH(String) > 0
    UNION ALL
    SELECT ID + 1, StopString + LENGTH(SEP), LOCATE(SEP, String, StopString + LENGTH(SEP))
    FROM CTE
    WHERE StopString > 0
  )
  SELECT
    ID,
    SUBSTRING(String,StartString,
      CASE WHEN StopString = 0
      THEN LENGTH(String)
      ELSE StopString-StartString END
    )
  FROM CTE;

CHECK> SELECT * FROM TABLE (UTILS.SPLIT('4182511-S_4182701-X_4182702-X_4182703-X_4182796-S' , '_'))
[2020-10-22 17:30:32] 5 rows retrieved starting from 1 in 50 ms (execution: 36 ms, fetching: 14 ms)
CHECK> SELECT * FROM TABLE (UTILS.SPLIT('4182511-S_4182701-X_4182702-X_4182703-X_4182796-S' , '_'))
[2020-10-22 17:30:33] 5 rows retrieved starting from 1 in 63 ms (execution: 38 ms, fetching: 25 ms)
T.Rex
  • 111
  • 1
  • 13