0

I need one help in SQL to generate XML file.

I have one table which has one column with multiline row data.

Table Name: table1

Column Name: col1

1 Row Data is like below:

VALUE1
VALUE2

VALUE3
VALUE4

I need to generate below XML file.

<col1>
    <value>VALUE1</value>
    <value>VALUE2</value>
    <value>.</value>
    <value>VALUE3</value>
    <value>VALUE4</value>
</col1>

I have used below query to format the row data.

select 
        level,
        case 
            when regexp_substr(a.col_trimmed_value, '[^' || CHR(10) || ']+',  1, level) is not null
                THEN substr(regexp_substr(a.col_trimmed_value, '[^' || CHR(10) || ']+',  1, level), 1, 200)
            ELSE '.'
            END as ROW_VALUE
from (
    select
        col1              as col_original_value
        ,replace(rtrim(ltrim(replace(col1,CHR(10),'#s1p@2l3t#'),'#s1p@2l3t#'),'#s1p@2l3t#'),'#s1p@2l3t#',CHR(10)) as col_trimmed_value
    from
        table1
    )a
connect by level <=  length ( a.col_trimmed_value ) - length(replace( a.col_trimmed_value, CHR(10)) ) + 1 
;

Expected Result :

LEVEL   ROW_VALUE
==================
1   VALUE1
2   VALUE2
3   .
4   VALUE3
5   VALUE4

Actual Result :

LEVEL   ROW_VALUE
==================
1   VALUE1
2   VALUE2
3   VALUE3
4   VALUE4
5   .

Can someone please help me to solve this issue?

Thank you in advance.

1 Answers1

0

You can use a regular expression pattern that handles empty elements:

select 
        level,
        case 
            when regexp_substr(a.col_trimmed_value, '(.*?)(' || CHR(10) || '|$)', 1, level, null, 1) is not null
                THEN substr(regexp_substr(a.col_trimmed_value, '(.*?)(' || CHR(10) || '|$)', 1, level, null, 1), 1, 200)
            ELSE '.'
            END as ROW_VALUE
from (
...

You can also simplify that a bit so you only apply the regular expression once:

select 
        level,
        coalesce(
          substr(regexp_substr(a.col_trimmed_value, '(.*?)(' || CHR(10) || '|$)', 1, level, null, 1), 1, 200),
          '.')
        as ROW_VALUE
from (
...
LEVEL ROW_VALUE
1 VALUE1
2 VALUE2
3 .
4 VALUE3
5 VALUE4

db<>fiddle

If you end up with exactly the XML you showed in the question then the order doesn't matter; the child <value> nodes are unordered. But you might be adding the level as an attribute or something.


From the output you added as an answer, it looks like you might have CHR(13) in the value too (i.e. CRLF, not just LF). Changing the regex pattern ought to fix that:

regexp_substr(a.col_trimmed_value, '(.*?)(' || CHR(10) || CHR(13) || '?|$)', 1, level, null, 1), 1, 200)

db<>fiddle with and without that change.

Alex Poole
  • 183,384
  • 11
  • 179
  • 318