2

I've got a column in a table which contains xml type of data but is in a varchar format. An example raw of data is:

  <old_template_code>BEVA30D</old_template_code><old_template_name>Beva
 30D France calls capped
 15p</old_template_name><new_template_code>BEVA24M</new_template_code><new_template_name>Beva 24M France Calls Capped 15p</new_template_name>

I wonder what regex expression do I have to use to retrieve for example 'BEVA30D' from <old_template_code>?

I've tried

          REGEXP_SUBSTR(table.column,
            '<old_template_code>*</old_template_code>') "REGEXPR_SUBSTR"

but it doesn't work.

Maheswaran Ravisankar
  • 17,652
  • 6
  • 47
  • 69
george
  • 3,102
  • 5
  • 34
  • 51

2 Answers2

5

Forget about regexp. Use the native XMLType functionality ...

select extractValue(xmlparse(content T.column), '/old_template_code')
from table T;

In an example based on your sample XML data:

with source$ as (
    select q'{
<old_template_code>BEVA30D</old_template_code><old_template_name>Beva
 30D France calls capped
 15p</old_template_name><new_template_code>BEVA24M</new_template_code><new_template_name>Beva 24M France Calls Capped 15p</new_template_name>
}' as source_xml
    from dual
)
select extractValue(xmlparse(content source_xml), '/old_template_code')
from source$;

gives me the value of

BEVA30D

Enjoy.

PS: Moreover, forget about stackoverflow.com, use Oracle documentation. :-)

peter.hrasko.sk
  • 4,043
  • 2
  • 19
  • 34
  • but I need to extract the data from a column in a table. That was just an example row. I need to select table.column from table and get the value – george Nov 12 '14 at 22:18
  • I believe that you **are able** to get an example code snippet and adapt it to your needs. But if not, ... Answer updated. – peter.hrasko.sk Nov 12 '14 at 22:21
  • 1
    Nice! +1 for the example. One possible issue though is the varchar field that's supposed to contain valid XML will almost certainly have some non-xml garbage that might cause an XML parser issues. – tbone Nov 13 '14 at 02:14
4

XML approach is definitely the better one always.

Still, A regular expression approach.

select regexp_replace(regexp_substr(table.column,'<old_template_code>.*</old_template_code>'),
                      '(<old_template_code>)(.*)(</old_template_code>)',
                       '\2')
from table;
Maheswaran Ravisankar
  • 17,652
  • 6
  • 47
  • 69
  • 1
    I accept this answer simply because the xml parser fails after the 150 row probably because some of the xml data is incorrect. However, this approach did not fail. – george Nov 12 '14 at 22:40
  • I started out of interest to crack this with simple regex. I know the cons as well. – Maheswaran Ravisankar Nov 12 '14 at 23:45
  • @user3793865, in your question you are not mentioning anything about possibly garbage'd data in your XML column, nor are you giving us an example of such garbage'd data. – peter.hrasko.sk Nov 13 '14 at 07:16