0

I need to split message:

 500 Oracle Parkway.Redwood Shores.*.=13

Now I have a bit worked solution for Substr1/2/4

  SELECT '500 Oracle Parkway.Redwood Shores.*.=13' string1,
  REGEXP_SUBSTR('500 Oracle Parkway.Redwood Shores.*.=13','.[^.]+') 
  "SUBSTR1" ,
  replace(REGEXP_SUBSTR('500 Oracle Parkway.Redwood Shores.*.=13','[$.]+
  [^.]+'),'.',null) "SUBSTR2" ,
  REGEXP_SUBSTR('500 Oracle Parkway.Redwood Shores.*.=13','[$.]+.[$.]+[^.]') 
  "SUBSTR3" ,
  REGEXP_SUBSTR('500 Oracle Parkway.Redwood Shores.*.=13','[^=]+$') 
  "SUBSTR4" 
  FROM DUAL;

However Substr3 contains '='. I'd like to have at least '.*.' or ' * '

Could you please give me a hint how to "exclude" any characters (e.g. '=') in regexp?

Any help is much appreciated!

Thank you

Resolved see SUBSTR3.1

      SELECT
     '500 Oracle Parkway.Redwood Shores.*.=13' string1,
      REGEXP_SUBSTR('500 Oracle Parkway.Redwood Shores.*.=13','.[^.]+') 
      "SUBSTR1" ,
      replace(REGEXP_SUBSTR('500 Oracle Parkway.Redwood Shores.*.=13','[$.]+
      [^.]+'),'.',null) "SUBSTR2" ,
      REGEXP_SUBSTR('500 Oracle Parkway.Redwood Shores.*.=13','[$.]+.[$.]+
      [^.]') "SUBSTR3" ,
      REGEXP_SUBSTR('500 Oracle Parkway.Redwood Shores.*.=13','[^.]+',1,3) 
      "SUBSTR3.1" ,
      REGEXP_SUBSTR('500 Oracle Parkway.Redwood Shores.*.=13','[^=]+$') 
      "SUBSTR4" 
      FROM DUAL;
27P
  • 1,183
  • 16
  • 22
  • You're explicitly including the `=` (or any character following `.*.`) with the final part of the pattern, `[^.]`. So just removing that would give you a quick fix. But it would be helpful to explain more about what you're trying to do, and give more example source strings and the output you're trying to achieve for all of them. `substr2` is null in your example. Are you really trying to tokenize the string, and them (maybe) remove the `=` from the final token? – Alex Poole May 10 '17 at 09:52

2 Answers2

2

With much respect to Alex Poole, the regex of the format '[^.]+' fails if one of the elements of the list is missing. It will silently return incorrent data. Please use this form instead. Note I removed the city from the first example. Try it and you may be surprised:

with t (str) as (
  select '500 Oracle Parkway..*.=13' from dual union 
  select 'One Microsoft Way.Redmond.Washington.=27' from dual
)
select str,
  regexp_substr(str, '(.*?)(\.|$)', 1, 1, NULL, 1) as substr1,
  regexp_substr(str, '(.*?)(\.|$)', 1, 2, NULL, 1) as substr2,
  regexp_substr(str, '(.*?)(\.|$)', 1, 3, NULL, 1) as substr3,
  ltrim(regexp_substr(str, '(.*?)(\.|$)', 1, 4, NULL, 1), '=') as substr4
from t;

See here for more info: Split comma separated values to columns in Oracle

Community
  • 1
  • 1
Gary_W
  • 9,933
  • 1
  • 22
  • 40
1

It looks like you're trying to tokenize your source string based on periods, and them (maybe) remove the leading equals sign from the fourth token. The solution you've used for your 'substring3.1' can be used for all of them:

with t (str) as (
  select '500 Oracle Parkway.Redwood Shores.*.=13' from dual
  union all select 'One Microsoft Way.Redmond.Washington.=27' from dual
)
select str,
  regexp_substr(str, '[^.]+', 1, 1) as substr1,
  regexp_substr(str, '[^.]+', 1, 2) as substr2,
  regexp_substr(str, '[^.]+', 1, 3) as substr3,
  ltrim(regexp_substr(str, '[^.]+', 1, 4), '=') as substr4
from t;

STR                                      SUBSTR1              SUBSTR2              SUBSTR3    SUBSTR4
---------------------------------------- -------------------- -------------------- ---------- -------
500 Oracle Parkway.Redwood Shores.*.=13  500 Oracle Parkway   Redwood Shores       *          13     
One Microsoft Way.Redmond.Washington.=27 One Microsoft Way    Redmond              Washington 27     
Alex Poole
  • 183,384
  • 11
  • 179
  • 318
  • @Alex Poole Sorry Alex. :-) I'm on a one-man mission to get people to stop using the `'[^.]+'` string parsing regex format ever since I found the error mentioned in my post in a production report. I have since encapsulated the code into a utility function where developers can pass a string and the number of the element they want and the function returns the element, hiding the regex altogether for simplicity.I posted it someplace here. Join me in spreading the word! – Gary_W May 10 '17 at 14:17
  • @Gary_W - I remember that being brought up before, but this pattern is so common I forget to look for the safer one (or I'm too lazy if I think it might not matter... seems like it does here). I've bookmarked the answer you linked to now *8-) – Alex Poole May 10 '17 at 14:25