1

Suppose I have a string (varchar2) and I want to add a space wherever I have two consecutive a's. So for example: 'graanh' -> 'gra anh'.

OK, this is trivial to do, either with replace or regexp_replace. But both choke on three or more consecutive a's. For example:

SQL> select replace('aaaaa', 'aa', 'a a') from dual;

REPLACE
-------
a aa aa

SQL> select regexp_replace('aaaaa', 'aa', 'a a') from dual;

REGEXP_
-------
a aa aa

This is because the search for the "next" occurrence of the match pattern begins after the end of the "previous" one. Indeed, regexp_count('aaaaa', 'aa') returns 2, not 4.

The desired result is 'aaaaa' -> 'a a a a a'.

The problem, really, is that I don't want to replace the pattern 'aa'. What I do want to replace is the empty string BETWEEN a's on both sides of the empty string. So the description would be something like: Find all occurrences of an empty string with a's on both sides of it, and replace that empty string with a space.

Can this be done with string or regexp functions? Please do not offer solutions with recursive queries, or (the horror!) PL/SQL procedures; I know they can be done, but I am interested in whether there is a "natural" solution, a tool that exists precisely for this question, that I am missing.

Worst case, replace(replace('aaaaa', 'aa', 'a a'), 'aa', 'a a') will do the trick. Is this the best case as well?

Lest you think this is a weird requirement: I am experimenting with different methods of splitting comma-separated strings. Some of the methods have a very nasty habit of ignoring NULL tokens within such strings (that is, from two consecutive commas you don't get a NULL, as you should). So, before using those methods, one would have to parse the input string and add a space, or the string 'NULL', or something, between consecutive commas. Of course, it would be better to add a space after EVERY comma (and at the beginning of the string), and after splitting into tokens, to remove a space from the beginning of each token; but this raised the question I asked above, which I think has merit on its own.

Thank you!

EDIT (13 April 2020) I learned a lot about regular expressions since I first posted this question. Now I know the terminology for this. (I find that the explanation I gave is correct, I just wasn't using technical terms for it.) What we need here, to be able to add spaces everywhere they are needed in a single pass, is lookarounds - and Oracle regular expressions don't support them. END EDIT

  • Ha! When I went to find this post as an FYI to mention a method of parsing a string that allows for nulls, I noticed you edited it. :-) http://stackoverflow.com/a/31464699/2543416 – Gary_W Jul 25 '16 at 14:50
  • What about building your own built-in? with java not PL/SQL of course? – J. Chomel Oct 03 '16 at 14:34
  • @J.Chomel - I could do that (after I learn Java); question is, would it be any more efficient than that double call to `replace`? Somehow I suspect the context switching will be more expensive than calling `replace` twice. Moreover, my main purpose with this question was to learn if there were some string functions or features or techniques that I was missing. It's several months later, I've learned more, and I believe the answer to that more general question is NO. Thanks! –  Oct 03 '16 at 14:48

1 Answers1

2

If you truly find PL/SQL the horror and recursive queries unnatural, then you probably have to go with replace(replace('aaaaa', 'aa', 'a a'), 'aa', 'a a'), as you already have proposed, albeit as "worst case". Why you consider this the worst case remains unclear to me.

René Nyffenegger
  • 39,402
  • 33
  • 158
  • 293
  • To be clear: I use recursive queries all the time, and there is at least one case I know of (implementing Eratosthenes' algorithm) where PL/SQL works a lot better than anything plain-SQL. I meant FOR THIS PROBLEM a recursive CTE would be unnatural, and PL/SQL unnecessary (since there's no way it would allow a simpler/better solution than the "worst case" I already mentioned). I just didn't want to see solutions with FOR loops for this. If you can suggest a recursive CTE or PL/SQL solution simpler than replace(replace()), by all means please suggest it. –  Jul 23 '16 at 11:21
  • I consider it the "worst case" in the sense that any offered solution should be no worse than it (than the "worst case"). The "best case" would be a regexp (or similar) solution that searches for a pattern and allows replacing a sub-pattern within that pattern, as I explained in my post. Sorry if that wasn't clear, I did my best to explain it but perhaps failed. –  Jul 23 '16 at 11:23