0

I need to remove all spaces and special characters between the numbers in the text field in Oracle and Netezza.

Input:
[Any text 00 00 111 1   2222 ,?/!@#$ 33333 any text 123,. 45]. 

Output:
[Any text 0000111222233333 any text 123.45]

Thanks!

Barbaros Özhan
  • 59,113
  • 10
  • 31
  • 55
svs
  • 11
  • 3
  • Does this answer your question? [Regex pattern inside SQL Replace function?](https://stackoverflow.com/questions/21378193/regex-pattern-inside-sql-replace-function) – Beefster Dec 17 '20 at 20:04
  • In Python or Java, I would use positive LOOKAROUND RedEx functions like LOOKBIHIND and LOOKAHEAD: (?<=\d)\W+(?=\d) to remove all non-numeric value only between numbers preserving the rest. I use Oracle view (not stored procedures) and, therefore, have to use RegEx look around functions supported in Oracle and Netezza to isolate the change areas preserving all other spaces and special characters in the TEXT. When I use RegEx LOOKBIHIND and LOOKAHEAD (?<=\d)\W+(?=\d) in REGEX_REPLACE(), nothing happens. – svs Dec 18 '20 at 20:43

1 Answers1

2

You can use

SELECT REGEXP_REPLACE(col,'[^0-9]') AS new_col
  FROM tab

or [:digit:] posix such as

SELECT REGEXP_REPLACE(col,'[^[:digit:]]') AS new_col
  FROM tab

in order to remove all non-digit characters including whitespaces.

Update : Depending your added request

SELECT ID,
       col,       
       CASE
         WHEN REGEXP_INSTR(col, '[[:digit:]]') != 1 THEN
          REPLACE(REGEXP_SUBSTR(col, '[^[:digit:][:punct:]]+'), ' ')
       END || 
       REGEXP_REPLACE(col, '[^0-9]*([0-9]+|$)', '\1') ||
       REGEXP_SUBSTR(REGEXP_REPLACE(col, '[[:punct:] ]'), '[^[:digit:]]+$') AS new_col
  FROM tab

would remove any non-digit character starting from the first digit upto the last one, and leaves the external parts unchanged.

Demo

Barbaros Özhan
  • 59,113
  • 10
  • 31
  • 55
  • Thank you for your response! Sorry, I am new to this website and maybe did not publish my question correctly. I have a text field in the Oracle table (varchar(8000)), which contains the free format text with text (characters) and digits. All spaces and special characters found between the numbers should be removed, keeping the rest of the text unchanged. In Java, regex would be (?<=\d)\W+(?=\d|\-). But it does not work in Oracle. – svs Dec 18 '20 at 02:06
  • Hi @svs. I've updated the answer depending on your current need. Btw, do you mean a text column of CLOB data type, since that length can not be specified for `VARCHAR` data type. – Barbaros Özhan Dec 18 '20 at 09:06
  • 1
    Thank you for your response, but '[^0-9]' or '[^[:digit:]]' patterns in the REGEX_REPLACE function will delete all non-numeric values in the TEXT. I need to remove only spaces and special characters (not letters) only between numbers keeping them between letters and words. RegEx LOOKBIHIND and LOOKAHEAD (?<=\d)\W+(?=\d) do the trick but not in Oracle query. – svs Dec 18 '20 at 20:49
  • I viewed the update and Oracle queries in DEMO, and it is not exactly what I meant. From the INPUT text 'any characters 000 11111.,-2222@#$3333!?4444# any @#$ characters!?' I should get OUTPUT 'any characters 00011111222233334444# any @#$ characters!?' – svs Dec 18 '20 at 21:02
  • Your new workaround is an interesting and very useful solution. Thank you for this idea! And yet, I do need to preserve all spaces and special characters surrounding numbers. – svs Dec 19 '20 at 01:28
  • I made here a more realistic sample: Input: 'Invoice #1 , 0 . 0 for the amount $1 2 ,* 3 has been processed!' Output: 'Invoice #100 for the amount of $123 has been processed!' – svs Dec 19 '20 at 01:28
  • Please note that all white spaces in this string (text) are preserved, and after number 100, the words are preserved as well. Such numeric value insertions could be anywhere in the text field. – svs Dec 19 '20 at 01:29
  • Thank you again! Please let me know how to view your last update. In Demo and in the sample you provided above, I cannot find the new code. – svs Dec 19 '20 at 17:22