1

I would like to truncate all characters in a column, no matter where they are.

Example: "+49123/4567890(testnumber)"

Should be changed to "491234567890"

Is there a way without doing a replace for each char?

I have tried to replace it with several, but it is very time-consuming.

Martin Smith
  • 438,706
  • 87
  • 741
  • 845
MrUnknown
  • 15
  • 5
  • 1
    `'4912'` etc are all characters too, why aren't they being replaced? If you trim *all* the characters from a string, you just end up with an empty string (`''`). – Thom A Nov 01 '22 at 10:29
  • sorry, i mean that only digits are left over – MrUnknown Nov 01 '22 at 10:31
  • Do you know *what* other characters can appear? Is it, for example, just letters, plus (`+`), parentheses (`()`) and a forward slash (`/`)? – Thom A Nov 01 '22 at 10:32
  • Does this answer your question? [Fastest way to remove non-numeric characters from a VARCHAR in SQL Server](https://stackoverflow.com/questions/106206/fastest-way-to-remove-non-numeric-characters-from-a-varchar-in-sql-server)? Or alternatively [Keep only allowed characters in a string](https://stackoverflow.com/questions/74196809/keep-only-allowed-characters-in-a-string) – Thom A Nov 01 '22 at 10:34
  • no, not really, however the following would be enough for me to cut out: [a-zA-z()/+] – MrUnknown Nov 01 '22 at 10:37
  • What is your SQL Server `@@Version` ? – Stu Nov 01 '22 at 10:50
  • Microsoft SQL Server 2017 (RTM-GDR) (KB4583456) - 14.0.2037.2 (X64) – MrUnknown Nov 01 '22 at 10:57

3 Answers3

2

As you mentioned, if you are expecting only [a-zA-z()/+], you can use the translate function which is available from 2017+

declare @table TABLE (str varchar(max))
insert into @table
select '+49123/4567890(estnumber)'

select replace(translate(str, '/+()abcdefghijklmnopqrstuvwxyz', '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'), '~', '') digits
from @table

For more complex scenarios where the characters are not known, you can try using recursive CTE on a string column to extract only digits like following query.

;with cte
as (
    select v.txt originalstring
        ,v.txt
        ,convert(varchar(max), '') as digits
        ,1 as lev
    from (
        values ('+49123/4567890(testnumber)')
            ,('+234&*#$%!@#')
        ) v(txt)
    
    union all
    
    select originalstring
        ,stuff(txt, 1, 1, '')
        ,(
            case 
                when left(txt, 1) LIKE '[0-9]'
                    then digits + left(txt, 1)
                else digits
                end
            )
        ,lev + 1
    from cte
    where txt > ''
    )
select originalstring
    ,digits
from (
    select c.originalstring
        ,c.digits
        ,row_number() over (partition by c.originalstring order by lev desc
            ) rn
    from cte c
    ) t
where rn = 1

Output

originalstring                    digits
---------------                   --------
+234&*#$%!@#                      234
+49123/4567890(testnumber)        491234567890
PSK
  • 17,547
  • 5
  • 32
  • 43
1

A set-based option that exists in SQL Server 2017+ is to utilise translate.

You can hopefully adapt the following to your specific use-case:

select col, Replace(Translate(col, r, Replicate('*', Len(r))), '*', '') Newcol
from t
cross apply(values(' ABCDEFGHIJKLMNOPQRSTUVWXYZ/\+()'))r(r);

Example DB<>Fiddle

Stu
  • 30,392
  • 6
  • 14
  • 33
0

Instead of hardcoding the list of "bad" characters you can use a double TRANSLATE to first get the unwanted characters and then plug that back into TRANSLATE.

DECLARE @table TABLE
  (
     str VARCHAR(max)
  )

INSERT INTO @table
SELECT '+49123/4567890(testnumber)    '

DECLARE @CharactersToKeep VARCHAR(30) = '0123456789'

SELECT REPLACE(TRANSLATE(str, bad_chars, REPLICATE('X', LEN(bad_chars + 'X') - 1)), 'X', '')
FROM   @table
       CROSS APPLY (SELECT REPLACE(TRANSLATE(str, @CharactersToKeep, REPLICATE(LEFT(@CharactersToKeep, 1), LEN(@CharactersToKeep))), LEFT(@CharactersToKeep, 1), '')) ca(bad_chars) 
Martin Smith
  • 438,706
  • 87
  • 741
  • 845