34

I have a large database in which I want to do a part string search. The user will enter characters: JoeBloggs.

For arguments sake if I had a name Joe 23 Blo Ggs 4 in the database. I want to remove everything in the name other than A-Z.

I have the REPLACE(Name, ' ','') function to remove spaces and the UPPER() function to capitalize the name.

Is there a more efficient fast way maybe by terms of regex to replace anything other than A-Z. I cannot change the values in the database.

SecretAgentMan
  • 2,856
  • 7
  • 21
  • 41
CR41G14
  • 5,464
  • 5
  • 43
  • 64
  • You say you can't change the data. Can you add a computed column to the existing table? Or add a new table with a foreign key and your computed value? – Laurence Nov 05 '12 at 21:04
  • @Laurence - Yea I suppose I can do this but can we do this in a temporary table in a stored procedure? If not then I can request this computed column, all i need is the function to do the replace. Thanks for your prompt response – CR41G14 Nov 05 '12 at 21:07
  • I'd not recommend using regexps if performance matters. The way you use is obvious but still a good one! – vyakhir Nov 05 '12 at 21:13
  • @DmitryVyakhirev - There could be some rouge characters like ' " @ /, it is a messy database so will I need to do multiple REPLACES to get rid of every instance of a bad character – CR41G14 Nov 05 '12 at 21:19

11 Answers11

50

1st option -

You can nest REPLACE() functions up to 32 levels deep. It runs fast.

REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE (@str, '0', ''),
'1', ''),
'2', ''),
'3', ''),
'4', ''),
'5', ''),
'6', ''),
'7', ''),
'8', ''),
'9', '')

2nd option -- do the reverse of -

Removing nonnumerical data out of a number + SQL

3rd option - if you want to use regex

then http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=27205

Brien Foss
  • 3,336
  • 3
  • 21
  • 31
Jatin
  • 1,668
  • 2
  • 16
  • 23
23

This one works for me

CREATE Function [dbo].[RemoveNumericCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin

    Declare @NumRange as varchar(50) = '%[0-9]%'
    While PatIndex(@NumRange, @Temp) > 0
        Set @Temp = Stuff(@Temp, PatIndex(@NumRange, @Temp), 1, '')

    Return @Temp
End

and you can use it like so

SELECT dbo.[RemoveNumericCharacters](Name) FROM TARGET_TABLE
Ben Anderson
  • 7,003
  • 4
  • 40
  • 40
7

Try below for your query. where val is your string or column name.

CASE WHEN PATINDEX('%[a-z]%', REVERSE(val)) > 1
                THEN LEFT(val, LEN(val) - PATINDEX('%[a-z]%', REVERSE(val)) + 1)
            ELSE '' END
Sutirth
  • 922
  • 10
  • 14
5

I just started relearning SQL...and I read this somewhere:

SELECT TRIM('0123456789' FROM customername )

You can also use this to remove spaces or other symbols as well. Just include it in the first parameter within the quotes ''.

bad_coder
  • 11,289
  • 20
  • 44
  • 72
Earl Alcoseba
  • 51
  • 1
  • 1
2

One more approach using Recursive CTE..

declare @string varchar(100)
set @string ='te165st1230004616161616'

;With cte
as
(
select @string as string,0  as n
union all
select cast(replace(string,n,'') as varchar(100)),n+1
from cte
where n<9
)
select top 1 string from cte
order by n desc


**Output:**   
  test
TheGameiswar
  • 27,855
  • 8
  • 56
  • 94
2
DecLARE @input varchar(50)= 'Joe 23 Blo Ggs 4'
    SELECT TRANSLATE(@input,'0123456789','          ') As output1,
            REPLACE
            (
                TRANSLATE(@input,'0123456789','          ') /*10 spaces here*/
                , ' '
                ,''
            ) as output2  
  • we can use the TRANSLATE function. State the list of chars to be removed as a continous array and then with what they have to be removed with – Vaibhav Kumar Jun 11 '21 at 17:12
  • This is a good one. It is equivalent of using multiple nested REPLACE functions on a string. You can also use TRIM to remove leading and trailing spaces for output2. ie., TRIM(TRANSLATE(@input,'0123456789',' ')) as output2 – SujaiSparks Jun 27 '22 at 08:38
1

Quoting part of @Jatin answer with some modifications,

use this in your where statement:

    SELECT * FROM .... etc.
        Where 
         REPLACE
        (REPLACE
        (REPLACE
        (REPLACE
        (REPLACE
        (REPLACE
        (REPLACE
        (REPLACE
        (REPLACE
        (REPLACE (Name, '0', ''),
        '1', ''),
        '2', ''),
        '3', ''),
        '4', ''),
        '5', ''),
        '6', ''),
        '7', ''),
        '8', ''),
        '9', '') = P_SEARCH_KEY
sameh.q
  • 1,691
  • 2
  • 23
  • 48
1

Entire SELECT returns string: $P@M from Python. Multiple spaces can be removed using this approach.

SELECT 
    REPLACE
    (
        TRANSLATE
        (
            '$P@M 1244 from93 Python', -- Replace each char from this string
            '123456789',               -- that is met in this set with
            '000000000'                -- the corresponding char from this set.
        ), 
        '0',  -- Now replace all '0'
        ''    -- with nothing (void) (delete it).
    ) AS CLEARED;
endusol
  • 61
  • 1
  • 3
0

Remove everything after first digit (was adequate for my use case):

LEFT(field,PATINDEX('%[0-9]%',field+'0')-1)

Remove trailing digits:

LEFT(field,len(field)+1-PATINDEX('%[^0-9]%',reverse('0'+field))
RF1991
  • 2,037
  • 4
  • 8
  • 17
Aaron West
  • 187
  • 2
  • 5
0

You can use PATINDEX not like ^ alpha numeric - 0-9A-Za-z, keep looping while there are non alpha numeric characters in your string. Within the loop search for the non alpha numeric characters one at a time using the same PATINDEX logic. Once the non alpha numeric character is located inside each loop iteration, use the built in STUFF function to replace your unwanted character with an empty string. Return cleaned string value from your function once outside the WHILE loop.

Create FUNCTION dbo.RemoveNonAlphaNum (@strin VARCHAR(255)) RETURNS VARCHAR(255) AS BEGIN

declare @disposeChar int

WHILE (PATINDEX('%[^ 0-9A-Za-z]%', @strin) > 0)

BEGIN

SELECT @disposeChar = PATINDEX('%[^ 0-9A-Za-z]%', @strin)

SET @strin = STUFF(@strin, @disposeChar, 1, '')

END

RETURN @strin

END


Remove the a-z if you want to remove only numeric

Sample usage

select dbo.RemoveNonAlphaNum('600/7015/8&&££---)))')

AndrewN
  • 1
  • 1
  • Remember that Stack Overflow isn't just intended to solve the immediate problem, but also to help future readers find solutions to similar problems, which requires understanding the underlying code. This is especially important for members of our community who are beginners, and not familiar with the syntax. Given that, **can you [edit] your answer to include an explanation of what you're doing** and why you believe it is the best approach? – Jeremy Caney Jun 13 '23 at 00:40
-1

Not tested, but you can do something like this:

Create Function dbo.AlphasOnly(@s as varchar(max)) Returns varchar(max) As
Begin
  Declare @Pos int = 1
  Declare @Ret varchar(max) = null
  If @s Is Not Null
  Begin
    Set @Ret = ''
    While @Pos <= Len(@s)
    Begin
      If SubString(@s, @Pos, 1) Like '[A-Za-z]'
      Begin
        Set @Ret = @Ret + SubString(@s, @Pos, 1)
      End
      Set @Pos = @Pos + 1
    End
  End
  Return @Ret
End

The key is to use this as a computed column and index it. It doesn't really matter how fast you make this function if the database has to execute it against every row in your large table every time you run the query.

Laurence
  • 10,896
  • 1
  • 25
  • 34
  • very slow solution, fast is required – CR41G14 Nov 05 '12 at 21:36
  • If you insist on doing `Select name from test where dbo.AlphasOnly(name) = 'JoeBloggs'` then this will be slow even if the function takes 0 time. You still pay the cost of reading all the rows off the disk (assuming when you say large table you mean your database doesn't fit in memory). To make it fast you need to find a way of indexing it, which is why I suggest a computed column. – Laurence Nov 05 '12 at 22:45