1

I have a field in a table containing different IDs for different programmes like this:

ProgrammeID
-----------
Prog201604L
Prog201503L
Pro2015N
Pro2014N
Programme2010
Programme2011

Each programme ID has its meaning. The number in the mid of the string indicates the time or month. It is obvious that Prog201604L and Prog201503L indicate the same programme but in different years (so do the rest). What I want to do is to remove the numbers so after removal the programmeID will be like:

ProgrammeID
-----------
ProgL
ProgL
ProN
ProN
Programme
Programme

Then later I can aggregate this programmes together.

I am currently using SSMS 2012 not sure if there is a sql statement like RegEx. I have been searching for a long time but the solution online are mainly about Oracle and MySQL. What I found is PATINDEX() and it seems to support regular expression. Can anybody tell me how to create a pattern that suits my situation and what kind of statement I should use?

Thanks in advance

Lambo
  • 857
  • 3
  • 14
  • 39

6 Answers6

1

You can create a function and pass the value of each row to function as (just run this query)

Create Function [dbo].[RemoveNonAlphaCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin

    Declare @KeepValues as varchar(50)
    Set @KeepValues = '%[^a-z]%'
    While PatIndex(@KeepValues, @Temp) > 0
        Set @Temp = Stuff(@Temp, PatIndex(@KeepValues, @Temp), 1, '')

    Return @Temp
End

---Call it like this:

Declare @tbl table (ProgrammeID varchar(20))
insert into @tbl values ('ProgL'),('ProgL'),('ProN'),('ProN'),('Programme'),('Programme')

select * from @tbl
Select dbo.RemoveNonAlphaCharacters(ProgrammeID) from @tbl

How to strip all non-alphabetic characters from string in SQL Server?

Remove numbers from string sql server

Community
  • 1
  • 1
Ajay2707
  • 5,690
  • 6
  • 40
  • 58
  • thanks for the reply. I'm pretty new to database, so just wondering if I only have a read only access to database, will I be able to create a function? If I'm able to create a function, is the function stored in database forever once I created? Since the SQL I wrote will eventually be used in SSRS, so just want to know how would a function effects that? – Lambo Jun 26 '16 at 22:44
1

One clever option is to take the substring of the ProgrammeID column from the left, until hitting the first number, and concatenate that with the reverse of the substring from the right until hitting the first number:

SELECT
    SUBSTRING(ProgrammeID,
              1,
              PATINDEX('%[0-9]%', ProgrammeID) - 1) +
    REVERSE(SUBSTRING(REVERSE(ProgrammeID),
              1,
              PATINDEX('%[0-9]%', REVERSE(ProgrammeID)) - 1))
FROM yourTable
Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
1

I have created a user-defined function for SQL Server to remove non-numeric characters in a string expression

We can modify it to remove the opposite, numeric characters from the input string as follows

 while patindex('%[0-9]%', @str) > 0
  set @str = stuff(@str, patindex('%[0-9]%', @str), 1, '')
 return @str

I hope it helps

Eralper
  • 6,461
  • 2
  • 21
  • 27
  • Hi @Eralper is it possible to integrate this while loop into a select statement (If I'm not going to create a function)? – Lambo Jun 26 '16 at 23:02
1

If the Number part is always 6 characters below can be used.

DECLARE @ProgrammeID VARCHAR(50) = 'Prog201604L'
SELECT STUFF(@ProgrammeID, PATINDEX( '%[0-9]%', @ProgrammeID), 6, '')

If the numbers are not fixed... to extend above

CREATE TABLE #Programme ( ProgrammeID VARCHAR(50) )

INSERT INTO #Programme
VALUES
    ('Prog201604L')
    ,('Pro2015N')
    ,('Programme2010')
    ,('Prog2016L')
    ,('Pro2N')
    ,('Prog')
    ,('2010')

SELECT ProgrammeID, 
    ISNULL(
            STUFF(ProgrammeID, 
                    PATINDEX( '%[0-9]%', ProgrammeID),  -- get number start index   
                    IIF(PATINDEX( '%[0-9][a-z]%',ProgrammeID)= 0, PATINDEX( '%[0-9]',ProgrammeID), PATINDEX( '%[0-9][a-z]%',ProgrammeID)) + 1   -- get the last number index 
                    - PATINDEX( '%[0-9]%', ProgrammeID),    -- get the number character length
                     '')
        ,ProgrammeID)           -- Where there are no numbers in the string you will get Null, replace it with actual string 
        AS [Without Numbers]
FROM #Programme

this will handle cases with varying numbers and even string without number.

Hope this helps

Jatin Patel
  • 2,066
  • 11
  • 13
  • Hi Jatin Patel, thanks for your response. I copied and paste your code into two database one works and the other one gave me an error like iif requires 3 arguments, but obviously it has 3 arguments already. Wonder what the condition is using iif? (I have full access to one of the databases, read only to anther. Does that matter?) – Lambo Jun 27 '16 at 00:28
  • @Lambo I have tried it with a read only login an its executed without error. anyways try changing the IIF condition with CASE WHEN and check if it works for you. – Jatin Patel Jun 27 '16 at 05:58
  • Thanks Jatin, I think I have to use CASE WHEN. Thanks for your suggestion. It works well now. – Lambo Jun 27 '16 at 22:25
1

Alan Burstein wrote an iTVF exactly for this. The function is called PatExclude8K. Here is the function definition (some comments removed):

CREATE FUNCTION dbo.PatExclude8K
(
    @String VARCHAR(8000),
    @Pattern VARCHAR(50)
) 
/*******************************************************************************
 Purpose:
 Given a string (@String) and a pattern (@Pattern) of characters to remove, 
 remove the patterned characters from the string.
*******************************************************************************/
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH
E1(N) AS (SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N)),
itally(N) AS 
(
  SELECT TOP(CONVERT(INT,LEN(@String),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
  FROM E1 T1 CROSS JOIN E1 T2 CROSS JOIN E1 T3 CROSS JOIN E1 T4
) 
SELECT NewString =
((
  SELECT SUBSTRING(@String,N,1)
  FROM iTally
  WHERE 0 = PATINDEX(@Pattern,SUBSTRING(@String COLLATE Latin1_General_BIN,N,1))
  FOR XML PATH(''),TYPE
).value('.[1]','varchar(8000)'));
GO

And here is how you would use it:

SELECT *
FROM #Programme p
CROSS APPLY dbo.PatExclude8K(p.ProgrammeID, '[0-9]');

Using your sample data, here is the result:

ProgrammeID          NewString
-------------------- -----------------
Prog201604L          ProgL
Prog201503L          ProgL
Pro2015N             ProN
Pro2014N             ProN
Programme2010        Programme
Programme2011        Programme
Felix Pamittan
  • 31,544
  • 7
  • 41
  • 67
0

I created this solution building on a solution to extracting values from a comma separated list inside a string. It seems to work find and even be a bit more effective than using while - I will be happy for feedback about that assumption, though.

On on table with 461.358 rows it takes 3 minutes and 27 seconds to do this (0.44 ms per row) (I put it into a function).

select count(*)
from Mytable
where dbo.StripNumeric(inputFromUser) is null

Here's the solutions For stripping away numeric:

declare @input nvarchar(max) = null
select @input = '1a2    3b4' + char(13) + char(10) + '5(678)*90c'
DECLARE @output nvarchar(max) = '';
    WITH cte AS
        (
          SELECT cast(1 as  int) as [index]
            UNION ALL
            SELECT [index]+ 1 as [index]
          from cte
          where [index] < len(@input)  
        )
        select @output =  iif(PATINDEX('%[0-9]%', substring(@input, [index], 1))= 1, @output, @output + substring(@input, [index], 1))
         from cte;
    select iif(COALESCE( @output, '') = '', null, ltrim(rtrim(@output)))

For stripping away non-numeric:

declare @input nvarchar(max) = null
select @input = '1a2    3b4' + char(13) + char(10) + '5(678)*90c'
DECLARE @output nvarchar(max) = '';
    WITH cte AS
        (
          SELECT cast(1 as  int) as [index]
            UNION ALL
            SELECT [index]+ 1 as [index]
          from cte
          where [index] < len(@input)  --len(substring(@input, index, 1)) >
        )
        select @output =  iif(PATINDEX('%[0-9]%', substring(@input, [index], 1))= 1, @output + substring(@input, [index], 1), @output)
        from cte;
    select iif(COALESCE( @output, '') = '', null, ltrim(rtrim(@output)))
iakob
  • 309
  • 1
  • 6