3

Background

Currently I have a function to get only numbers (with no leading zeroes) from a string (let's call the two variables @alphaNumeric and @strippedNumber).

It turns out that it is more efficient to have @strippedNumber pre-calculated on the table that I'm querying from than to calculate it on the fly using my function (let's call this dbo.fnFormatNumeric).

Setting up my computed column I tried it as follows:

ALTER TABLE dbo.Data
ADD StrippedNumber AS CONVERT(BIGINT, dbo.fnFormatNumeric(AlphaNumeric))

This is really efficient at doing the conversion of large amounts of data (in the order of 25,000,000+ rows).

The problem comes when trying to upgrade my database (we drop and recreate all functions). So I tried to exclude that function from being dropped or recreated.

Mistake. Now I can't install a new database because it's reliant on a function that hasn't been created.

Ideal scenario

I would like to be able to select all numbers from a string as a computed column on my table without using a function.

Research

Sql Authority Blog suggests a function for use to get numeric values, which this answer on SO references. This is no good as a function causes me problems.

Another answer on SO suggests using LEFT and PATINDEX but this will only get me the first set of numbers and not all of them in the string.

This question on SO is no good as it uses ISNUMERIC to only get strings where all of it is numeric.

I found a blog post where they use PATINDEX to get the numbers from a string, but this is assuming that all the numbers are together.

Inputs and expected outputs:

@alphaNumeric  =>  @strippedNumber
-----------------------------------
FXADJ011016CR  =>   11016
15-June-2016   =>   152016
708014         =>   708014
FXRWECTB       =>   (empty string)

Existing function:

Input @alphaNumeric varchar(255)    

    DECLARE     @strtoCheckLength       Int,
                @strCount               Int,
                @code                   Int,
                @StrippedNumber varchar(255)
    -- Get and set length and loop variables
    Set @strtoCheckLength = Len(@alphaNumeric)
    Set @strCount = (0)
    Set @StrippedNumber = ''
    -- Make sure we only include Numerics
    While @strCount <= @strtoCheckLength
        Begin
            set @code = Ascii(SubString(@alphaNumeric, @strCount, 1))
            If (@code between 48 and 57) 
                Begin
                    set @StrippedNumber = @StrippedNumber + Substring(@alphaNumeric, @strCount, 1)
                End
            Set @strCount = (@strCount + 1)
        End
    -- Remove Leading Zeros
    While (Len(@StrippedNumber) > 0) And (Left(@StrippedNumber, 1) = '0')
        Begin
            Set @StrippedNumber = Right(@StrippedNumber, Len(@StrippedNumber) - 1)
        End
    Return @StrippedNumber
Community
  • 1
  • 1
Ilessa
  • 602
  • 8
  • 27
  • what version of SQL Server are you using? – Tanner Apr 11 '17 at 09:25
  • 3
    You wrote: *Now I can't install a new database because it's reliant on a function that hasn't been created.*. What does your *install a new database* process look like? This feels like it should not be such a big problem to overcome. – user1429080 Apr 11 '17 at 09:31
  • @Tanner we support 2008R2 onwards – Ilessa Apr 11 '17 at 09:38
  • @user1429080 Yes, in theory I could add the function before adding that particular table - but that is a really ugly solution. Install process goes tables -> stored procs -> functions – Ilessa Apr 11 '17 at 09:39
  • 2
    I think the function is the best _(and possibly only)_ way you can achieve this. So you may need to change your install process to ensure dependencies are followed when installing. – Tanner Apr 11 '17 at 09:41
  • Use a DACPAC rather than trying to work around your own inflexible installation process; it has built-in logic for creating objects before their dependencies. – Jeroen Mostert Apr 11 '17 at 09:47
  • I don't think a *working* process is *ugly*. Also with normal tables you have dependencies when you need foreign keys. Tables must exist before foreign keys can be created. Anyway, you could perhaps create a separate schema where you put "prerequsites" for the main schema. Then create the support schema and its objects first. – user1429080 Apr 11 '17 at 09:48
  • 1
    @Tanner after investigating other ways this afternoon to try and avoid having the function as the computed column, other methods actually result in our queries being way too slow to *not* have the function. So I guess we will revise the install process to add those first. Thanks all for the help and learnings. – Ilessa Apr 11 '17 at 15:03

4 Answers4

2

Hopefully this snippet will help you define your computed column.

Essentially, the numbers table is used to break up the string by indexing, then the query casts the output of the concatenation to bigint to remove the leading zeroes before recasting to a varchar(255)

In the production database I would advocate having the numbers table pre-defined and populated. There is a lot of discussion around how to use them on www.SqlServerCentral.com

--set up a numbers table
select top 50 n=identity(int,1,1) into #temp_numbers from syscolumns

/*
FXADJ011016CR => 11016

15June2016 => 152016

708014 => 708014

FXRWECTB => Empty String
*/

declare @input varchar(50) = 'FXADJ011016CR'
declare @output varchar(50)

select isnull(cast(cast((
    select numericValue as [text()] from (
        select substring(@input, t.n, 1) as numericValue 
            from #temp_numbers t 
            where isnumeric(substring(@input, t.n, 1))=1
        ) tblResults for xml path('')
    ) as bigint) as varchar(255)),'')

drop table #temp_numbers
Duncan Howe
  • 2,965
  • 19
  • 18
  • Note: this hasn't been tested thoroughly. I'll leave that to you – Duncan Howe Apr 11 '17 at 10:23
  • 2
    Looks promising but if we have a persisted numbers table then doesn't the snippet violate the condition that "You cannot directly reference other tables in computed columns." ? – Ilessa Apr 11 '17 at 13:05
  • 1
    Even if you eliminate the numbers table, any attempt to implement this as a computed column will fail, as it would violate the condition that "Subqueries are not allowed in this context. Only scalar expressions are allowed." (error 1046) – Ed Harper Apr 11 '17 at 14:26
  • Yes - although it has helped execute this in a slightly different manner - not as a computed column but as a temporary table elsewhere – Ilessa Apr 11 '17 at 14:44
  • 2
    Unfortunately IsNumeric will also pick up various special characters - so for example a dash in the string will cause it to fail. – RTPeat Apr 11 '17 at 14:45
  • 1
    I knew there were limitations to the IsNumeric function but couldn't recall them at the time. Maybe changing the where clause to something like `substring(@input, t.n, 1) like '[0-9]'` would be more robust – Duncan Howe Apr 11 '17 at 14:55
  • Yes I ended up modifying it slightly using `like '[0-9]'` – Ilessa Apr 11 '17 at 14:59
  • @Ed Harper I shall make a note of revising the constraints of computed columns. I haven't done much with them personally, and hadn't tried my solution in that context. It's on my list of SQL things to learn – Duncan Howe Apr 11 '17 at 15:00
  • Cool - glad it helped you out. Apologies that it made you change your approach. – Duncan Howe Apr 11 '17 at 15:01
  • @DuncanHowe no worries, it gave us some food for thought and helped us weigh up the various options, Thanks – Ilessa Apr 11 '17 at 15:07
0

You could use this

DECLARE @SampleData AS TABLE
(
   Value varchar(100)
)
INSERT INTO @SampleData
VALUES  ('FXADJ011016CR'),('15June2016'), 
      ('708014 '), ('FXRWECTB ')

DECLARE @RegexNonNumber VARCHAR(30) = '%[^0-9]%'

;WHILE (EXISTS (SELECT 1 FROM @SampleData sd WHERE PatIndex(@RegexNonNumber, sd.[Value]) > 0))
BEGIN
    UPDATE @SampleData
    SET
       [Value] = Stuff([Value], PatIndex(@RegexNonNumber, [Value]), 1, '')
    WHERE PatIndex(@RegexNonNumber, [Value]) > 0

END

SELECT * FROM @SampleData sd

Demo link: Rextester

TriV
  • 5,118
  • 2
  • 10
  • 18
0

Use Replace functions for replacing the characters with empty values.

as next:-

select cast(replace(replace(replace(replace(replace(replace(replace
(replace(replace(replace(replace(replace(replace(replace(replace
(replace(replace(replace(replace(replace(replace(replace(replace
(replace(replace(replace(replace(replace(replace
(@str,'A',''),'B',''),'C',''),'D',''),'E',''),'F',''),'G',''),'H',''),'I',''),'J','')
,'K',''),'L',''),'M',''),'N',''),'O',''),'P',''),'Q',''),'R',''),'S',''),'T','')
,'U',''),'V',''),'W',''),'X',''),'Y',''),'Z',''),'$',''),',',''),' ','') as bigint)

Note:-

This solution is ugly and so bad with performance, but I posted it as the last solution and hope getting a better than it form the others answers.

ahmed abdelqader
  • 3,409
  • 17
  • 36
  • 1
    That would work for strings with only basic characters, but as a solution it rapidly gets unwieldy if the string contains unicode characters. – RTPeat Apr 11 '17 at 09:59
0

If you have 25,000,000 records, then I would suggest that it's better to Denormalize the table.

Add a standard bigint column.

ALTER TABLE dbo.Data
ADD StrippedNumber AS BIGINT NULL

And UPDATE the column with data.

UPDATE dbo.Data SET StrippedNumber = CONVERT(BIGINT, dbo.fnFormatNumeric(AlphaNumeric))

The only downside to this solution is that all inserts and updates must manually update the StrippedNumber column with data from AlphaNumeric column.

bastos.sergio
  • 6,684
  • 4
  • 26
  • 36