This is a question that's been answered before here: How to strip all non-alphabetic characters from string in SQL Server?
I have used it successfully for a previous project, but that was on a mere 100,000 rows. Although not super quick, it did the job.
I now need to work out how to scale this up to cope with hundreds of millions of rows, I have read about inline functions and am wondering if anyone else has managed to crack this.
As an example, It was previously taking 5 minutes to process 500,000 rows as part of a larger query. But now that the data set size has increased to 500,000,000 rows this is now taking almost 3 hours.
The code that was used before in a scalar function was this
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER Function [dbo].[RemoveNonAlphaCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin
Declare @KeepValues as varchar(50)
Set @KeepValues = '%[^a-z0-9]%'
While PatIndex(@KeepValues, @Temp) > 0
Set @Temp = Stuff(@Temp, PatIndex(@KeepValues, @Temp), 1, '')
Return @Temp
End
Any help is much appreciated