122

I'm looking how to replace/encode text using RegEx based on RegEx settings/params below:

RegEx.IgnoreCase = True     
RegEx.Global = True     
RegEx.Pattern = "[^a-z\d\s.]+"   

I have seen some examples on RegEx, but confused as to how to apply it the same way in SQL Server. Any suggestions would be helpful. Thank you.

Control Freak
  • 12,965
  • 30
  • 94
  • 145
  • 1
    Hi take a look at this article: http://www.codeproject.com/Articles/42764/Regular-Expressions-in-MS-SQL-Server-2005-2008 – Mohsen Jan 31 '12 at 08:06
  • There's also a fine **TSQL** **+ Windows API** solution at [Robyn Page and Phil Factor's](https://www.simple-talk.com/sql/t-sql-programming/tsql-regular-expression-workbench/) that relies on [VBScript.RegExp](https://msdn.microsoft.com/en-us/library/ee236360%28v=vs.84%29.aspx) class, which, I belieave, is shipped on every Windows version since Windows 2000. – Julio Nobre Jun 06 '16 at 09:35
  • If you absolutely positively need RegEx via TSQL, an option for SQL Server 2016 and above is to [use R services](https://stackoverflow.com/questions/194652/sql-server-regular-expressions-in-t-sql/46536661#46536661). – Dave Mason Oct 09 '17 at 23:43

6 Answers6

134

You do not need to interact with managed code, as you can use LIKE:

CREATE TABLE #Sample(Field varchar(50), Result varchar(50))
GO
INSERT INTO #Sample (Field, Result) VALUES ('ABC123 ', 'Do not match')
INSERT INTO #Sample (Field, Result) VALUES ('ABC123.', 'Do not match')
INSERT INTO #Sample (Field, Result) VALUES ('ABC123&', 'Match')
SELECT * FROM #Sample WHERE Field LIKE '%[^a-z0-9 .]%'
GO
DROP TABLE #Sample

As your expression ends with + you can go with '%[^a-z0-9 .][^a-z0-9 .]%'

EDIT:
To make it clear: SQL Server doesn't support regular expressions without managed code. Depending on the situation, the LIKE operator can be an option, but it lacks the flexibility that regular expressions provides.

hs-
  • 176
  • 2
  • 5
  • 21
Rubens Farias
  • 57,174
  • 8
  • 131
  • 162
  • 10
    @MikeYoung, you're right. This answer incorrectly addresses the `+` quantifier as `{1,2}` when it should take it as `{1, }`. Surprisingly, this worked to the OP. – Rubens Farias Apr 28 '15 at 18:43
  • 2
    This won't works in sql server since it doesn't support regex. – VVN Mar 26 '16 at 04:17
  • 16
    @VVN, `LIKE` is not regex (it's a more limited pattern-matching syntax), so the lack of regex support doesn't mean that this won't work. – Charles Duffy Apr 16 '16 at 23:28
  • 2
    @RubensFarias wouldn't it be nice to update the answer in light of comments from @mike-young? – Sudhanshu Mishra Mar 06 '17 at 02:40
16

Regular Expressions In SQL Server Databases Implementation Use

Regular Expression - Description
. Match any one character
* Match any character
+ Match at least one instance of the expression before
^ Start at beginning of line
$ Search at end of line
< Match only if word starts at this point
> Match only if word stops at this point
\n Match a line break
[] Match any character within the brackets
[^...] Matches any character not listed after the ^
[ABQ]% The string must begin with either the letters A, B, or Q and can be of any length
[AB][CD]% The string must have a length of two or more and which must begin with A or B and have C or D as the second character
[A-Z]% The string can be of any length and must begin with any letter from A to Z
[A-Z0-9]% The string can be of any length and must start with any letter from A to Z or numeral from 0 to 9
[^A-C]% The string can be of any length but cannot begin with the letters A to C
%[A-Z] The string can be of any length and must end with any of the letters from A to Z
%[%$#@]% The string can be of any length and must contain at least one of the special characters enclosed in the bracket

Ravi Makwana
  • 2,782
  • 1
  • 29
  • 41
  • 6
    I am confused by this post. Not all of the above are supported by SQL Server LIKE. But some of them are. Is this a mixture of LIKE expressions and standard regex regular expressions? e.g. < and > and ^ and $ for start and end of line. – Zeek2 Sep 08 '21 at 11:07
  • 8
    @Zeek2 this answer is misleading. `LIKE` (or `PATINDEX` for that matter) only supports the wildcard matches listed [in the docs](https://learn.microsoft.com/en-us/sql/t-sql/language-elements/like-transact-sql?view=sql-server-ver15#arguments), which are just `%, [], [^], _`. – OfirD Dec 26 '21 at 21:30
  • This is very misleading and unclear. Please clarify where these apply, and what restrictions there are. – Andrew Knackstedt Jul 24 '23 at 18:19
  • I have just explaining how Regular Expression use each expression. – Ravi Makwana Jul 25 '23 at 15:09
13

You will have to build a CLR procedure that provides regex functionality, as this article illustrates.

Their example function uses VB.NET:

Imports System
Imports System.Data.Sql
Imports Microsoft.SqlServer.Server
Imports System.Data.SqlTypes
Imports System.Runtime.InteropServices
Imports System.Text.RegularExpressions
Imports System.Collections 'the IEnumerable interface is here  


Namespace SimpleTalk.Phil.Factor
    Public Class RegularExpressionFunctions
        'RegExIsMatch function
        <SqlFunction(IsDeterministic:=True, IsPrecise:=True)> _
        Public Shared Function RegExIsMatch( _
                                            ByVal pattern As SqlString, _
                                            ByVal input As SqlString, _
                                            ByVal Options As SqlInt32) As SqlBoolean
            If (input.IsNull OrElse pattern.IsNull) Then
                Return SqlBoolean.False
            End If
            Dim RegExOption As New System.Text.RegularExpressions.RegExOptions
            RegExOption = Options
            Return RegEx.IsMatch(input.Value, pattern.Value, RegExOption)
        End Function
    End Class      ' 
End Namespace

...and is installed in SQL Server using the following SQL (replacing '%'-delimted variables by their actual equivalents:

sp_configure 'clr enabled', 1
RECONFIGURE WITH OVERRIDE

IF EXISTS ( SELECT   1
            FROM     sys.objects
            WHERE    object_id = OBJECT_ID(N'dbo.RegExIsMatch') ) 
   DROP FUNCTION dbo.RegExIsMatch
go

IF EXISTS ( SELECT   1
            FROM     sys.assemblies asms
            WHERE    asms.name = N'RegExFunction ' ) 
   DROP ASSEMBLY [RegExFunction]

CREATE ASSEMBLY RegExFunction 
           FROM '%FILE%'
GO

CREATE FUNCTION RegExIsMatch
   (
    @Pattern NVARCHAR(4000),
    @Input NVARCHAR(MAX),
    @Options int
   )
RETURNS BIT
AS EXTERNAL NAME 
   RegExFunction.[SimpleTalk.Phil.Factor.RegularExpressionFunctions].RegExIsMatch
GO

--a few tests
---Is this card a valid credit card?
SELECT dbo.RegExIsMatch ('^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|6(?:011|5[0-9][0-9])[0-9]{12}|3[47][0-9]{13}|3(?:0[0-5]|[68][0-9])[0-9]{11}|(?:2131|1800|35\d{3})\d{11})$','4241825283987487',1)
--is there a number in this string
SELECT dbo.RegExIsMatch( '\d','there is 1 thing I hate',1)
--Verifies number Returns 1
DECLARE @pattern VARCHAR(255)
SELECT @pattern ='[a-zA-Z0-9]\d{2}[a-zA-Z0-9](-\d{3}){2}[A-Za-z0-9]'
SELECT  dbo.RegExIsMatch (@pattern, '1298-673-4192',1),
        dbo.RegExIsMatch (@pattern,'A08Z-931-468A',1),
        dbo.RegExIsMatch (@pattern,'[A90-123-129X',1),
        dbo.RegExIsMatch (@pattern,'12345-KKA-1230',1),
        dbo.RegExIsMatch (@pattern,'0919-2893-1256',1)
mwigdahl
  • 16,268
  • 7
  • 50
  • 64
  • This is in Classic ASP, does it support? I think CLR is only for .NET functions, right? – Control Freak Jan 19 '12 at 15:12
  • 4
    CLR procedures are installed into the SQL Server environment and can be invoked like any other stored procedure or user-defined function, so if Classic ASP can invoke a stored procedure or user-defined function, it can invoke a CLR procedure. – mwigdahl Jan 19 '12 at 15:14
  • 1
    While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - [From Review](/review/low-quality-posts/20902268) – Federico klez Culloca Sep 19 '18 at 10:38
  • Thanks @FedericoklezCulloca. This was an old answer and I've updated it accordingly. – mwigdahl Sep 19 '18 at 17:25
  • @mwigdahl thanks for that. I see it's old, but it popped up in a review queue :) – Federico klez Culloca Sep 20 '18 at 06:48
12

Slightly modified version of Julio's answer.

-- MS SQL using VBScript Regex
-- select dbo.RegexReplace('aa bb cc','($1) ($2) ($3)','([^\s]*)\s*([^\s]*)\s*([^\s]*)')
-- $$ dollar sign, $1 - $9 back references, $& whole match

CREATE FUNCTION [dbo].[RegexReplace]
(   -- these match exactly the parameters of RegExp
    @searchstring varchar(4000),
    @replacestring varchar(4000),
    @pattern varchar(4000)
)
RETURNS varchar(4000)
AS
BEGIN
    declare @objRegexExp int, 
        @objErrorObj int,
        @strErrorMessage varchar(255),
        @res int,
        @result varchar(4000)

    if( @searchstring is null or len(ltrim(rtrim(@searchstring))) = 0) return null
    set @result=''
    exec @res=sp_OACreate 'VBScript.RegExp', @objRegexExp out
    if( @res <> 0) return '..VBScript did not initialize'
    exec @res=sp_OASetProperty @objRegexExp, 'Pattern', @pattern
    if( @res <> 0) return '..Pattern property set failed'
    exec @res=sp_OASetProperty @objRegexExp, 'IgnoreCase', 0
    if( @res <> 0) return '..IgnoreCase option failed'
    exec @res=sp_OAMethod @objRegexExp, 'Replace', @result OUT,
         @searchstring, @replacestring
    if( @res <> 0) return '..Bad search string'
    exec @res=sp_OADestroy @objRegexExp
    return @result
END

You'll need Ole Automation Procedures turned on in SQL:

exec sp_configure 'show advanced options',1; 
go
reconfigure; 
go
sp_configure 'Ole Automation Procedures', 1; 
go
reconfigure; 
go
sp_configure 'show advanced options',0; 
go
reconfigure;
go
Zachary Scott
  • 20,968
  • 35
  • 123
  • 205
  • 2
    BTW, it's much faster to destroy and recreate the regex object than to cache and reuse it. We ran 10,000 comparisons with significantly higher numbers reusing the object. – Zachary Scott Jul 19 '16 at 15:26
6
SELECT * from SOME_TABLE where NAME like '%[^A-Z]%'

Or some other expression instead of A-Z

2

A similar approach to @mwigdahl's answer, you can also implement a .NET CLR in C#, with code such as;

using System.Data.SqlTypes;
using RX = System.Text.RegularExpressions;

public partial class UserDefinedFunctions
{
 [Microsoft.SqlServer.Server.SqlFunction]
 public static SqlString Regex(string input, string regex)
 {
  var match = RX.Regex.Match(input, regex).Groups[1].Value;
  return new SqlString (match);
 }
}

Installation instructions can be found here

Fiach Reid
  • 6,149
  • 2
  • 30
  • 34