SQL query regex

Question

I have TableA with ColumnA. ColumnA is VARCHAR that contains values like:

#1002#15#
#1002#16#
#1003#17#
#1003#17#16#
#1004#18#
#1004#18#3#
#1004#18#3#1155#

I need to find repeating numbers in ColumnA, .e.g:

#1002#15#1002#
#1004#18#3#1004#
#1003#17#17#

I tried number of LIKE patterns as described here: https://technet.microsoft.com/en-us/library/ms187489(v=sql.105).aspx

to no avail so far primarily because those repeating digits can be 2-5 chars long..

Sql server does not support regular expressions outside of a CRL function. You better normalize your database and change that columnA to a table instead of [keeping delimited data in a single column.](http://stackoverflow.com/questions/3653462/is-storing-a-delimited-list-in-a-database-column-really-that-bad) — Zohar Peled, May 10 '16 at 08:06
ugh. thank you. the problem is that I'm not owning the structure of the table, so, I need to query what I have in there...cannot change the structure. hm. so, you're essentially saying there is no good way of doing it in SQL server. perhaps I can dump it out to a separate table because it's one time exercise of finding dup numeric values there... — SGM, May 10 '16 at 08:11
You can use iTVF splitter function, it's fast. See detailes http://www.sqlservercentral.com/articles/Tally+Table/72993/ If you have no CREATE permissions, just use it as cte — Serg, May 10 '16 at 08:26

score 4 · Accepted Answer · answered May 10 '16 at 08:23

Since normalizing the database is impossible in your case, you need a different approach.
One option is to use CROSS APPLY to a string splitting function that will take the content of your column and split it to rows, thus enabling you to to query the content of the column.

For this demonstration I've chosen the SplitStrings_XML from Aaron's article:

CREATE FUNCTION dbo.SplitStrings_XML
(
   @List       NVARCHAR(MAX),
   @Delimiter  NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
   RETURN 
   (  
      SELECT Item = y.i.value('(./text())[1]', 'nvarchar(4000)')
      FROM 
      ( 
        SELECT x = CONVERT(XML, '<i>' 
          + REPLACE(@List, @Delimiter, '</i><i>') 
          + '</i>').query('.')
      ) AS a CROSS APPLY x.nodes('i') AS y(i)
   );
GO

Now, create and populate the sample table:

DECLARE @TableA as table
(
    id int identity(1,1),
    ColumnA varchar(100)
)

INSERT INTO @TableA VALUES
('#1002#15#'),
('#1002#16#'),
('#1003#17#'),
('#1003#17#16#'),
('#1004#18#'),
('#1004#18#3#'),
('#1004#18#3#1155#'),
('#1002#15#1002#'),
('#1004#18#3#1004#'),
('#1003#17#17#')

That makes the query quite simple:

SELECT id, ColumnA
FROM @TableA 
CROSS APPLY dbo.SplitStrings_XML(ColumnA, '#')
WHERE Item IS NOT NULL
GROUP BY id, ColumnA
HAVING COUNT(Item) > COUNT(DISTINCT Item)

Results:

id          ColumnA
----------- -----------------
8           #1002#15#1002#
9           #1004#18#3#1004#
10          #1003#17#17#

SQL query regex

1 Answers1