Regex to get only last occurrence of each digit in string

Question

I am working with SQL Server and have come to encounter a situation for which I have no answer. It is as follows:

Given a string diuuuu, for example, I would like this output diu. The logic behind it is the last occurrance of each digit (d, i, u) in order. Other examples:
diudi = udi
dididi = di

A Regex solution would be greatly appreciated. Thank you!

This is looking for the LAST occurrence of each letter. The Regex logic here would be "ignore all characters where the string has a later character that is the same" — MPost, Jun 22 '20 at 14:50
May be a copy of this: https://stackoverflow.com/questions/55160719/get-distinct-characters-from-a-given-string-in-sql-server — jw11432, Jun 22 '20 at 14:52
No. That's T-SQL and not Regex, and it gets the first character. — MPost, Jun 22 '20 at 14:53
... SQL Server *uses* T-sQL @MPost , and SQL Server does *not* natively support REGEX. — Thom A, Jun 22 '20 at 14:55
This post is a little confusing, as the main question references SQL Server but the title references Regex. Not sure which is being asked for here. — MPost, Jun 22 '20 at 15:04
Likely because the OP *thinks* they need Regex to get the result, @MPost , however, (as I mentioned) SQL Server doesn't natively support Regex. That isn't to say that can't use it, however, they would need to implement a CLR function, so any answer would need to include details of how to create one with the required functionality and enable the instance to make use of it. — Thom A, Jun 22 '20 at 15:15

score 1 · Answer 1 · answered Jun 22 '20 at 15:02

This isn't particularly pretty, but it does do the job:

WITH VTE AS(
    SELECT *
    FROM (VALUES('diudi'),('diuuuu'),('dididi'),('ddddd'))V(YourColumn)),
Ranks AS(
    SELECT V.YourColumn,
           CI.C,
           CI.I,
           RANK() OVER (PARTITION BY V.YourColumn ORDER BY CI.I) AS R
    FROM VTE V
         CROSS APPLY (VALUES(REVERSE(V.YourColumn)))R(YourColumn)
         CROSS APPLY (VALUES('d',CHARINDEX('d',R.YourColumn)),
                            ('i',CHARINDEX('i',R.YourColumn)),
                            ('u',CHARINDEX('u',R.YourColumn)))CI(C,I))
SELECT YourColumn,
       REVERSE(CONCAT(MAX(CASE WHEN R = 1 AND I > 0 THEN C END),MAX(CASE WHEN R = 2 AND I > 0 THEN C END),MAX(CASE WHEN R = 3 AND I > 0 THEN C END)))
FROM Ranks
GROUP BY YourColumn;

score 1 · Answer 2 · answered Jun 22 '20 at 15:03

1

This should work:

 (.)(?!.*\1)

The groups returned, concatenated together, give you what you want.

answered Jun 22 '20 at 15:03

MPost

535
2
7

How will the OP use that in SQL Server? – Thom A Jun 22 '20 at 15:12
The request ended: "A Regex solution would be greatly appreciated. Thank you!" Maybe a CLR DLL is being developed. – MPost Jun 22 '20 at 15:17
I would have expected the question to have been tagged with C# then, as the OP should have included the existing code for their CLR function. – Thom A Jun 22 '20 at 15:20

score -1 · Answer 3 · answered Jun 22 '20 at 15:26

-1

Here's a SQL Answer:

CREATE FUNCTION dbo.udf_LastChars(@str VARCHAR(100))
RETURNS VARCHAR(100)
AS
BEGIN

  DECLARE @result VARCHAR(100)
  DECLARE @char CHAR(1)
  DECLARE @i INT = 1
  WHILE (@i <= LEN(@str))
  BEGIN
    
    SET @char = SUBSTRING(@str, @i, 1)
    SET @result = CONCAT(REPLACE(@result, @char, ''), @char)
    SET @i = @i + 1
  END

  RETURN @result
END

answered Jun 22 '20 at 15:26

MPost

535
2
7

Yes, I could make this more efficient, but it's easy to use and understand. – MPost Jun 22 '20 at 15:27

Regex to get only last occurrence of each digit in string

3 Answers3