0

I am working with SQL Server and have come to encounter a situation for which I have no answer. It is as follows:

Given a string diuuuu, for example, I would like this output diu. The logic behind it is the last occurrance of each digit (d, i, u) in order. Other examples:
diudi = udi
dididi = di

A Regex solution would be greatly appreciated. Thank you!

boski
  • 2,437
  • 1
  • 14
  • 30
  • This is looking for the LAST occurrence of each letter. The Regex logic here would be "ignore all characters where the string has a later character that is the same" – MPost Jun 22 '20 at 14:50
  • 2
    May be a copy of this: https://stackoverflow.com/questions/55160719/get-distinct-characters-from-a-given-string-in-sql-server – jw11432 Jun 22 '20 at 14:52
  • No. That's T-SQL and not Regex, and it gets the first character. – MPost Jun 22 '20 at 14:53
  • 2
    ... SQL Server *uses* T-sQL @MPost , and SQL Server does *not* natively support REGEX. – Thom A Jun 22 '20 at 14:55
  • This post is a little confusing, as the main question references SQL Server but the title references Regex. Not sure which is being asked for here. – MPost Jun 22 '20 at 15:04
  • Likely because the OP *thinks* they need Regex to get the result, @MPost , however, (as I mentioned) SQL Server doesn't natively support Regex. That isn't to say that can't use it, however, they would need to implement a CLR function, so any answer would need to include details of how to create one with the required functionality and enable the instance to make use of it. – Thom A Jun 22 '20 at 15:15

3 Answers3

1

This isn't particularly pretty, but it does do the job:

WITH VTE AS(
    SELECT *
    FROM (VALUES('diudi'),('diuuuu'),('dididi'),('ddddd'))V(YourColumn)),
Ranks AS(
    SELECT V.YourColumn,
           CI.C,
           CI.I,
           RANK() OVER (PARTITION BY V.YourColumn ORDER BY CI.I) AS R
    FROM VTE V
         CROSS APPLY (VALUES(REVERSE(V.YourColumn)))R(YourColumn)
         CROSS APPLY (VALUES('d',CHARINDEX('d',R.YourColumn)),
                            ('i',CHARINDEX('i',R.YourColumn)),
                            ('u',CHARINDEX('u',R.YourColumn)))CI(C,I))
SELECT YourColumn,
       REVERSE(CONCAT(MAX(CASE WHEN R = 1 AND I > 0 THEN C END),MAX(CASE WHEN R = 2 AND I > 0 THEN C END),MAX(CASE WHEN R = 3 AND I > 0 THEN C END)))
FROM Ranks
GROUP BY YourColumn;
Thom A
  • 88,727
  • 11
  • 45
  • 75
1

This should work:

 (.)(?!.*\1)

The groups returned, concatenated together, give you what you want.

MPost
  • 535
  • 2
  • 7
  • How will the OP use that in SQL Server? – Thom A Jun 22 '20 at 15:12
  • The request ended: "A Regex solution would be greatly appreciated. Thank you!" Maybe a CLR DLL is being developed. – MPost Jun 22 '20 at 15:17
  • I would have expected the question to have been tagged with C# then, as the OP should have included the existing code for their CLR function. – Thom A Jun 22 '20 at 15:20
-1

Here's a SQL Answer:

CREATE FUNCTION dbo.udf_LastChars(@str VARCHAR(100))
RETURNS VARCHAR(100)
AS
BEGIN

  DECLARE @result VARCHAR(100)
  DECLARE @char CHAR(1)
  DECLARE @i INT = 1
  WHILE (@i <= LEN(@str))
  BEGIN
    
    SET @char = SUBSTRING(@str, @i, 1)
    SET @result = CONCAT(REPLACE(@result, @char, ''), @char)
    SET @i = @i + 1
  END

  RETURN @result
END
MPost
  • 535
  • 2
  • 7