0

There are 2 strings. Those strings may have differences. I need to return the different values "Difference" and the "Position" of those values which are different.

How to find diff between two string in SQL

The above post shows something similar, however, my string isn't delimited by anything so I'm having trouble applying that method. Both strings will always be 24 characters long. However, the differences will vary, so I can't just compare position 1 of String1 to position 1 of String2.

Ideally, representing the variance values less anything which is the same would be better considering that each position has a meaning. However, getting me to just show the difference would be super helpful.

enter image description here

Nahuatl_C137
  • 132
  • 1
  • 13
  • . . SQL is so not designed for this sort of thing. If you had to, you could use a recursive CTE, but it would be messy. – Gordon Linoff Mar 11 '20 at 18:22

1 Answers1

2

This is ugly, but...

Firstly, get yourself a copy of NGrams8K. Then you can do something like this:

DECLARE @String1 varchar(8000) = 'abcd10234619843ab13',
        @String2 varchar(8000) = 'bbcd10234619843ac14';
WITH C AS(
    SELECT @String1 AS String1,
           @String2 AS String2,
           S1.[position],
           S1.token AS C1,
           S2.token AS C2
    FROM dbo.NGrams8k(@String1,1) S1
         JOIN dbo.NGrams8k(@String2,1) S2 ON S1.[position] = S2.position)
SELECT (SELECT '' + C.C2
        FROM C
        WHERE C.C1 != C.C2
        ORDER BY C.[position]
        FOR XML PATH(''),TYPE).value('.','varchar(8000)') AS Difference,
       (SELECT ISNULL(NULLIF(C.C2,C.C1),'-')
        FROM C
        ORDER BY C.[position]
        FOR XML PATH(''),TYPE).value('.','varchar(8000)') AS Ideal,
       STUFF((SELECT CONCAT(',',C.[position])
              FROM C
              WHERE C.C1 != C.C2
              ORDER BY C.[position]
              FOR XML PATH(''),TYPE).value('.','varchar(8000)'),1,1,'') AS Position;

If you're on a more recent, and supported version of SQL Server, this is actually much easier, with only one scan of the values needed:

DECLARE @String1 varchar(8000) = 'abcd10234619843ab13',
        @String2 varchar(8000) = 'bbcd10234619843ac14';
WITH C AS(
    SELECT @String1 AS String1,
           @String2 AS String2,
           S1.[position],
           S1.token AS C1,
           S2.token AS C2
    FROM dbo.NGrams8k(@String1,1) S1
         JOIN dbo.NGrams8k(@String2,1) S2 ON S1.[position] = S2.position)
SELECT STRING_AGG(NULLIF(C.C2,C.C1),'') WITHIN GROUP (ORDER BY C.position) AS Difference,
       STRING_AGG(ISNULL(NULLIF(C.C2,C.C1),'-'),'') WITHIN GROUP (ORDER BY C.position) AS Ideal,
       STRING_AGG(CASE C.C1 WHEN C.C2 THEN NULL ELSE C.[position] END,',') WITHIN GROUP (ORDER BY C.position) AS Position
FROM C
Thom A
  • 88,727
  • 11
  • 45
  • 75