
I have a string like this
,x,x,y,x,x,O,x,y
that matches its values in another string like this
0~1~b~~z~XY~1~7.
The value "O" can switch its position in the string, as will the matching value in the other string.
The value "O" is in the 6th position, so the expected result would be "XY".
The first string always has a "," before each value. The second string starts with the value followed by "~".
"O" is a given value which will not change, so I always have to find the value at the position of "O" in the second string.

This is what I am expecting:

,x,x,y,x,x,O,x,y
0~1~b~~z~XY~1~7

O=XY


,x,O,y,x,x,y,x,y
0~1~b~~z~XY~1~7

O=1

Thank you.

Johnny Spindler
  • Your first string keeps changing. So how did you store the string in your table? Did you store the different strings like [,x,x,y,x,x, O,x,y] or [,x,O,y,x,x,y,x,y] on different rows? – Gen Wan Jun 14 '19 at 16:06
  • Why in the first example is O=XY? Shouldn't it be O=1, the sixth value from the second string? Or are you missing something here: b~~z? – Valerica Jun 14 '19 at 16:08
  • There are different strings in different rows. ~~ would have been an empty string, so that's why O=XY. The post from Alan Burstein takes care of all that. Thank you all! – Johnny Spindler Jun 17 '19 at 08:47

2 Answers


Grab a copy of DelimitedSplit8K then you can do this:

DECLARE @string1 VARCHAR(1000) = ',x,x,y,x,x,O,x,y',
        @string2 VARCHAR(1000) = '0~1~b~~z~XY~1~7';

DECLARE @search VARCHAR(1000) = 'O'; -- best as a variable/parameter

SELECT *
FROM dbo.delimitedSplit8K(@string2,'~') AS s
WHERE s.itemNumber = 
(
  SELECT TOP (1) s2.itemNumber -- TOP (1) until we know about duplicates
  FROM   dbo.delimitedSplit8K(@string1,',') AS s2
  WHERE  s2.item = @search
)-1;

Returns:

ItemNumber           Item
-------------------- -------
6                    XY
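If it helps, the same query should also cover the question's second sample. A quick sketch (same query, different input), assuming `dbo.delimitedSplit8K` is installed:

```sql
-- Second sample from the question: "O" is now the 2nd value.
-- The leading comma in @string1 yields an empty first item, so "O" is
-- item 3 of the split and the -1 correction maps it to item 2 of @string2.
DECLARE @string1 VARCHAR(1000) = ',x,O,y,x,x,y,x,y',
        @string2 VARCHAR(1000) = '0~1~b~~z~XY~1~7',
        @search  VARCHAR(1000) = 'O';

SELECT *
FROM dbo.delimitedSplit8K(@string2,'~') AS s
WHERE s.itemNumber =
(
  SELECT TOP (1) s2.itemNumber
  FROM   dbo.delimitedSplit8K(@string1,',') AS s2
  WHERE  s2.item = @search
)-1;
-- Expected: ItemNumber 2, Item 1
```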
Alan Burstein
    Hi Alan, as you are always very interested in performance issues, you might want to look into my answer. I did not expect how badly the XML approach is scaling... +1 for this good and fast answer – Shnugo Jun 17 '19 at 08:57
  • Hi Alan, to add to my comment above: A solution based on `OPENJSON` is much faster than your splitter solution actually... – Shnugo Jun 17 '19 at 10:46

This is a JSON based solution:

Edit: fastest approach (needs v2016+) -> removed the XQuery-based solution

DECLARE @string1 VARCHAR(1000) = ',x,x,x,x,x,O,x,y',
        @string2 VARCHAR(1000) = '0~1~b~~z~XY~1~7';

SELECT s2.[value]
FROM OPENJSON('["' + REPLACE(@string1,',','","') + '"]') AS s1
CROSS APPLY OPENJSON('["' + REPLACE(@string2,'~','","') + '"]') AS s2 
WHERE s1.[value]='O' AND s2.[key]=s1.[key]-1;

The idea in short: by replacing the delimiters we transform the delimited strings into JSON arrays, then use JSON's key/value output to find the position of "O" in the first array and fetch the element at that position from the second one.
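To see what the REPLACE trick produces, here is a minimal sketch (using the first sample string from the question) that just prints the intermediate JSON:

```sql
-- Minimal sketch: the intermediate JSON array built from the first sample string.
DECLARE @string1 VARCHAR(1000) = ',x,x,y,x,x,O,x,y';

SELECT '["' + REPLACE(@string1,',','","') + '"]' AS AsJson;
-- ["","x","x","y","x","x","O","x","y"]
-- Note the empty first element caused by the leading comma; that is why
-- the query above compensates with s1.[key]-1 when indexing the second array.
```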

UPDATE

I did some performance tests and found that the XQuery solution performs rather badly. With a small string of at most 10 elements this is okay, but with longer strings (tested with 100 elements) it is rather slow...

The test code, if you are interested:

DECLARE @i INT=1
       ,@CountElements INT=5 --Use higher values here
       ,@CountRows INT=500;

DECLARE @tbl TABLE(ID INT IDENTITY, i1t90 INT, s1 VARCHAR(500),s2 VARCHAR(500));

DECLARE @TemplateString VARCHAR(1000);
WHILE @i<@CountElements
BEGIN
    SET @TemplateString = CONCAT(@TemplateString,@i,'~');
    SET @i=@i+1;
END

SET @i=1;
WHILE @i<@CountRows
BEGIN
    INSERT INTO @tbl(i1t90, s1,s2) 
    SELECT
     ISNULL(NULLIF(@i%@CountElements,0),@CountElements)
    ,STUFF(REPLICATE(',x',@CountElements),(ISNULL(NULLIF(@i%@CountElements,0),@CountElements))*2,1,'O')
    ,CONCAT(@TemplateString,@i)

    SET @i=@i+1;
END

DECLARE @d DATETIME2=SYSUTCDATETIME();

SELECT t.*
      ,B.PosO
      ,A.y.value('(/y[sql:column("B.PosO")]/text())[1]','nvarchar(100)') AS FoundValue
INTO #t1
FROM @tbl t 
CROSS APPLY
(       
    SELECT (SELECT CAST('<x>' + REPLACE(s1,',','</x><x>') + '</x>' AS XML))
          ,(SELECT CAST('<y>' + REPLACE(s2,'~','</y><y>') + '</y>' AS XML))) AS A(x,y)
CROSS APPLY(SELECT A.x.value('count(/x[. << (/x[text()="O"])[1]])','int')) B(PosO);

SELECT 'XML based new', DATEDIFF(MILLISECOND,@d,SYSUTCDATETIME());

SET @d=SYSUTCDATETIME();

SELECT *
INTO #t2
FROM @tbl
CROSS APPLY dbo.delimitedSplit8K(s2,'~') AS s
WHERE s.itemNumber = 
(
  SELECT TOP (1) s1.itemNumber -- TOP (1) until we know about duplicates
  FROM   dbo.delimitedSplit8K(s1,',') AS s1
  WHERE  s1.item = 'O'
)-1;

SELECT 'Splitter based',DATEDIFF(MILLISECOND,@d,SYSUTCDATETIME());

SELECT * FROM #t1;
SELECT * FROM #t2;

DROP TABLE #t1;
DROP TABLE #t2;

The splitter-based approach is about 8 times faster in my tests...

UPDATE 2: JSON-splitter (needs v2016+)

This approach is about 5 times faster than the splitter-based approach:

SELECT t.*
      ,s2.[key] +1 AS PosO --zero based index
      ,s2.[value] AS FoundValue  
INTO #t3
FROM @tbl t
CROSS APPLY OPENJSON('["' + REPLACE(s1,',','","') + '"]') AS s1
CROSS APPLY OPENJSON('["' + REPLACE(s2,'~','","') + '"]') AS s2 
WHERE s1.[value]='O' AND s2.[key]=s1.[key]-1;
Shnugo