-1

I have two strings:

'one two three'

and

'two one three'

While they are not the same if I compare them directly the values that they contain separately are the same and this is what it is important for me. So my answer to the comparison will be that the strings are equal. I am looking for more straight forward way to compare them without the need create function to split and compare each separate value. Is there such solution?

I have found this thread in stackoverflow: How to compare if two strings contain the same words in T-SQL for SQL Server 2008?

It does not work for me because I want to make this comparison on the go in a cte where I make other comparisons.

Yana
  • 785
  • 8
  • 23
  • 1
    You'll need to split the strings, with a string splitter, and then ensure that the value exists in the other string (which is also split). Honestly, T-SQL is far from the best choice of language to d this in. – Thom A Apr 16 '21 at 09:15
  • 1
    `I want to make this comparison on the go in a cte` In that case I would recommend creating a User Defined Function that encapsulates the splitting and comparing. Or make a transformation function that splits, sorts and rejoins, then call it once for each string and compare the results of both transforms -> see `dbo.StringSorter` in the link that you posted yourself. This would fall under my definition of straightforward, or still the least cumbersome way. – Peter B Apr 16 '21 at 09:16
  • What if another string was `two three four` or `one two three four` what result do you want – Charlieface Apr 16 '21 at 09:31
  • @Charlieface This would be false. – Yana Apr 16 '21 at 09:33

3 Answers3

1

You have to split the strings because otherwise how do you compare separate parts.

I'm assuming you want to find all matching pairs of items. I've shown this with a self-join from one table, but you could equally do it from two.

This is a question of Relational Division Without Remainder, for which there are a number of solutions.

DECLARE @t table (val varchar(100));

INSERT @t(col) values('one three two'), ('three   two one'), ('one two    three'), (' one two two    three three   ');


SELECT *
FROM @t t1
JOIN @t t2 ON EXISTS (
    SELECT 1
    FROM STRING_SPLIT(t1.val, ' ') s1
    LEFT JOIN STRING_SPLIT(t2.val, ' ') s2 ON s2.value = s1.value
    HAVING COUNT(CASE WHEN s2.value IS NULL THEN 1) = 0
      AND COUNT(*) = (SELECT COUNT(*) FROM STRING_SPLIT(t2.val, ' '))
);

SELECT *
FROM @t t1
JOIN @t t2 ON (
        SELECT STRING_AGG(s1.value, ' ') WITHIN GROUP (ORDER BY s1.value)
        FROM STRING_SPLIT(t1.val, ' ') s1
    ) = (
        SELECT STRING_AGG(s2.value, ' ') WITHIN GROUP (ORDER BY s2.value)
        FROM STRING_SPLIT(t2.val, ' ') s2
    )
);

Charlieface
  • 52,284
  • 6
  • 19
  • 43
1

It is relatively easy to implement by using T-SQL and XQuery. Specifically by using XQuery's quantified expressions.

Here is how it works:

  1. Converting input world list into XML, i.e. tokenization process.
  2. Running quantified expression. Sequential order of words is irrelevant.
  3. Counting number of words in the source and the target.
  4. Outcome of both (#2 and #3) is the final result.

SQL

-- DDL and sample data population, start
DECLARE @tbl TABLE( ID INT IDENTITY PRIMARY KEY, WordList1 VARCHAR(1024), WordList2 VARCHAR(1024));
INSERT INTO @tbl (WordList1, WordList2) VALUES
('one two three', 'two one three'),
('one two three', 'two   one   three  '),
('one two    three', ' one two two    three three');
-- DDL and sample data population, end

DECLARE @Separator CHAR(1) = SPACE(1);

;WITH rs AS
(
    SELECT *
       , TRY_CAST('<root><source><r>' + REPLACE(WordList1, @Separator, '</r><r>') + '</r></source>'
          + '<target><r>' + REPLACE(WordList2, @Separator, '</r><r>') + '</r></target></root>' AS XML) AS xmldata
    FROM @tbl
)
SELECT * 
    , xmldata.value('every $x in /root/source/r[text()]/text()
                satisfies ($x = (/root/target/r[text()]/text())
              and (count(/root/source/r[text()]) eq count(/root/target/r[text()])))', 'BIT') AS result
FROM rs;

Output

+----+-----------------+---------------------------+--------+
| ID |    WordList1    |         WordList2         | result |
+----+-----------------+---------------------------+--------+
|  1 | one two three   | two one three             |      1 |
|  2 | one two three   | two   one   three         |      1 |
|  3 | one two   three |  one two two  three three |      0 |
+----+-----------------+---------------------------+--------+
Yitzhak Khabinsky
  • 18,471
  • 2
  • 15
  • 21
0

--.... > SQL2017

declare @t table(col varchar(400));
insert into @t(col) values('one three two'), ('three   two one'), ('one two    three'), (' one two two    three three   ');

select *, 
(
/*
select string_agg(value, ' ') within group (order by value)
from openjson(concat('["', replace(string_escape( col, 'json' ) , ' ', '","'), '"]'))
where value <> ''
*/
select string_agg(value, ' ') within group (order by value)
from 
(
select distinct value
from openjson(concat('["', replace(string_escape(col, 'json' ) , ' ', '","'), '"]'))
where value <> ''
) as s
) as rearranged
from @t;
lptr
  • 1
  • 2
  • 6
  • 16