0

Query:

DECLARE @BaseStringValues VARCHAR(MAX), 
        @ParamValues VARCHAR(MAX), 
        @UpdatedStringValues VARCHAR(MAX)

SET @BaseStringValues = '12W4,23FE,EU3E,GH56,56Y8,34W2,QWE4'
SET @ParamValues = 'QWE4,EU3E' -- to remove

How can I query in SQL Server to get the output like this:

@UpdatedStringValues - '12W4,23FE,GH56,56Y8,34W2'

In few articles I checked, got advised to use the Replace function, but not sure how to use the replace function in this scenario.

Thanks in advance

Dale K
  • 25,246
  • 15
  • 42
  • 71
Earth
  • 3,477
  • 6
  • 37
  • 78
  • 2
    Please always specify the version(s) of SQL Server you need to support, since solutions can vary, and having to cater to different solutions uses up valuable time of volunteers offering to help you solve your problem. – Aaron Bertrand Aug 26 '23 at 17:55

3 Answers3

5

Easiest way would be to split them both into tabular formats, use an anti semi join and then re-aggregate them - Fiddle.

SELECT STRING_AGG(bs.value, ',')
FROM   STRING_SPLIT(@BaseStringValues, ',') bs
WHERE  NOT EXISTS (SELECT *
                   FROM   STRING_SPLIT(@ParamValues, ',') pv
                   WHERE  pv.value = bs.value);

But if you used a table valued parameter rather than comma delimited lists this saves having to get SQL to convert them into a tabular format first.

Martin Smith
  • 438,706
  • 87
  • 741
  • 845
  • Thanks, I have also tried with _Left Join_ as `SELECT string_agg(bs.value, ',') FROM string_split(@BaseStringValues, ',') bs LEFT JOIN string_split (@ParamValues, ',') pv on pv.value = bs.value WHERE pv.value is null`. Which option(`NOT EXISTS with Inner Query` or `Left Join`) is fast in terms of execution. – Earth Aug 26 '23 at 17:50
  • 2
    @Earth [race your horses](https://ericlippert.com/2012/12/17/performance-rant/) and also see [this](https://sqlperformance.com/2012/12/t-sql-queries/left-anti-semi-join). – Aaron Bertrand Aug 26 '23 at 17:51
  • I prefer `NOT EXISTS` in general rather than doing the outer join and then filtering down a greater row count after the join. Both execution plans use nested loops and end up re-doing the string splitting of `@ParamValues` multiple times though. So if you care about performance you might prefer to materialise the result of at least one of that into a table variable or temp table - or just use a TVP instead of splitting – Martin Smith Aug 26 '23 at 17:57
  • @MartinSmith Thanks, I am going to use a TVP. With respect to TVP, whether `NOT EXISTS` is faster compared to `Join` in terms of execution. Thanks in advance for me to choose a best approach in better execution time. – Earth Aug 26 '23 at 18:03
  • 2
    @Earth Again, [race your _own_ horses](https://ericlippert.com/2012/12/17/performance-rant/). There is no answer to "which is fastest" that will always be true. – Aaron Bertrand Aug 26 '23 at 18:04
3

Martin's answer is perfect if you don't need an explicit guarantee that the string will get reassembled in the same order as it arrived (while you should generally observe that to be the case, that's no guarantee). If you do need that guarantee, then on SQL Server 2022+ (or Azure SQL Database/MI), you can use the new enable_ordinal argument of STRING_SPLIT:

SELECT @BaseStringValues = STRING_AGG(value, ',') 
                           WITHIN GROUP (ORDER BY ordinal)
  FROM
  (
    SELECT value, ordinal
      FROM STRING_SPLIT(@BaseStringValues, ',', 1) -- order matters
  ) AS x
  WHERE NOT EXISTS
  (
    SELECT 1 
      FROM STRING_SPLIT(@ParamValues, ',', 0) -- order doesn't matter
      WHERE value = x.value
  );

If you are on SQL Server 2017 or better, you can use OPENJSON:

SET @BaseStringValues = '["' + REPLACE(
  STRING_ESCAPE(@BaseStringValues, 'json'), ',','","') + '"]';

SELECT @BaseStringValues = STRING_AGG(value, ',')
                           WITHIN GROUP (ORDER BY [key])
  FROM OPENJSON(@BaseStringValues) AS x
  WHERE NOT EXISTS 
  (
    SELECT 1 FROM STRING_SPLIT(@ParamValues, ',')
    WHERE value = x.value
  );

If you are on SQL Server 2016 or lower, you are running an unsupported version and will have to resort to really ugly FOR XML PATH aggregation, which I talked about here, and an order-observing table-valued function to perform the split, which I talked about here.

  • db<>fiddle - now with an older-than-2017 version too.
Aaron Bertrand
  • 272,866
  • 37
  • 466
  • 490
  • In practice I would very surprised to see a Distribute Streams behind a `STRING_SPLIT` operator in the above query, no matter how big the string was, or for the anti-join to be anything other than `Nested Loops`, ergo it's impossible for the result to be out of order https://dbfiddle.uk/PwUGjK2L – Charlieface Aug 27 '23 at 01:04
  • @Charlieface I'd be surprised too, but not guaranteed is not guaranteed. If you need the guarantee, why wouldn't you do it? – Aaron Bertrand Aug 27 '23 at 03:10
  • @Charlieface And for a scalar, you're almost certainly correct. But then if you apply the more basic, non-order-guaranteeing approach to a _table_, it's not hard to generate [a query plan](https://i.stack.imgur.com/rnE8P.png) that could be less trusted to maintain the original order. Whether you can _actually make it do that_ is yet another layer, but again, if you need the guarantee, I'd just use a method that provides that guarantee. – Aaron Bertrand Aug 27 '23 at 14:39
0

I have tried the following approach using Join. Have used Table Valued Parameter as advised by Martin Smith.

Query

SELECT STRING_AGG(bs.value, ',')
FROM #BaseStringValues bs 
LEFT JOIN @ParamValues pv 
    ON pv.value = bs.value 
WHERE pv.value IS NULL
Dale K
  • 25,246
  • 15
  • 42
  • 71
Earth
  • 3,477
  • 6
  • 37
  • 78