-1

Since SQL Server does not allow comparison of two columns (e.g., where (store_nbr, item_id)), I am having difficulty comparing the queries, especially when the tables have large amounts of data.

I have 3 tables

STORE_ITEM

enter image description here

enter image description here

I have the below Oracle query

FORCE_TAGS_SQL = """UPDATE /*+ dynamic_sampling(0) index(store_item store_item_pk) */ store_item SET user_bit_5 = '1'
                     WHERE (store_nbr, item_id) in (SELECT /*+ dynamic_sampling(0) index(tag_demand tag_demand_ix1) */ d.store_nbr, d.item_id
                                                      FROM store s, tag_demand d
                                                     WHERE s.division_id = 'XYZ'
                                                       AND d.store_nbr = s.store_nbr
                                                       AND d.print_dt = TRUNC(TO_DATE('2022-04-13', 'YYYY-MM-DD')))
                       AND user_bit_5 = '0'
                 """

and below is the converted SQL Server query

FORCE_TAGS_SQL = """UPDATE STORE_ITEM SET user_bit_5 = '1'
                     WHERE (store_nbr) in (SELECT d.store_nbr
                                                      FROM STORE s, tag_demand d
                                                     WHERE s.division_id = 'XYZ'
                                                       AND d.store_nbr = s.store_nbr
                                                       AND d.print_dt = CONVERT(DATETIME, CONVERT(DATE, '2022-04-13')))
                       AND (item_id) in (SELECT  d.item_id
                                                      FROM STORE s, tag_demand d
                                                     WHERE s.division_id = 'XYZ'
                                                       AND d.store_nbr = s.store_nbr
                                                       AND d.print_dt = CONVERT(DATETIME, CONVERT(DATE, '2022-04-13')))
                      AND user_bit_5 = '0'
                 """

Is the SQL Server query same as Oracle query?

Update:

UPDATE  store_item SET user_bit_5 = '1'
FROM    store_item item INNER JOIN tag_demand d
ON      d.store_nbr = item.store_nbr AND d.item_id = item.item_id
WHERE   d.store_nbr in ((SELECT s.store_nbr FROM STORE s WHERE s.division_id = 'XYZ')) AND d.print_dt = CONVERT(DATETIME, CONVERT(DATE, '2022-04-13'))

Update#2:

 FORCE_TAGS_SQL = """UPDATE STORE_ITEM SET user_bit_5 = '1'
                                FROM STORE_ITEM
                                inner join (
                                    SELECT d.store_nbr, d.item_id
                                    FROM store s, tag_demand d
                                    WHERE d.store_nbr = s.store_nbr AND s.division_id = 'XYZ' AND d.print_dt = CONVERT(DATETIME, CONVERT(DATE, '2022-04-13'))
                                ) tagdemand
                                on tagdemand.store_nbr = STORE_ITEM.store_nbr
                                AND tagdemand.item_id = STORE_ITEM.item_id
                                where user_bit_5 = '0'
                    """

Update#3

FORCE_TAGS_SQL = """UPDATE STORE_ITEM SET user_bit_5 = '1'
                            FROM STORE_ITEM
                            INNER JOIN tag_demand
                                ON 
                                    STORE_ITEM.item_id = tag_demand.item_id
                                AND STORE_ITEM.store_nbr = tag_demand.store_nbr
                                AND tag_demand.print_dt = '2022-04-11'
                                AND tag_demand.STORE_NBR in (SELECT store_nbr FROM store WHERE division_id = 'XYZ')
                            where user_bit_5 = '0'
                """
One Developer
  • 99
  • 5
  • 43
  • 103

3 Answers3

1

You probably should just refactor to exists instead of trying to use to subselects for in lists because that will most probably give you a different result.

...
WHERE EXISTS (SELECT *
              FROM ...
              JOIN ...
              WHERE store.store_nbr = store_item.store_nbr
                AND tag_demand.item_id = store_item.item_id
                AND tag_demand.print_dt = ...)
1

By the way, use JOIN instead of a combination of CARTESIAN PRODUCT and RESTRICTION (false joins in WHERE clause "à la" Oracle !). Here is your query rewrited in pure SQL ISO standard 1992 :

UPDATE store_item 
SET    user_bit_5 = '1'
WHERE  (store_nbr, item_id) in (SELECT d.store_nbr, d.item_id
                                FROM   store s
                                       JOIN tag_demand d
                                          ON d.store_nbr = s.store_nbr
                                WHERE  s.division_id = 'XYZ'
                                       AND d.print_dt = '2022-04-13')
       AND user_bit_5 = '0';

One solution is to use a correlated IN :

UPDATE T
SET    user_bit_5 = '1'
FROM   store_item  AS T
WHERE  (store_nbr) in (SELECT d.store_nbr
                       FROM   store s
                              JOIN tag_demand d
                                 ON d.store_nbr = s.store_nbr
                       WHERE  s.division_id = 'XYZ'
                              AND d.print_dt = '2022-04-13'
                              AND d.item_id = T.item_id)
       AND user_bit_5 = '0';

One other is to use JOINs in the UPDATE :

UPDATE T
SET    user_bit_5 = '1'
FROM   store_item AS T
       JOIN tag_demand d
          ON T.store_nbr = d.store_nbr
             AND T.item_id = d. item_id
       JOIN store s
          ON d.store_nbr = s.store_nbr
WHERE  s.division_id = 'XYZ'
       AND d.print_dt = '2022-04-13'
       AND user_bit_5 = '0';

My prefered solution is to use an EXISTS with a double correlation :

UPDATE store_item 
SET    user_bit_5 = '1'
WHERE  EXISTS(SELECT 0/0
              FROM   store s
                     JOIN tag_demand d
                        ON d.store_nbr = s.store_nbr
              WHERE  s.division_id = 'XYZ'
                     AND d.print_dt = '2022-04-13'
                     AND store_itme.store_nbr = d.store_nbr
                     AND store_itme.item_id =  d.item_id)
       AND user_bit_5 = '0'

Some solutions can use an INTERSECT operator but need to know the table desription, especially the PK

All other solution with the CONCAT function or something close to... will conduct to have false positive row values in the resulset.

SQLpro
  • 3,994
  • 1
  • 6
  • 14
  • Great! I have 80 million records in the store_item table. Guess, JOINs in the UPDATE would perform better. When does the where clause applied? After joining all the tables? – One Developer Apr 13 '22 at 16:08
  • Just have a look on the query execution plan. – SQLpro Apr 13 '22 at 16:47
  • 1
    I agree with your prefered way of using exists but I would recommend not to use a division by zero in it. Write 1 or * in it. It is optimzed away by sql server so it should not matter but a division by zero I would consider bad practice. – Johannes Krackowizer Apr 13 '22 at 17:17
  • I like to use a divide by zero in the SELECT clause of the EXISTS to have exactly the reaction of what you say... But it is the way to prove that the SELECT is really ignored because it has no sense at all ! Alas some bad RDBMS reads this part of the query an generates a stupid error ! – SQLpro Apr 13 '22 at 17:31
0

If you want identical logic you can use CONCAT(). I can't tell you how much this will slow the query down.

WHERE (CONCAT(store_nbr, item_id)) in
(SELECT CONCAT(d.store_nbr, d.item_id)
FROM...)

The difference is that your SQL server version is much freer. In a situation where the oracle version will accept (A,1) and (B,2) your SQL server version will also accept (A,2) and (B,1) even if these combinations don't exist in the sub-query.
One simple test would be to check the number of rows returned from the same data. If the SQL server version returns more rows you know that there is a problem.

  • I can't recommend that solution, as it will perform badly... there are much better ways to solve it. – Dale K Apr 13 '22 at 07:02
  • @DaleK Thanks for the comment. I didn't test anything. I noted in the answer that there will be a performance hit. the idea is to be able to run the same logic on the same set of data to see the difference in the output. As noted in the comments it all depends on the application whether the difference between the 2 queries is important or not. –  Apr 13 '22 at 07:31
  • This will **not** do the same thing as the Oracle solution. It will treat the values `store_nbr=12, item_id=3` the same as `store_nbr=1, item_id=23` - which is completely wrong. –  Apr 13 '22 at 13:22
  • @a_horse_with_no_name: You are right, we would have to use a seperator which will not be found in the values to be sure not to mix them up, for example `CONCAT(col1,'$',col2)` –  Apr 13 '22 at 13:27