
I am having trouble writing a query that returns all records where two columns have the same value but a third column differs. Specifically, I am looking for the records where Item_Type and Location_ID are the same, but Sub_Location_ID is different.

The table looks like this:

+---------+-----------+-------------+-----------------+
| Item_ID | Item_Type | Location_ID | Sub_Location_ID |
+---------+-----------+-------------+-----------------+
|       1 |     00001 |          20 |              78 |
|       2 |     00001 |         110 |             124 |
|       3 |     00001 |         110 |             124 |
|       4 |     00002 |           3 |              18 |
|       5 |     00002 |           3 |              25 |
+---------+-----------+-------------+-----------------+

The result I am trying to get would look like this:

+---------+-----------+-------------+-----------------+
| Item_ID | Item_Type | Location_ID | Sub_Location_ID |
+---------+-----------+-------------+-----------------+
|       4 |     00002 |           3 |              18 |
|       5 |     00002 |           3 |              25 |
+---------+-----------+-------------+-----------------+

I have been trying to use the following query:

SELECT *
FROM Table1
WHERE Item_Type IN (
     SELECT Item_Type
     FROM Table1
     GROUP BY Item_Type
     HAVING COUNT (DISTINCT Sub_Location_ID) > 1
)

But it returns all records with the same Item_Type and a different Sub_Location_ID, not all records with the same Item_Type AND Location_ID but a different Sub_Location_ID.

bicycle_guy

4 Answers


This should do the trick...

-- some test data...
IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL 
BEGIN DROP TABLE #TestData; END;

CREATE TABLE #TestData (
    Item_ID INT NOT NULL PRIMARY KEY,
    Item_Type CHAR(5) NOT NULL,
    Location_ID INT NOT NULL,
    Sub_Location_ID INT NOT NULL 
    );
INSERT #TestData (Item_ID, Item_Type, Location_ID, Sub_Location_ID) VALUES
    (1, '00001',  20,  78),
    (2, '00001', 110, 124),
    (3, '00001', 110, 124),
    (4, '00002',   3,  18),
    (5, '00002',   3,  25);

-- adding a covering index will eliminate the sort operation...
CREATE NONCLUSTERED INDEX ix_indexname ON #TestData (Item_Type, Location_ID, Sub_Location_ID, Item_ID);

-- the actual solution...
WITH
    cte_count_group AS (
        SELECT 
            td.Item_ID,
            td.Item_Type,
            td.Location_ID,
            td.Sub_Location_ID,
            cnt_grp_2 = COUNT(1) OVER (PARTITION BY td.Item_Type, td.Location_ID),
            cnt_grp_3 = COUNT(1) OVER (PARTITION BY td.Item_Type, td.Location_ID, td.Sub_Location_ID)
        FROM
            #TestData td
        )
SELECT 
    cg.Item_ID,
    cg.Item_Type,
    cg.Location_ID,
    cg.Sub_Location_ID
FROM
    cte_count_group cg
WHERE 
    cg.cnt_grp_2 > 1
    AND cg.cnt_grp_3 < cg.cnt_grp_2;
Jason A. Long

I think you can use exists:

select t1.*
from table1 t1
where exists (select 1
              from table1 tt1
              where tt1.Item_Type = t1.Item_Type and
                    tt1.Location_ID = t1.Location_ID and
                    tt1.Sub_Location_ID <> t1.Sub_Location_ID
             );
Gordon Linoff

SQL Server has no multi-column (vector) IN, but you can emulate it with a little trick, assuming '#' is an illegal character in Item_Type:

SELECT *
FROM Table1
WHERE Item_Type+'#'+Cast(Location_ID as varchar(20)) IN (
     SELECT Item_Type+'#'+Cast(Location_ID as varchar(20))
     FROM Table1
     GROUP BY Item_Type, Location_ID
     HAVING COUNT (DISTINCT Sub_Location_ID) > 1
);

The downside is that the expression in the WHERE clause is non-sargable.
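If you'd rather avoid the concatenation (and the assumption about '#'), a join against the grouped keys expresses the same multi-column membership test. This is only a sketch against the same Table1, not part of the original answer:

```sql
SELECT t.*
FROM Table1 t
INNER JOIN (
    -- (Item_Type, Location_ID) pairs that have more than one sub-location
    SELECT Item_Type, Location_ID
    FROM Table1
    GROUP BY Item_Type, Location_ID
    HAVING COUNT(DISTINCT Sub_Location_ID) > 1
) dup
    ON dup.Item_Type = t.Item_Type
   AND dup.Location_ID = t.Location_ID;
```

Because the join compares the raw columns rather than a computed expression, an index on (Item_Type, Location_ID) remains usable.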

Serg

You can use exists:

select t.*
from table t
where exists (select 1 
              from table t1 
              where t.Item_Type = t1.Item_Type and
                    t.Location_ID = t1.Location_ID and
                    t.Sub_Location_ID <> t1.Sub_Location_ID
             );
Yogesh Sharma
  • This works perfectly. I selected this one as the answer because it is the simplest way to get the job done. Thanks! – bicycle_guy May 17 '19 at 18:44
  • 2
    fyi... If you have an additional row [6,'00002',3,18], which would match an existing record, it will be included in the result. If this will not occur in your data set, then it's a non-issue. – level3looper May 17 '19 at 18:51
  • 1
    @bicycle_guy - This pattern may look simpler but the self join will make it twice as expensive... – Jason A. Long May 17 '19 at 18:53