10

If UNION ALL is an addition in T-SQL. What is the equivalent of subtraction?

For example, if I have a table PEOPLE and a table EMPLOYEES. And I know if I remove EMPLOYEES records from PEOPLE I will be left with my companies CONTRACTORS.

Is there a way of doing this that is similar to UNION ALL? One where I don't have to specify any field names? The reason I ask is this is just one hypothetical example. I need to do this several times to many different tables. Assume that the schema of EMPLOYEES and PEOPLE are the same.

Pure.Krome
  • 84,693
  • 113
  • 396
  • 647
BuddyJoe
  • 69,735
  • 114
  • 291
  • 466

7 Answers7

12

You can use the EXCEPT operator to subtract one set from another. Here's a sample of code using EMPLOYEES and PEOPLE temporary tables. You'll need to use the field names with the EXCEPT operator as far as I know.

CREATE TABLE #PEOPLE
(ID INTEGER,
 Name NVARCHAR(50))

CREATE TABLE #EMPLOYEE
(ID INTEGER,
 Name NVARCHAR(50))
GO

INSERT #PEOPLE VALUES (1, 'Bob')
INSERT #PEOPLE VALUES (2, 'Steve')
INSERT #PEOPLE VALUES (3, 'Jim')
INSERT #EMPLOYEE VALUES (1, 'Bob')
GO

SELECT ID, Name
FROM #PEOPLE
EXCEPT 
SELECT ID, Name
FROM #EMPLOYEE
GO

The final query will return the two rows in the PEOPLE table which do not exist in the EMPLOYEE table.

Eric Ness
  • 10,119
  • 15
  • 48
  • 51
  • 1
    Remember that EXCEPT will only give you DISTINCT values. In your case, if you add another Steve with id =2 then you will only get one of them anyway. Of course, ID would probably be a primary key. – Hakan Winther Oct 13 '09 at 10:23
11
SELECT
     P.*
FROM
     People P
LEFT OUTER JOIN Employees E ON
     E.ID = P.ID     -- Or whatever your PK-FK relationship is
WHERE
     E.ID IS NULL

For SQL Server this will probably be the most performant way that you can do it.

Tom H
  • 46,766
  • 14
  • 87
  • 128
  • Two people thought that a solution that returns the right answer was worth voting down without the courtesy of explaining why. When are they going to implement the feature to require a comment for vote-downs? :) – Tom H Dec 17 '08 at 03:21
  • A left outer join returns all records from the left table irrespective of whether a match was found in the second table. It cant be used when you want to get just those records from the first table that are not present in the second table. :) – user20358 Apr 24 '09 at 13:40
  • 3
    Which is why the WHERE clause includes "E.ID IS NULL". Before declaring that something is wrong you should probably understand it and ideally test it. – Tom H Apr 24 '09 at 14:03
  • I agree with you, your solution is faster in SQL server. Please read this post http://www.sqlservercentral.com/Forums/Topic541726-338-1.aspx – Hakan Winther Oct 13 '09 at 10:16
  • This is a good solution, but, in doing some recent performance analysis, I ran into situations where doing this is NOT necessarily the most performant choice. As far as I can tell, the reason is that if the join would result in a LARGE number of joined records, SQL has to perform the join in order to perform the IS NULL test. In those cases, it can be much faster to use a subquery in the where clause like WHERE NOT EXISTS(Select 1 from Employees e where e.id = p.id). It's very dependent on the relative sizes of the two tables. – DarinH Jun 22 '11 at 15:05
  • Absolutely, this is just one tool in the toolbox. Performance testing and analysis should always be done. One thing I've noticed is that generally when there are many other joins in the query, this usually works better than a subquery. – Tom H Jun 22 '11 at 15:13
11

Instead of using UNION, use EXCEPT, ( or INTERSECT to get only records in both ) as described in

msdn EXCEPT Link for Sql2k8

msdn EXCEPT Link for Sql2k5

Charles Bretana
  • 143,358
  • 22
  • 150
  • 216
3
SELECT * FROM Table1
WHERE Table1.Key NOT IN (SELECT Table2.Key FROM Table2 WHERE Table2.Key IS NOT NULL)

Added IS NOT NULL to make people happy.

I would agree with Tom. His version is most likely more efficient. The only possible reason to use mine, might be that it's prettier.

Greg Dean
  • 29,221
  • 14
  • 67
  • 78
1

Unfortunately there is a problem in your design. instead of having two table PEOPLE and CONTRACTOR. You should have a table PEOPLE and another Table TYPE (if some people can have several role another table maybe needed). In your PEOPLE table you make a referece to the TYPE table.

then you requests become

SELECT * from PEOPLE, TYPE
WHERE PEOPLE.type_id = TYPE.id 
AND TYPE.name = 'CONTRACTOR'

SELECT * from PEOPLE, TYPE
WHERE PEOPLE.type_id = TYPE.id 
AND TYPE.name = 'EMPLOYEE'

(untested)

call me Steve
  • 1,709
  • 3
  • 18
  • 31
1

When I compare tables looking for data that isn't in one that is in the other I typically use SQL Division.

select *(or selected matching field) 
from tableA as A
where not exist
(select *(or selected matching field) 
from tableB as B 
where A.key = B.key)

This query will return the results that are in tableA that are not in through the process of division.

select *(or selected matching field) 
from tableA as A
where exist
(select *(or selected matching field) 
from tableB as B 
where A.key = B.key)

This query will return all the rows of data that match in both tables therefore if there is a row data that is in tableA that isn't in tableB that row of data will not be retrieved.

0

I found it is a lot easier to use a tool like SQLMerger to do this for you. The results are displayed in a nicer way and you can go on with whatever you need to do with the data thereafter easily.

www.auisoft.com/SQLMerger <= the tool that makes it easy to compare data

example on comparing two tables: http://auisoft.com/SQLMerger/How-to/visualize-differences-in-2-databases/