-1

Is there a good or standard SQL method of asserting that a join does not duplicate any rows (produces 0 or 1 copies of the source table row)? Assert as in causes the query to fail or otherwise indicate that there are duplicate rows.

A common problem in a lot of queries is when a table is expected to be 1:1 with another table, but there might exist 2 rows that match the join criteria. This can cause errors that are hard to track down, especially for people not necessarily entirely familiar with the tables.

It seems like there should be something simple and elegant - this would be very easy for the SQL engine to detect (have I already joined this source row to a row in the other table? ok, error out) but I can't seem to find anything on this. I'm aware that there are long / intrusive solutions to this problem, but for many ad hoc queries those just aren't very fun to work out.

EDIT / CLARIFICATION: I'm looking for a one-step query-level fix. Not a verification step on the results of that query.

Walt W
  • 3,261
  • 3
  • 30
  • 37
  • 4
    If there should be only 1 record then that should be enforced by the database in a 1-1 relationship with a unique constraint or by using the FK as the PK. In ALL other cases the assumption of the query should be that there will not be a 1-1 relationship. – HLGEM Sep 27 '11 at 18:41
  • Sometimes you aren't the one who designed the database or have any control over such things. – Walt W Sep 28 '11 at 13:53

3 Answers3

2

If you are only testing for linked rows rather than requiring output, then you'd use EXISTS.

More correctly, you need a "semi-join" but this isn't supported by most RDBMS unless as EXISTS

SELECT a.*
FROM TableA a
WHERE EXISTS (SELECT * FROM TableB b WHERE a.id = b.id)

Also see:

Community
  • 1
  • 1
gbn
  • 422,506
  • 82
  • 585
  • 676
  • This query actually doesn't tell you if a duplicate will be produced at all. – Walt W Sep 27 '11 at 18:39
  • @Walt W: it was hardly clear what you wanted. I mentioned this because it is a common problem to get extra rows when testing existence. Anyway, you'll have problems getting the count and actual data in one go because the count would be an aggregate. There is no standard unless you join again onto tableB and do the count in a derived query. Or a COUNT..OVER to change scope. But I'll leave you to work that out... – gbn Sep 27 '11 at 18:43
  • It was clear enough there were two correct answers at about the same time as this.. – Walt W Sep 27 '11 at 23:34
1
SELECT JoinField
FROM MyJoinTable
GROUP BY JoinField
HAVING COUNT(*) > 1
LIMIT 1

Is that simple enough? Don't have Postgres but I think it's valid syntax.

SingleNegationElimination
  • 151,563
  • 33
  • 264
  • 304
JNK
  • 63,321
  • 15
  • 122
  • 138
  • This is definitely the closest to what I'm looking for - I like the idea of just checking the table you're joining on separately. Hadn't thought of that. – Walt W Sep 27 '11 at 18:28
  • @WaltW - I was hoping it was close. All it checks is if ANY join keys occur more than once. – JNK Sep 27 '11 at 18:28
  • Well, I like this one - it's a good show that you really need to audit the table you're joining onto more than anything else. The actual answer is "There is no simple query-implicit way" – Walt W Sep 27 '11 at 23:34
1

Something along the lines of

SELECT a.id, COUNT(b.id)
FROM TableA a
JOIN TableB b ON a.id = b.id
GROUP BY a.id
HAVING COUNT(b.id) > 1

Should return rows in TableA that have more than one associated row in TableB.

Dan J
  • 16,319
  • 7
  • 50
  • 82