How to get not duplicate rows in join?

Question

I got some problems with duplicate rows which I don't wanna get.

Hi! I got two tables - tab1, tab2 and I want to join tab2 to tab1 like:

SELECT t1.column_A1, t2.column_B2 
FROM tab1 t1
JOIN
tab2 t2
ON t1.column_A1=t2.column_A2

tab1

| Column A1 | Column B1 | Column C1 |
|  -------- |  -------- |  -------- |
|  Z1       |  Cell 2   |  Cell 3   |
|  Z2       |  Cell 5   |  Cell 6   |

tab2

| Column A2 | Column B2 | Column C2 |
|  -------- |  -------- |  -------- |
|  Z1       |  PW       |  Cell 3   |
|  Z1       |  RW       |  Cell 6   |

For some rows in tab1 there are more than 1 rows in tab2.

The result will be:

| Column A2 | Column B2 | Column C2 |
|  -------- |  -------- |  -------- |
|  Z1       |  PW       |  RE       |
|  Z1       |  RW       |  KS       |

I want to get: if PW - show only one row with PW; if not PW - show only one row with RW

The result should be:

| Column A2 | Column B2 | Column C2 |
|  -------- |  -------- |  -------- |
|  Z1       |  PW       |  RE       |

If you have more rows in a joined table but only want one result you need to aggregate or exclude rows from the second table. In your case it sounds like you want to exclude rows so have a look at joining to a _derived table_. This is where you write another query to generate the data you want and then join those rows to your first table. Have a look at https://stackoverflow.com/questions/26682043/inner-joins-with-derived-tables — Tony, Jun 01 '23 at 12:27

score 1 · Accepted Answer · answered Jun 01 '23 at 12:29

One option is to "sort" rows per each column_a1 by value stored in column_b2 and return rows that rank as the highest.

Sample data:

SQL> WITH
  2     tab1 (column_a1, column_b1, column_c1)
  3     AS
  4        (SELECT 'Z1', 'cell 2', 'cell 3' FROM DUAL
  5         UNION ALL
  6         SELECT 'Z2', 'cell 5', 'cell 6' FROM DUAL),
  7     tab2 (column_a2, column_b2, column_c2)
  8     AS
  9        (SELECT 'Z1', 'PW', 'cell 3' FROM DUAL
 10         UNION ALL
 11         SELECT 'Z1', 'RW', 'cell 6' FROM DUAL
 12         UNION ALL
 13         SELECT 'Z2', 'RW', 'cell 8' FROM DUAL),

Query begins here:

 14     temp
 15     AS
 16        (SELECT t1.column_A1,
 17                t2.column_B2,
 18                ROW_NUMBER () OVER (PARTITION BY t1.column_a1 ORDER BY t2.column_b2) rn
 19           FROM tab1 t1 JOIN tab2 t2 ON t1.column_A1 = t2.column_A2)
 20  SELECT column_a1, column_b2
 21    FROM temp
 22   WHERE rn = 1;

COLUMN_A1    COLUMN_B2
------------ ------------
Z1           PW
Z2           RW

SQL>

score 0 · Answer 2 · answered Jun 01 '23 at 14:03

This is a typical task on tables without primary keys, i.e. with duplication where there is some rule how to fetch the proper unique row.

In your case the rule is

if PW - show only one row with PW; if not PW - show only one row with RW

You implement it using row_number function, partition byon your (duplicated) key column and order by implementing your rule (using decode) so that the order provides the required row as first.

Example

select 
   COLUMN_A2, COLUMN_B2, COLUMN_C2,
   row_number() over (partition by COLUMN_A2
                      order by decode (COLUMN_B2,'PW',1,'RW',2,3),COLUMN_B2) as rn 
from tab2;        


CO CO COLUMN         RN
-- -- ------ ----------
Z1 PW cell 3          1
Z1 RW cell 6          2

The join is the same as you used only adding the rn = 1 predicate to the on clause.

Note that I added COLUMN_B2as a second order by column; this is for the case when neither of your two strings are present so the lowest value is used.

You should always use such order by column list that they together with the partition by column(s) makes a unique key. Than the query provides a deterministic result.

How to get not duplicate rows in join?

2 Answers2