1

I have the following tables

customers(custid, name)
orders(orderid, custid)
itemsordered(itemid, orderid)
items(itemid, description)

My objective is to each pair of customers who both ordered an item with the same description, retrieve the name of both customers. Eliminate duplicates, don't pair customers with themselves, and include each pair only once. For each pair, return the names in the pair in alphabetical order.

I understand I need to reference two custid's and compare each of their descriptions to each other, something like:

select nameA
from customers nameA
join orders using (custid)
join itemsordered using (orderid)
join item using (itemid)
where (select nameB
from customers
from customers nameA
join orders using (custid)
join itemsordered using (orderid)
join item using (itemid)
where descriptionA = descriptionB

etc.

but am unsure how to proceed. Here is the correct response:

Christina|Janine
Christina|Max
Christina|Teddy
Christina|David
Christina|Rachel
Rachel|Teddy
David|Janine
David|Rachel
David|Teddy
Janine|Rachel
Janine|Teddy
Janine|Max
Max|Teddy

I need some new examples. Most of my guesses consist of subqueries like the following:

select attribute from table
join anotherTable using (somekey)
where table in (select anotherAttribute 
from anotherTable
where....etc.

Any suggestions or direction would be appreciated.

UPDATE: names are unique.

piofusco
  • 229
  • 2
  • 14

2 Answers2

1
WITH t AS (
    SELECT DISTINCT c.name, i.description  -- collapse dupes per name
    FROM   customers    c
    JOIN   orders         USING (custid)
    JOIN   itemsordered   USING (orderid)
    JOIN   items        i USING (itemid)
    )
SELECT DISTINCT t.name, t1.name
FROM   t
JOIN   t t1 USING (description)
WHERE  t.name < t1.name    -- rules out self-joins and duplicate pairs
ORDER  BY 1, 2;

This is similar to what @Clodoaldo posted, except for a couple of important differences.

  • Assuming unique names. The result wouldn't make sense otherwise. Simpler outer SELECT.
    If names are not unique, you need to output custid (additionally).

  • Match on description as per question.

  • Collapse duplicate descriptions per customer immediately. With only few dupes this won't help. With more than a few it improves performance.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • An elegant solution. Very easy to follow! Do the numbers in ORDER BY apply to the columns created in the output (t.name, t1.name)? – piofusco Jul 16 '14 at 02:53
  • @piofusco: Yes, numbers in `ORDER BY` (or `GROUP BY`) are positional references to items in the `SELECT` list. Related answer: http://stackoverflow.com/questions/19848930/group-by-case-statement/19849537#19849537 – Erwin Brandstetter Jul 16 '14 at 03:17
0
with s as (
    select *
    from
        customers
        inner join
        orders using (custid)
        inner join
        itemsordered using (orderid)
        inner join
        items using (itemid)
)
select distinct
    least(s1.name, s2.name),
    greatest(s1.name, s2.name)
from
    s s1
    inner join
    s s2 on
        s1.description = s2.description
        and
        s1.custid < s2.custid
order by 1, 2
Clodoaldo Neto
  • 118,695
  • 26
  • 233
  • 260
  • The item descriptions need to be the same, not the itemid's. Also, changed item to items. All updated. This works great though. Thank you. Very clear solution - interesting format I am going to take with me. – piofusco Jul 16 '14 at 02:48
  • @piofusco You mean there is more than one itemid with the same description? Are they the same item? If they are then one of them should be deleted. – Clodoaldo Neto Jul 16 '14 at 10:12
  • There is more than one itemid with the same description. They are however different items. Each item also has a color attribute, but I didn't include this to avoid confusion. – piofusco Jul 16 '14 at 15:50