1

I have the following 2 tables (there are more fields in the real tables):

create table publisher(id serial not null primary key,
name text not null);

create table product(id serial not null primary key,
name text not null,
publisherRef int not null references publisher(id));

Sample data:

insert into publisher (id,name) values (1,'pub1'),(2,'pub2'),(3,'pub3');
insert into product (name,publisherRef) values('p1',1),('p2',2),('p3',2),('p4',2),('p5',3),('p6',3);

And I would like the query to return:

name, numProducts
pub2, 3
pub3, 2
pub1, 1

A product is published by a publisher. Now I need a list of name, id of all publishers which have at least one product, ordered by the total number of products each publisher has.

I can get the id of the publishers ordered by number of products with:

select publisherRef AS id, count(*)
from product
order by count(*) desc;

But I also need the name of each publisher in the result. I thought I could use a subquery like:

select *
from publisher
where id in (
   select publisherRef
   from product
   order by count(*) desc)

But the order of rows in the subquery is lost in the outer SELECT.

Is there any way to do this with a single sql query?

MTilsted
  • 5,425
  • 9
  • 44
  • 76

1 Answers1

2
SELECT pub.name, pro.num_products
FROM  (
   SELECT publisherref AS id, count(*) AS num_products
   FROM   product
   GROUP  BY 1
   ) pro
JOIN   publisher pub USING (id)
ORDER  BY 2 DESC;

db<>fiddle here

Or (since the title mentions "all data") return all columns of the publisher with pub.*. After products have been aggregated in the subquery, you are free to list anything in the outer SELECT.

This only lists publisher which

have at least one product

And the result is ordered by

the total number of products each publisher has

It's typically faster to aggregate the "n"-table before joining to the "1"-table. Then use an [INNER] JOIN (not a LEFT JOIN) to exclude publishers without products.

Note that the order of rows in an IN expression (or items in the given list - there are two syntax variants) is insignificant.

The column alias in publisherref AS id is totally optional to use the simpler USING clause for identical column names in the following join condition.

Aside: avoid CaMeL-case names in Postgres. Use unquoted, legal, lowercase names exclusively to make your life easier.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228