1

Lets say I have four tables:

CREATE TABLE dealers (
    id SERIAL PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
)

CREATE TABLE brands (
    id SERIAL PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
)

CREATE TABLE cars (
    id SERIAL PRIMARY KEY,
    brand INTEGER REFERENCES (brands.id) NOT NULL,
    name TEXT UNIQUE NOT NULL
)

CREATE TABLE sells (
    id SERIAL,
    dealer INTEGER REFERENCES (dealers.id) NOT NULL,
    car INTEGER REFERENCES (cars.id) NOT NULL
)

If n is the number of dealers and m in the number of dealers, how to I create a n x m matrix in sql that displays the number of cars a dealer sells of a brand.

The output should look something like this:

Dealer    |Honda    Toyota    Tesla    
CarMax     103      204       1
CheapCars  160      320       0
GoodCars   40       20        2
OCHonda    201      0         0  

What is the simplest, most efficient query I can use to accomplish this task?

EDIT: Changed m to be the number of dealers instead of cars

aztrorisk
  • 161
  • 4
  • 10

3 Answers3

2

I want to suggest that in SQL, you don't really want a matrix. You want the pairs of dealers and brands along with the count. You can get this by doing:

select d.name as dealername, b.name as brandname, count(s.id) as numsales
from brands b cross join
     dealers d left outer join
     cars c
     on c.brand = b.id left outer join
     sells s
     on s.dealer = d.id and s.car = c.id
group by d.name, b.name;

If you really want this as one row per dealer, then you need to pivot the data. If you know the brands, you can do:

select d.name as dealername,
       sum(case when b.name = 'Honda' and s.id is not null then 1 else 0 end) as Honda,
       sum(case when b.name = 'Toyota' and s.id is not null then 1 else 0 end) as Toyota,
       sum(case when b.name = 'Tesla' and s.id is not null then 1 else 0 end) as Tesla
from brands b cross join
     dealers d left outer join
     cars c
     on c.brand = b.id left outer join
     sells s
     on s.dealer = d.id and s.car = c.id
group by d.name, b.name;

This works if you know the brands explicitly that you want. If you don't know the brands, you will have to resort to a dynamic pivot. A regular SQL query returns a fixed set of columns with fixed names.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
2

Basic query

SELECT d.name AS dealer, b.name AS brand, sells
FROM  (
   SELECT s.dealer, c.brand, count(*)::int AS sells
   FROM   sells s
   JOIN   cars  c ON c.id = s.car 
   GROUP  BY 1,2
   ) x
JOIN   brands  b ON b.id = x.brand
JOIN   dealers d ON d.id = x.dealer
ORDER  BY 1,2;

Group first, join details later, that's typically faster.
If you want the number of dealers instead of cars, like you edited later, just use:
count (DISTINCT s.dealer) instead of count(*).

Crosstab

Feed this to crosstab() from the additional Postgres module tablefunc.
Generally, you need to list all brands explicitly because SQL demands to know columns in advance.

SELECT * FROM crosstab(
   'SELECT d.name AS dealer, b.name AS brand, sells
    FROM  (
       SELECT s.dealer, c.brand, count(*)::int AS sells
       FROM   sells s
       JOIN   cars  c ON c.id = s.car
       GROUP  BY 1,2
       ) x
    JOIN   brands  b ON b.id = x.brand
    JOIN   dealers d ON d.id = x.dealer
    ORDER  BY 1,2'

   ,$$VALUES ('Honda'::text), ('Toyota'), ('Tesla')$$  -- add more ...
   ) AS t(dealer text, "Honda" int, "Toyota" int, "Tesla" int); -- add more ...

Ample details and explanation in this related answer:
PostgreSQL Crosstab Query

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
1

Assuming you know the number of brands, then you can use count with case:

select d.name as dealer,
   count(case when b.name = 'Honda' then 1 end) as HondaCount,
   count(case when b.name = 'Toyota' then 1 end) as ToyotaCount,
   count(case when b.name = 'Telsa' then 1 end) as TelsaCount
from sells s
   join dealers d on s.dealer = d.id
   join cars c on s.car = c.id
   join brands b on c.brand = b.id
group by d.name

If there are dealers without any sells and you need to include those in the result, then use an outer join starting from the dealers table:

...
from dealers d
   left join sells s on s.dealer = d.id
   left join cars c on s.car = c.id
   left join brands b on c.brand = b.id
group by d.name
sgeddes
  • 62,311
  • 6
  • 61
  • 83