1

I'm making a select in which I give a year (hardcoded as 1981 below) and I expect to get one row per qualifying band. The main problem is to get the oldest living member for each band:

SELECT b.id_band,
    COUNT(DISTINCT a.id_album),
    COUNT(DISTINCT s.id_song),
    COUNT(DISTINCT m.id_musician),
    (SELECT name FROM MUSICIAN WHERE year_death IS NULL ORDER BY(birth)LIMIT 1)
FROM BAND b
    LEFT JOIN ALBUM a ON(b.id_band  = a.id_band)
    LEFT JOIN SONG  s ON(a.id_album = s.id_album)
    JOIN MEMBER m ON(b.id_band= m.id_band)
    JOIN MUSICIAN mu ON(m.id_musician = mu.id_musician)

  /*LEFT JOIN(SELECT name FROM MUSICIAN WHERE year_death IS NULL
              ORDER BY(birth) LIMIT 1) AS alive FROM mu*/ -- ??

WHERE b.year_formed = 1981
GROUP BY b.id_band;

I would like to obtain the oldest living member from mu for each band. But I just get the oldest musician overall from the relation MUSICIAN.

Here is screenshot showing output for my current query:

screenshot of result

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Bardinator
  • 11
  • 1
  • you can use `window function` if you want to get oldest member from each band. – zealous Apr 25 '20 at 19:37
  • Does your design allow the same song (`id_song`) to appear multiple times in one album , or in multiple albums of the same band? If not, this can be much faster ... Also, can there be albums without songs / bands without albums / without members? Please (always) declare your version of Postgres. – Erwin Brandstetter Apr 25 '20 at 23:15

5 Answers5

0

Well, I think you can follow the structure that you have, but you need JOINs in in the subquery.

SELECT b.id_band,
       COUNT(DISTINCT a.id_album),
       COUNT(DISTINCT s.id_song),
       COUNT(DISTINCT mem.id_musician),
       (SELECT m.name
        FROM MUSICIAN m JOIN
             MEMBER mem
             ON mem.id_musician = m.id_musician
        WHERE m.year_death IS NULL AND mem.id_band = b.id_band
        ORDER BY m.birth
        LIMIT 1
       ) as oldest_member
FROM BAND b LEFT JOIN
     ALBUM a
     ON b.id_band  = a.id_band LEFT JOIN
     SONG s
     ON a.id_album = s.id_album LEFT JOIN
     MEMBER mem
     ON mem.id_band = b.id_band
WHERE b.year_formed = 1981       
GROUP BY b.id_band
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

You can reference a table that is out of this nested select, like so

SELECT b.id_band,
COUNT(DISTINCT a.id_album),
COUNT(DISTINCT s.id_song),
COUNT(DISTINCT m.id_musician),
(SELECT name FROM MUSICIAN WHERE year_death IS NULL ORDER BY(birth) AND 
MUSICIAN.id_BAND = b.id_band LIMIT 1)
FROM BAND b
LEFT JOIN ALBUM a ON(b.id_band  = a.id_band)
LEFT JOIN SONG  s ON(a.id_album = s.id_album)
JOIN MEMBER m ON(b.id_band= m.id_band)
JOIN MUSICIAN mu ON(m.id_musician = mu.id_musician)

/*LEFT JOIN(SELECT name FROM MUSICIAN WHERE year_death IS NULL ORDER 
BY(birth)LIMIT 1) AS alive FROM mu*/
WHERE b.year_formed= 1981       
GROUP BY b.id_band
JohnnyE
  • 1
  • 1
  • Great! Could you edit your answer to account for the fact that MUSICIAN does not contain id_band, but table MEMBER relates each id_musician to an id_band? – Bardinator Apr 25 '20 at 19:54
0

For queries where you want to find the "max person by age" you can use ROW_NUMBER() grouped by the band

SELECT b.id_band,
    COUNT(DISTINCT a.id_album),
    COUNT(DISTINCT s.id_song),
    COUNT(DISTINCT m.id_musician),
    oldest_living_members.*
FROM 
    band b
    LEFT JOIN album a ON(b.id_band  = a.id_band)
    LEFT JOIN song s ON(a.id_album = s.id_album)
    LEFT JOIN 
    (
      SELECT
         m.id_band
         mu.*,
         ROW_NUMBER() OVER(PARTITION BY m.id_band ORDER BY mu.birthdate ASC) rown
       FROM
         MEMBER m
         JOIN MUSICIAN mu ON(m.id_musician = mu.id_musician)
       WHERE year_death IS NULL
     ) oldest_living_members 
     ON 
         b.id_band = oldest_living_members.id_band AND
         oldest_living_members.rown = 1
WHERE b.year_formed= 1981       
GROUP BY b.id_band

If you run just the subquery you'll see how it's working = artists are joined to member to get the band id, and this forms a partition. Rownumber will start numbering from 1 according to the order of birthdates (I didn't know what your column name for birthday was; you'll have to edit it) so the oldest person (earliest birthday) gets a 1.. Every time the band id changes the numbering will restart from 1 with the oldest person in that band. Then when we join it we just pick the 1s

Caius Jard
  • 72,509
  • 5
  • 49
  • 80
0

Following query will give you oldest member of each band group. You can put filter by year_formed = 1981 if you need.

SELECT
    b.id_band,
    total_albums,
    total_songs,
    total_musicians
FROM
(
    SELECT b.id_band,
        COUNT(DISTINCT a.id_album) as total_albums,
        COUNT(DISTINCT s.id_song) as total_songs,
        COUNT(DISTINCT m.id_musician) as total_musicians,
        dense_rank() over (partition by b.id_band order by mu.year_death desc) as rnk
    FROM BAND b
        LEFT JOIN ALBUM a ON(b.id_band  = a.id_band)
        LEFT JOIN SONG  s ON(a.id_album = s.id_album)
        JOIN MEMBER m ON(b.id_band= m.id_band)
        JOIN MUSICIAN mu ON(m.id_musician = mu.id_musician)
    WHERE mu.year_death is NULL
)

where rnk = 1
zealous
  • 7,336
  • 4
  • 16
  • 36
0

I think this should be considerably faster (while also solving your problem):

SELECT b.id_band, a.*, m.*
FROM   band b
LEFT   JOIN LATERAL (
   SELECT count(*) AS ct_albums, sum(ct_songs) AS ct_songs
   FROM  (
      SELECT id_album, count(*) AS ct_songs
      FROM   album a
      LEFT   JOIN song s USING (id_album)
      WHERE  a.id_band = b.id_band
      GROUP  BY 1
      ) ab
   ) a ON true
LEFT   JOIN LATERAL (
   SELECT count(*) OVER () AS ct_musicians
        , name AS senior_member  -- any other columns you need?
   FROM   member   m
   JOIN   musician mu USING (id_musician)
   WHERE  m.id_band  = b.id_band
   ORDER  BY year_death IS NOT NULL  -- sorts the living first
           , birth
           , name  -- as tiebreaker (my optional addition)
   LIMIT  1
   ) m ON true
WHERE  b.year_formed = 1981;

Getting the senior band member is solved in the LATERAL subquery m - without multiplying the cost for the base query. It works because the window function count(*) OVER () is computed before ORDER BY and LIMIT are applied. Since bands naturally only have few members, this should be the fastest possible way. See:

The other optimization for counting albums and songs builds on the assumption that the same id_song is never included in multiple albums of the same band. Else, those are counted multiple times. (Easily fixed, and uncorrelated to the task of getting the senior band member.)

The point is to eliminate the need for DISTINCT at the top level after multiplying rows at the N-side repeatedly (I like to call that "proxy cross join"). That would produce a possibly huge number of rows in the derived table without need.

Plus, it's much more convenient to retrieve additional column (like more columns for the senior band member) than with some other query styles.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228