1

I have two Postgres tables: playlists and tracks.

I've created playlists_tracks which is the many_to_many relationship between playlists and tracks.

Tracks contain multiple columns, but the one I'm looking at is artist.

What is the fastest way to query a list of all playlists that contain at least one track with an artist name containing, say, "The Beatles"? I'm having a hard time wrapping my head around an efficient way to do this.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Max
  • 597
  • 7
  • 21

2 Answers2

1

The cheapest and simplest DISTINCT operation is .. not to multiply rows in the first place.
An EXISTS semi-join does that for your query:

SELECT *
FROM   playlists p
WHERE  EXISTS (
   SELECT -- can be empty for EXISTS
   FROM   playlists_tracks pt
   JOIN   tracks t USING (track_id)
   WHERE  pt.playlist_id = p.playlist_id
   AND    t.artist = 'The Beatles'
   );

Assuming a typical many-to-many implementation like detailed here:

(You could even nest EXISTS instead of the join in the subquery. Not sure if that helps more.)

If tables are big, indexes can help performance. Ideal for the particular query:

  • CREATE INDEX ON tracks (artist, track_id) -- columns in this order
  • CREATE INDEX ON playlists_tracks (track_id, playlist_id) -- columns in this order

Related:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
0

You must join the 3 tables and group by playlist like this:

select p.playlistid, p.name
from playlists p 
inner join playlists_tracks pt on pt.playlistid = p.playlistid
inner join (select * from tracks where artist = 'The Beatles') t on t.trackid = pt.trackid
group by p.playlistid, p.name

or using distinct:

select distinct p.*
from playlists p 
inner join playlists_tracks pt on pt.playlistid = p.playlistid
inner join tracks t on t.trackid = pt.trackid
where t.artist = 'The Beatles'
forpas
  • 160,666
  • 10
  • 38
  • 76