0

Table and columns of note:

pictures, tags, picture_tags

pictures.id
pictures.created_date

tags.id

picture_tags.tag_id
picture_tags.picture_id

I have the following query using a join table:

SELECT pictures.*, pictures.id as picture_id, tags.id as tag_id
FROM picture_tags
LEFT JOIN pictures ON pictures.id = picture_tags.picture_id
LEFT JOIN tags ON tags.id = picture_tags.tag_id
WHERE picture_tags.tag_id IN (1, 2)
GROUP BY pictures.id, tags.id
ORDER BY pictures.created_date ASC;

Since a picture can have multiple tags, this can return the same picture.id multiple times. Is there a way to prevent this so that picture.ids only show up once?

It is currently returning like this:

id | created_date | picture_id | tag_id
1 | 2022-12-08 19:04:23 | 1 | 1
1 | 2022-12-08 19:04:23 | 1 | 2
2 | 2022-12-09 00:46:30 | 2 | 3

My ideal return would be something like:

picture.created_date | picture.id | tagIds
2022-12-08 19:04:23 | 1 | [ 1, 2 ]
2022-12-09 00:46:30 | 2 | [3]
user2465134
  • 8,793
  • 5
  • 32
  • 46
  • 1
    With `DISTINCT`, the query will not return duplicate result rows (unless you have data corruption perhaps). Please explain your problem with some sample data. – Laurenz Albe Dec 14 '22 at 14:48
  • 1
    What output do you expect when a picture has more than one tag? – clamp Dec 14 '22 at 14:51
  • @clamp ahh I see now. Is there a way that if a tag record returns a picture.id, that no other tag returns that picture.id? Else do I have to sort that out after the query is returned? – user2465134 Dec 14 '22 at 14:56
  • Even with DISTINCT this query can still return the same picture multiple times, because the `tag_id` is still different. So the real question is, if a picture has both tag 1 and tag 2, which do you want to see in the results? Also, it seems like `LEFT JOIN` is wrong here, as the first join is meaningless if there is no match (all NULL fields) and the WHERE clause effectively turns the second JOIN into an INNER JOIN anyway. – Joel Coehoorn Dec 14 '22 at 15:07
  • @clamp Laurenz I updated the question. Hopefully I am more clear – user2465134 Dec 14 '22 at 15:08
  • @JoelCoehoorn sorry, I misunderstood the data (I am new at SQL) and clamp cleared it up for me. I just edited the question for clarity. I let me know if I still need to be more specific. – user2465134 Dec 14 '22 at 15:09
  • 1
    I think you just want to string_Agg the tags... https://stackoverflow.com/questions/15847173/concatenate-multiple-result-rows-of-one-column-into-one-group-by-another-column but combining rows like that generally isn't a good idea. – xQbert Dec 14 '22 at 15:11
  • 2
    combining fields into a single row like that is antithetical to the relational set theory on which SQL databases rely. It's possible, but it's not usually good practice. Instead, you typically want to merge the results in the client code or reporting tool. – Joel Coehoorn Dec 14 '22 at 15:11
  • Ok I will do this after my query is returned from the DB. Thank you Joel and xQbert – user2465134 Dec 14 '22 at 15:15

1 Answers1

1

As I said in a comment, you want to think carefully before combining rows like this. But if you really want to, you can do this:

SELECT p.id, p.created_date, string_agg(pt.tag_id, ',') as tag_ids
FROM pictures p
INNER JOIN picture_tags pt on pt.picture_id = p.id
WHERE pg.tag_id IN (1,2)
GROUP BY p.id, p.created_date
ORDER BY p.created_date

Note I converted the LEFT JOIN to INNER JOIN. In the original query, the first join made no sense as a LEFT JOIN (all the fields would be NULL) and the second join was effectively an INNER JOIN because of the WHERE clause. Additionally, since we didn't actually use any fields from the tag table I was able to remove that join completely.

Joel Coehoorn
  • 399,467
  • 113
  • 570
  • 794
  • Thank you for this. I will leave out the string_agg part for the reason you and others stated in the replies. I will also update the INNER JOIN as stated – user2465134 Dec 14 '22 at 15:19