-2

I have many tables to join, some of them have joining columns as array,

When I tried with two tables it worked fine

select 
        tr.report_id,
        tr.sponsor_ids,
        array_agg(tsp.sponsor)
    --from public.tbl_report_page trp
    from  public.tbl_report tr --on trp.report_id = tr.report_id

    left join public.tbl_sponsor tsp on tsp.sponsor_id = any(tr.sponsor_ids)

    --left join public.tbl_tag tg on tg.tag_id = any(tr.tags)

    group by tr.report_id,tr.sponsor_ids--,trp.page_id
    having tr.report_id = 109

This works fine,

109 {1,2,3,4,5,6} "{sponsor_1,sponsor_2,sponsor_3,sponsor_4,sponsor_5,sponsor_6}"

but when I uncomment

left join public.tbl_tag tg on tg.tag_id = any(tr.tags)

getting lots of values in sponsors

109 {1,2,3,4,5,6}   "{sponsor_1,sponsor_1,sponsor_1,sponsor_2,sponsor_2,sponsor_2,sponsor_3,sponsor_3,sponsor_3,sponsor_4,sponsor_4,sponsor_4,sponsor_5,sponsor_5,sponsor_5,sponsor_6,sponsor_6,sponsor_6}"

I have to add more tables which contains ids as array.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Md. Parvez Alam
  • 4,326
  • 5
  • 48
  • 108
  • 2
    You could solve this, and many other future issues, by changing your schema and not using arrays for this. Stick to 3NF and you wouldn't have this issue. One record, one column, one value. – Frank Heikens Jul 13 '23 at 09:46
  • I have to stick with array columns, it's as per requirement – Md. Parvez Alam Jul 13 '23 at 09:47
  • Are you looking for distinct for array_agg like [here](https://stackoverflow.com/questions/26363742/how-to-remove-duplicates-which-are-generated-with-array-agg-postgres-function)? – Umut TEKİN Jul 13 '23 at 11:54

1 Answers1

1

For starters, it's typically superior to replace those arrays with normalized many-to-many relationships in your relational design. See:

While stuck with your current design, consider how joins can multiply rows:

Your query probably works like this:

SELECT tr.report_id
     , tr.sponsor_ids
     , sp.sponsors
     , sp.tags
FROM   public.tbl_report tr
LEFT   JOIN LATERAL (
   SELECT array_agg(DISTINCT tsp.sponsor) AS sponsors
        , array_agg(DISTINCT tg.tag) AS tags
   FROM   public.tbl_sponsor tsp
   LEFT   JOIN public.tbl_tag tg ON tg.tag_id = ANY (tr.tags)
   WHERE  tsp.sponsor_id = ANY (tr.sponsor_ids)   
   ) sp ON true
WHERE  tr.report_id = 109;

Depending on your actual objective, more optimization may be possible.

This does not preserve original order of array items. See:

Related:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228