0

I am trying to inner join two tables. First table can have two records so performing group by on that. Error:

column "b.db" must appear in the GROUP BY clause or be used in an aggregate function

select distinct a.schemaname,a.tablename,max(a.tbl_cmmt_txt) as 
tbl_cmmt_txt,max(a.cleansed_tbl_cmmt_txt) as cleansed_tbl_cmmt_txt,max(a.tbl_dscr_txt) 
as tbl_dscr_txt,max(a.cleansed_tbl_dscr_txt)as cleansed_tbl_dscr_txt,max(a.tbl_annt_txt) 
as tbl_annt_txt,max(a.cleansed_tbl_annt_txt) as cleansed_tbl_annt_txt,max(a.dw_etl_sess_nm),max(a.dw_ld_grp_val),max(a.dw_ins_dts),max(a.dw_upd_dts)
,b.db,b.schemaname,b.tablename,b.tabledescription,b.table_id,b.ts_updated,b.user_name
 from test.sample a INNER JOIN  test.sample1 b
 ON
 lower(a.schemaname)=lower(b.schemaname)
 AND lower(a.tablename)=lower(b.tablename)
 where a.schemaname is not null
 group by a.schemaname,a.tablename
user4157124
  • 2,809
  • 13
  • 27
  • 42
Sak
  • 61
  • 6
  • 1
    You typically `GROUP BY` the same columns as you `SELECT`, _except those who are arguments to aggregate functions_. – jarlh Jun 20 '23 at 12:25
  • Try to GROUP BY/aggregate in a subqeury - before joining b. – jarlh Jun 20 '23 at 12:27

1 Answers1

1

Aggregate first, join later:

SELECT a.*
     , b.db, b.schemaname, b.tablename, b.tabledescription, b.table_id, b.ts_updated, b.user_name
FROM  (
   SELECT schemaname, tablename
        , max(tbl_cmmt_txt) AS tbl_cmmt_txt, max(cleansed_tbl_cmmt_txt) AS cleansed_tbl_cmmt_txt, ...  -- more
   FROM   test.sample a
   WHERE  schemaname IS NOT NULL
   GROUP  BY schemaname, tablename
   ) a
JOIN   test.sample1 b ON lower(a.schemaname) = lower(b.schemaname)
                     AND lower(a.tablename) = lower(b.tablename);

See:

I removed DISTINCT on a hunch. It's unclear if table sample1 can hold duplicate matches and what you want to pick in this case.

You might want LEFT JOIN instead, depending on your undisclosed objective.

Also, lower(tablename) and lower(schemaname) are suspicious. Postgres identifiers are case-sensitive when double-quoted. See:

And it reeks that you aggregate on schemaname ... but join on lower(schemaname) .... Typically, you'd want the same for both.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228