1

I have 2 tables, df1, and df2. I would like to join the two tables and then perform a GROUP BY aggregation. I keep getting an error that says the columns are not contained in the aggregate function.

DATA:

df1

host    purpose verge   name    date
123     sub     hello   hello1  1/1/2021
123     sub     peace   peace1  1/1/2021
123     sub     love    love1   1/1/2021
123     sub     hey     hey1    1/1/2021
455     dem     no      no1     1/1/2021
455     dem     corp    corp1   1/1/2021

df2

pc  cap free
123 10  2
123 20  1
123 10  2
123 5   1
455 5   1
455 4   3

DESIRED

host    date        pc  cap free purpose 
123     1/1/2021    123 45  6    sub     
455     1/1/2021    455 9   4    dem

DOING

SELECT df1.host, 
       df1.date, 
       df2.sum(cap),
       df2.sum(free), 
       df1.purpose,
       df2.pc
FROM   df1
       JOIN df2 ON
       df1.host = df2.pc
       GROUP BY df1.purpose

I get this error:

column is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause

However, I have aggregated the two columns cap and free. I am still researching this. Any suggestion is appreciated

Dale K
  • 25,246
  • 15
  • 42
  • 71
Lynn
  • 4,292
  • 5
  • 21
  • 44
  • 2
    In SQL Server, every column you select must either be included in the group by **OR** aggregated. You are aggregating 2 columns, grouping by 1 column, which leaves 3 columns which don't meet this criteria. Potentially just add those 3 columns to your group by. – Dale K Jan 19 '21 at 20:09
  • Ok I see. I will try this now. Thank you – Lynn Jan 19 '21 at 20:13
  • Does this answer your question? [GROUP BY / aggregate function confusion in SQL](https://stackoverflow.com/questions/4611897/group-by-aggregate-function-confusion-in-sql) – Dale K Jan 19 '21 at 20:39

1 Answers1

2

You could try this instead if you prefer to see all the original data, but you do need to group the non-aggregated columns otherwise:

SELECT df1.host, 
       df1.date, 
       Sum(df2.cap) over (partition by df1.purpose),
       Sum(df2.free) over (partition by df1.purpose),
       df1.purpose,
       df2.pc
FROM   df1
JOIN df2 ON
       df1.host = df2.pc

https://learn.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql?view=sql-server-ver15

Dale K
  • 25,246
  • 15
  • 42
  • 71
Johnny Fitz
  • 532
  • 2
  • 8