2

I am struggling to write a query that returns the SUM of values in one field. I need to SUM it just once if there are multiple rows with the same value in another 2 fields and to SUM all values if there are

I have a table with 3 fields:

  1. UTM Source
  2. Company ID
  3. Active users

If the UTM Source and Company ID is the SAME I need to count distinct values of Active users If the UTM Source and Company ID is DIFFERENT I need to count all values of Active users

Here is the sample table: https://docs.google.com/spreadsheets/d/1jytu0_YduNlFXAgs_kVZ8M3UbPKnKVQWgXRE6r10p6I/edit?usp=sharing

I managed to write this query:

SELECT  DISTINCT `UTM Source`, `Company ID`, SUM(DISTINCT `Active users`) as Users
FROM customers
GROUP BY `UTM Source`, `Company ID`) AS customers_2`

I get almost what I need since it returns just one row for UTM Source email with value 9 in the field of Active users. Now I need just to summarize all the same UTM Source so I tried this query:

SELECT `UTM Source`, `Users`
FROM (
     SELECT  DISTINCT `UTM Source`, `Company ID`, SUM(DISTINCT `Active users`) as Users
     FROM customers
     GROUP BY `UTM Source`, `Company ID`) AS customers_2
GROUP BY `UTM Source`

But I am receiving some kind of error:

Expression #2 of the SELECT list is not in the GROUP BY clause and contains nonaggregated column 'customers_2.Users' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

I've checked that thread, but I do not understand it well since I am new to databases. I am using MAMP, MySQL 5.27.6 on MAC with Sequel Pro and this is just a testing database I created from the XLS table I received to solve this task.

Thanks for any advice.

Ahmad
  • 1,618
  • 5
  • 24
  • 46

2 Answers2

1

Group by is for agggreated result so in your case or you just need the subquery result then you don't nedd group by

SELECT `UTM Source`, `Users`
FROM (
     SELECT  DISTINCT `UTM Source`, `Company ID`, SUM(DISTINCT `Active users`) as Users
     FROM customers
     GROUP BY `UTM Source`, `Company ID`)  customers_2

or you should add an aggregation function eg:

SELECT `UTM Source`, sum(`Users`)
FROM (
     SELECT  DISTINCT `UTM Source`, `Company ID`, SUM(DISTINCT `Active users`) as Users
     FROM customers
     GROUP BY `UTM Source`, `Company ID`)  customers_2
GROUP BY `UTM Source`

or ypu must mention in group y the column not involved by aggregation function

SELECT `UTM Source`, `Users` , count(*)
FROM (
     SELECT  DISTINCT `UTM Source`, `Company ID`, SUM(DISTINCT `Active users`) as Users
     FROM customers
     GROUP BY `UTM Source`, `Company ID`)  customers_2
GROUP BY `UTM Source`,  `Users`
ScaisEdge
  • 131,976
  • 10
  • 91
  • 107
0

Without seeing what results you actually want to produce from that data, your question is currently rather ambiguous.

My personal interpretation is that you want the sum of the active users column per UTM Source. Except, where multiple rows exactly match they should only be included once.

In reverse, that's two steps...

  • De-duplicate the data
  • Aggregate the data

Which gives me the following...

SELECT
  `UTM Source`,
  SUM(`Active Users`)   AS Users
FROM
(
  SELECT DISTINCT
    `UTM Source`,
    `Company ID`,
    `Active Users`
  FROM
    customers
)
  deduplicated_data
GROUP BY
  `UTM Source`

EDIT:

I do wonder if the problem statement is missing some information though? Why would the same UTM Source have the same Company ID with multiple different numbers of Active Users?

  • Is there some other column(s) not being mentioned?

For example, perhaps there's a date field? In which case you possibly actually need to find the record with the most recent date?

More information would be helpful (edit your question), about the reason you need to do this / what you are trying to achieve, and the exact results you want to generate from your given example data.

(Also, please don't link to Google Docs, include all the data as text in your question, Stack Overflow has suitable text formatting in the editor and means that question won't 'break' if you later delete that sheet.)

MatBailie
  • 83,401
  • 18
  • 103
  • 137
  • sorry, this is my first question and I was struggling to format the example data so I link it to the spreadsheet...I have to learn it... ...I edited the desired result ...yes there are some other columns in the database, the unique is for example USER ID...my goal is to see which source generated the most of the active users – Michal Stin Jul 18 '20 at 08:43