0

I am using the Sakila Database for practice.

I am calculating - Sales by Store. It provides a list of total sales, broken down by store. It incorporates data from the city, country, payment, rental, inventory, store, address, and staff tables

I have written a query, but it is throwing an error due to not using the non-aggregated column in the GROUP BY Clause. However, the View DDL statement also does something similar but does not give any error. If anyone can explain to me why is this happening - it would be very useful.

DDL Statement for the View (Which is working fine):

SELECT 
    CONCAT(`c`.`city`, _utf8mb4 ',', `cy`.`country`) AS `store`,
    CONCAT(`m`.`first_name`, _utf8mb4 ' ', `m`.`last_name`) AS `manager`,
    SUM(`p`.`amount`) AS `total_sales` 
FROM 
    (((((((`sakila`.`payment` `p` 
    INNER JOIN `sakila`.`rental` `r` ON (`p`.`rental_id` = `r`.`rental_id`)
    ) 
    INNER JOIN `sakila`.`inventory` `i` ON (`r`.`inventory_id` = `i`.`inventory_id`)
    ) 
    INNER JOIN `sakila`.`store` `s` ON (`i`.`store_id` = `s`.`store_id`)
    ) 
    INNER JOIN `sakila`.`address` `a` ON (`s`.`address_id` = `a`.`address_id`)
    ) 
    INNER JOIN `sakila`.`city` `c` ON (`a`.`city_id` = `c`.`city_id`)
    ) 
    INNER JOIN `sakila`.`country` `cy` ON (`c`.`country_id` = `cy`.`country_id`)
    ) 
    INNER JOIN `sakila`.`staff` `m` ON (`s`.`manager_staff_id` = `m`.`staff_id`)
)
GROUP BY `s`.`store_id`
ORDER BY `cy`.`country`, `c`.`city`;

My Query -

SELECT 
    cont.country, 
    SUM(pay.amount) as "Total Sales"
FROM payment as pay
INNER JOIN staff as staff ON pay.staff_id = staff.staff_id 
INNER JOIN store as store ON staff.store_id = store.store_id
INNER JOIN address as ad ON store.address_id = ad.address_id
INNER JOIN city as City ON ad.city_id = city.city_id
INNER JOIN country as cont ON city.country_id = cont.country_id
INNER JOIN rental as rent ON pay.rental_id = rent.rental_id
INNER JOIN inventory as inven ON rent.inventory_id = inven.inventory_id
GROUP BY inven.store_id;

The Output desired is revenue by country, but when I do that the output is coming wrong, we have to group by store_ID as in DDL Statement

Desired output

  • You should be grouping by pay.amount since your aggregate function is summing by pay.amount, see my modification on your code below. SELECT cont.country, SUM(pay.amount) as "Total Sales" FROM payment as pay ... your joins.. GROUP BY pay.amount – Andile Jul 25 '23 at 10:18

2 Answers2

0

In some DBs, there is rule in Group By clause which is, the columns added in GROUP BY clause should be included in SELECT clause. Try this

SELECT 
    inven.store_id,
    cont.country,
    SUM(pay.amount) as "Total Sales"
FROM payment as pay
INNER JOIN staff as staff ON pay.staff_id = staff.staff_id 
INNER JOIN store as store ON staff.store_id = store.store_id
INNER JOIN address as ad ON store.address_id = ad.address_id
INNER JOIN city as City ON ad.city_id = city.city_id
INNER JOIN country as cont ON city.country_id = cont.country_id
INNER JOIN rental as rent ON pay.rental_id = rent.rental_id
INNER JOIN inventory as inven ON rent.inventory_id = inven.inventory_id
GROUP BY inven.store_id, cont.country;
Kirs Sudh
  • 373
  • 2
  • 15
0

There is an easy rule of thumb with GROUP BY:

Always include all non-aggregated columns from SELECT into the GROUP BY.

Your query has the following SELECT statement:

SELECT cont.country, SUM(pay.amount) as "Total Sales"

This indicates that you want to have the sum of pay.amount for each cont.country. The pay.amount is inside an aggregate function, while cont.country is not. Therefore, the correct GROUP BY clause should be:

GROUP BY cont.country

MySQL also allows you to leave out columns from GROUP BY if they are functionally dependent on other columns in the GROUP BY clause. For example, if you GROUP BY using the country id, you do not need to include the country name even if it is in the SELECT statement.

Different MySQL versions may have slightly different behavior based on the ONLY_FULL_GROUP_BY SQL mode.

For more information, refer to the MySQL documentation.

slaakso
  • 8,331
  • 2
  • 16
  • 27
  • Can you please explain to me why the DDL code which I have shared above is working fine even though it is not following the thumb rule of the GROUP BY clause? – Apoorvaa Singh Jul 25 '23 at 14:08
  • It works because the `city`, `country` and `staff` used in the `SELECT` are functionally dependent on the `store_id` (`store`->`address`->`city`->`country` and `store`->`staff`). If you do not want to repeat the lengthy `CONCAT(...` in the `GROUP BY`, MySQL also allows grouping by column name (`GROUP BY store, manager` or by column position (`GROUP BY 1, 2`), – slaakso Jul 26 '23 at 17:47