0

I've just installed MySQL, I'm interested in learning SQL. One of the default database of MySQL is "World". This database has 5 columns: ID, name (city name), CountryCode, District and Population. I tried a simple query that should have showed the ID of the city with the lowest population per country.

USE world;

SELECT ID, MIN(POPULATION), COUNTRYCODE
FROM CITY
GROUP BY COUNTRYCODE
order by ID

The problem is that the result of the query shows the first ID of the country, and its not related to the city's ID with the lowest population of the country.

Exemple: AFG countrycode has 4 cities:

+----+----------------+-------------+------------+------------+
| ID |      CITY      | COUNTRYCODE | CITY_AGAIN | POPULATION |
+----+----------------+-------------+------------+------------+
|  1 | Kabul          | AFG         | Kabol      |    1780000 |
|  2 | Qandahar       | AFG         | Qandahar   |     237500 |
|  3 | Herat          | AFG         | Herat      |     186800 |
|  4 | Mazar-e-Sharif | AFG         | Balkh      |     127800 |
+----+----------------+-------------+------------+------------+

The result of the query for AFG country is

+----+------------+-------------+
| ID | POPULATION | COUNTRYCODE |
+----+------------+-------------+
|  1 |     127800 | AFG         |
+----+------------+-------------+

The ID was supposed to be 4, not 1.

So, why the result of the query is not correct on ID column?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
lipao255
  • 51
  • 1
  • 1
  • 4

2 Answers2

0

You're grouping by COUNTRYCODE, so all 5 rows are grouped into one.

You've used a grouping function for the POPULATION, so that's correct.

The COUNTRYCODE is always the same, so it is always correct.

However, the query has no instructions on which ID to pick. You haven't specified a grouping function.

What to do?

I don't understand your reason to want to order the results by ID. It would make more sense to sort by MIN(POPULATION) or COUNTRYCODE. For instance:

SELECT ID, 
       MIN(POPULATION) AS SmallestCityPopulation, 
       COUNTRYCODE
FROM CITY
GROUP BY COUNTRYCODE
order by SmallestCityPopulation ASC

This would make sense.

KIKO Software
  • 15,283
  • 3
  • 18
  • 33
0

What your query does is:

  1. You select all rows from the city table.
  2. You aggregate those rows to get one result per country.
  3. You show the ID. The ID??? There are many cities in a country, hence many IDs. Which do you want to show? The minimum? The maximum? You don't tell the DBMS which. This should result in an error (and it would had you set sql mode to ONLY_FULL_GROUP_BY). As is, MySQL silently applies ANY_VALUE(id), i.e. it picks an ID arbitrarily which is not what you want.

You cannot do what you want in one step. You need two steps:

  1. Find the minimum poplulation per country.
  2. Find the cities matching that population.

One (simple) solution:

SELECT *
FROM city
WHERE (countrycode, population) IN
(
  SELECT countrycode, MIN(population)
  FROM city
  GROUP BY countrycode
);

Another solution follows a different path: Select all rows for which not exists a city with a lower population for the country:

SELECT *
FROM city
WHERE NOT EXISTS
(
  SELECT NULL
  FROM city smaller_city
  WHERE smaller_city.countrycode = city.countrycode
  AND smaller_city.population < city.population
);
Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73
  • Thank you. But what if 2 cities has equal population in the same countrycode (Yes, I knows this is almost impossible, but we never know) ? Is there any way my result show both cities? – lipao255 Mar 16 '21 at 23:35
  • Both my queries do exactly that. Study the queries and you'll see it :-) – Thorsten Kettner Mar 16 '21 at 23:38