-2

I want to find one city for each state that has the maximum average population? I have three table state, city and population. The table provided are simplified, to only have 2 states

"state" Code is our key that is unquie for each state

name code
Ohio OH
Wisconsin WI

"city"

two state codes to refer to border between two states

code name
OH Cevland
OH Dayton
OH Toledo
WI Madison
WI Racine

"citypop"

code name Year pop
OH Cevland 1998 10000
OH Cevland 2000 1000
OH Dayton 1998 6000
OH Toledo 1978 8000
WI Madison 1999 2000
WI Madison 2000 20000
WI Racine 2000 5000

Expected result : Cevland not choose because avgpop, madison and toledo are selected

city avgpop
Toledo 8000
Madison 11000

The query i have made so far.

Select c.name, avg(cp.length)
from city c
Inner Join citypop cp
On c.name = cp.city
Group by c.name

My thinking is I want to select the name and avg, but not sure how to get the next step of only one city for each country.

'Edit' The reason its madison is because we add 2000, 20000 and then devide to get the average. so madision avg pop is 11000 and racine(having only one data point value) is 5000. We want the max average so we select madison.

MT0
  • 143,790
  • 11
  • 59
  • 117
Vin rich
  • 29
  • 5
  • city pop has no column length, i still don't get why madison – nbk Nov 30 '22 at 18:42
  • @nbk, I edited giving reasoning as to why madison is selected. citypop column length can vary, as it depends on its year. some city have more recorded data then others – Vin rich Nov 30 '22 at 18:53

2 Answers2

1

Maybe this could do the job:

--  S a m p l e    D a t a
WITH
    states AS
        (
            Select 'Ohio'       "STATE",    'OH' "CODE" From Dual Union All
            Select 'Wisconsin'  "STATE",    'WI' "CODE" From Dual 
        ),
    cities AS
        (
            Select 'OH' "CODE", 'Clevland' "CITY" From Dual Union All
            Select 'OH' "CODE", 'Dayton'   "CITY" From Dual Union All
            Select 'OH' "CODE", 'Toledo'   "CITY" From Dual Union All
            Select 'WI' "CODE", 'Madison'  "CITY" From Dual Union All
            Select 'WI' "CODE", 'Racine'   "CITY" From Dual 
        ),
    citypops AS
        (
            Select 'OH' "CODE", 'Clevland' "CITY", 1998 "YR", 10000 "POP" From Dual Union All 
            Select 'OH' "CODE", 'Clevland' "CITY", 2000 "YR",  1000 "POP" From Dual Union All
            Select 'OH' "CODE", 'Dayton'   "CITY", 1998 "YR",  6000 "POP" From Dual Union All
            Select 'OH' "CODE", 'Toledo'   "CITY", 1978 "YR",  8000 "POP" From Dual Union All
            Select 'WI' "CODE", 'Madison'  "CITY", 1999 "YR",  2000 "POP" From Dual Union All
            Select 'WI' "CODE", 'Madison'  "CITY", 2000 "YR", 20000 "POP" From Dual Union All
            Select 'WI' "CODE", 'Racine'   "CITY", 2000 "YR",  5000 "POP" From Dual
        ),

Create CTE with averages...

    avgs AS
        (
            Select
                s.CODE "CODE",
                c.CITY "CITY",
                AVG(p.POP) "AVG_POP"
            From
                states s
            Inner Join
                cities c ON(c.CODE = s.CODE)
            Inner Join
                citypops p ON(p.CODE = c.CODE And p.CITY = c.CITY)
            Group By
                s.CODE, c.CITY        
        )

... use avgs CTE both to select the data and to do the filtering using inner join

Select
    a.CITY,
    a.AVG_POP "MAX_AVG_POP"
From
    avgs a
Inner Join
    (
        Select
          CODE,
          CITY,
          MAX(AVG_POP) OVER(PARTITION BY CODE) "MAX_AVG_POP" 
        FROM
          avgs
    ) m ON(a.CODE = m.CODE And m.CITY = a.CITY And a.AVG_POP = m.MAX_AVG_POP)

Result:

CITY MAX_AVG_POP
Toledo 8000
Madison 11000

NOTE:
If there are two cities with the same (max) average of pops within the same state - both will be selected...

d r
  • 3,848
  • 2
  • 4
  • 15
1

With a CTE and a simple Grouping function with ROW_NUMBER, you can achieve

this will find only one City for every state, as you wanted, if there are two with the same average, it will choose by random or you add a second order by which selects the wanted city

WITH CITY_POP AS (
SELECT
 "code", "name", AVG("pop") avgpop
  ,ROW_NUMBER() OVER(PARTITION BY "code" ORDER BY  AVG("pop") DESC) rn
FROM citypop
GROUP BY "code", "name")
SELECT "name", avgpop FROM CITY_POP WHERE rn = 1
name AVGPOP
Toledo 8000
Madison 11000

fiddle

nbk
  • 45,398
  • 8
  • 30
  • 47
  • If there are two cities with same (max) average population in the same state this will select just one. Tiny possibility, but still a possibility... Try to put 8000 population to Dayton... – d r Nov 30 '22 at 19:55
  • i quote *I want to find one city for each state* – nbk Nov 30 '22 at 21:00
  • That is more than clear. It is just about (in case of same result) which one Dayton or Toledo or just randomly. I would just like to know if something like that (almost impossible) happen. – d r Nov 30 '22 at 21:37
  • i clerified, what has to be done when it matters, which of the cities has to be chosen, that is a question only the user can answer – nbk Nov 30 '22 at 21:40
  • Right again - only, he will never know that there has something like that happened - he will get just one (doesn't matter which one). Forget it, it is not important and, probably, quite impossible... – d r Nov 30 '22 at 21:46