Below is my attempt to see pitchers in the top 98th percentile of innings pitched. I am trying to filter the NTILE() window function but it's throwing an error.
query = """
WITH pitching_cte AS
(
SELECT
player,
player_id,
season,
CAST(strftime('%Y', birth_date) AS TEXT) AS birth_year,
COUNT(*) AS num_seasons,
printf("%i.%i", outs/3, outs % 3) AS innings_pitched
FROM pitching
INNER JOIN player
USING(player_id)
GROUP BY player
)
SELECT innings_pitched,
birth_year,
player,
NTILE(100) OVER(PARTITION BY innings_pitched) pctl,
num_seasons
FROM pitching_cte
WHERE season = 2021 AND pctl >= 98;
"""
df = pd.read_sql(query, cnxn)
print(df.head(10))
When I execute this query I get the following error:
DatabaseError: Execution failed on sql '
WITH pitching_cte AS
(
SELECT
player,
player_id,
season,
CAST(strftime('%Y', birth_date) AS TEXT) AS birth_year,
COUNT(*) AS num_seasons,
printf("%i.%i", outs/3, outs % 3) AS innings_pitched
FROM pitching
INNER JOIN player
USING(player_id)
GROUP BY player)
SELECT innings_pitched,
birth_year,
player,
NTILE(100) OVER(PARTITION BY innings_pitched) pctl,
num_seasons
FROM pitching_cte
WHERE season = 2021 AND pctl >= 98;
': misuse of aliased window function pctl
How do I filter the resulting percentile column?
Thanks