EDIT: Edited to include input as table instead of image and expected output as a table.
I'm using SQL Server 2017. I have a dataset with fitness data, with each row corresponding to one set of a given exercise on a given date (so each exercise appears in multiple rows on the same date but also in different dates). Each row also contains the number of kilograms used in that particular set, as shown here:
Date | Exercise | Category | Weight_kg | Reps |
---|---|---|---|---|
2022-04-18 00:00:00.000 | Lat Pulldown | Back | 35 | 5 |
2022-02-12 00:00:00.000 | Lat Pulldown | Back | 30 | 5 |
2022-04-18 00:00:00.000 | Lat Pulldown | Back | 35 | 5 |
2022-04-18 00:00:00.000 | Lat Pulldown | Back | 35 | 5 |
2022-03-07 00:00:00.000 | Lat Pulldown | Back | 25 | 8 |
2022-02-12 00:00:00.000 | Upright Row | Shoulders | 15 | 6 |
2022-02-12 00:00:00.000 | Upright Row | Shoulders | 10 | 6 |
2022-03-07 00:00:00.000 | Upright Row | Shoulders | 10 | 6 |
2022-03-07 00:00:00.000 | Upright Row | Shoulders | 5 | 7 |
2022-04-18 00:00:00.000 | Upright Row | Shoulders | 17 | 7 |
2022-02-16 00:00:00.000 | Glute Bridge | Glutes | 50 | 20 |
2022-02-16 00:00:00.000 | Glute Bridge | Glutes | 65 | 10 |
2022-03-07 00:00:00.000 | Glute Bridge | Glutes | 75 | 6 |
2022-04-18 00:00:00.000 | Glute Bridge | Glutes | 75 | 6 |
2022-03-07 00:00:00.000 | Glute Bridge | Glutes | 75 | 6 |
2022-04-18 00:00:00.000 | Glute Bridge | Glutes | 75 | 6 |
2022-02-16 00:00:00.000 | Hip Thrust | Legs | 48 | 12 |
2022-02-16 00:00:00.000 | Hip Thrust | Legs | 68 | 8 |
2022-03-07 00:00:00.000 | Hip Thrust | Legs | 78 | 7 |
2022-04-18 00:00:00.000 | Hip Thrust | Legs | 68 | 8 |
2022-03-07 00:00:00.000 | Hip Thrust | Legs | 80 | 6 |
I calculated the maximum value of kg achieved for each exercise:
SELECT Exercise, MAX(Weight_kg) AS personal_record
FROM table..fitness
-- Filter out data from before 2022 and bodyweight/no weighted exercises:
WHERE Date >= '2022-01-01' AND Weight_kg > 1
GROUP BY Exercise
ORDER BY personal_record DESC
However, I want to know the exact Date in which that personal_record was achieved for the first time. When I run the following:
SELECT Date, Exercise, MAX(Weight_kg) AS personal_record
FROM table..fitness
-- Filter out data from before 2022 and bodyweight/no weighted exercises:
WHERE Date >= '2022-01-01' AND Weight_kg > 1
GROUP BY Exercise, Date
ORDER BY personal_record DESC
The output is a row for each Date in which each Exercise was performed, so it does not limit it to the first time. Below is the expected output:
Date | Exercise | personal_record |
---|---|---|
2022-04-18 00:00:00.000 | Lat Pulldown | 35 |
2022-04-18 00:00:00.000 | Upright Row | 17 |
2022-04-18 00:00:00.000 | Glute Bridge | 75 |
2022-03-07 00:00:00.000 | Hip Thrust | 80 |
I've tried several things:
- Partitioning by Exercise and sorting the partition from the earliest Date:
SELECT Date, Exercise,
MAX(Weight_kg) OVER (PARTITION BY Exercise ORDER BY Date) AS personal_record
FROM table..fitness
-- Filter out data from before 2022 and bodyweight/no weighted exercises:
WHERE Date >= '2022-01-01' AND Weight_kg > 1
GROUP BY Exercise, Date
ORDER BY personal_record DESC
This gives the following error: Column 'table..fitness.Weight_kg' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. Adding Weight_kg to GROUP BY generates the same output of a row for each Date of each Exercise, so again it does not limit it to the first Date of ocurrence.
- Calculating the minimum Date value:
SELECT MIN(Date) AS first_time,
Exercise,
MAX(Weight_kg) AS personal_record
FOM table..fitness
-- Filter out data from before 2022 and bodyweight/no weighted exercises:
WHERE Date >= '2022-01-01' AND Weight_kg > 1
GROUP BY Exercise
ORDER BY personal_record DESC
The output here is closer to what I need: only one row for each exercise with one date. However, the Date that appears does not correspond to the first date of MAX(Weight_kg), but to the first Date in which the Exercise was performed.
- ROW_NUMBER() as suggested in this question:
SELECT Date, Exercise, personal_record
FROM
(
SELECT *,
MAX(Weight_kg) AS personal_record,
ROW_NUMBER() OVER (PARTITION BY Exercise ORDER BY Date) AS rn
FROM table..fitness
-- Filter out data from before 2022 and bodyweight/no weighted exercises:
WHERE Date >= '2022-01-01' AND Weight_kg > 1) AS partitioned_PR
WHERE rn = 1
GROUP BY Date, Exercise
This gives the error: Column 'MyYear..fitness.Date' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. Which doesn't make sense to me, since I actually added it in the GROUP BY clause.
- Creating temporary/virtual tables and manipulate them further: I get stuck at the same stage, when I have to include Date in the GROUP BY clause in order to be able to include Date in SELECT.
Any ideas on how to solve this? Quite stuck at this point. Thanks