Demo Unclear Expected Result
I think this is close.
- note baseData is a Common table expression just mocking up sample data
- it can be omitted and the next lines adjusted to reference your tablename.
- I'm not exactly sure what output you're after
- how to handle the conditions such as my "2" which when combined with another runner will cause multiple users to be "valid"
- If it's Runner 1+ Runner 2 value > 0 or Runner 1 > 0 or Runner 2 > 0...
- if the former need more test data staged to sample edge cases.
Cases I'm not sure how to handle
- If a runner by themselves fulfills the requirement their existence means all combinations of "2" runners involving that runner would also meet the requirement
- if two runners together meet the requirement of value > 0 for all years... then the inverse pair would as well (1,2), (2,1) I think I can eliminate this by ensuring matches only occur when Runner1<=Runner2 (Commented out below)
.
with BaseData as (
SELECT 1 Runner, 2015 YR, -8.29 value UNION ALL
SELECT 1 Runner, 2016 YR, -8.29 value UNION ALL
SELECT 1 Runner, 2017 YR, -8.29 value UNION ALL
SELECT 1 Runner, 2018 YR, -8.29 value UNION ALL
SELECT 1 Runner, 2019 YR, -8.29 value UNION ALL
SELECT 2 Runner, 2015 YR, 1 value UNION ALL
SELECT 2 Runner, 2016 YR, 1 value UNION ALL
SELECT 2 Runner, 2017 YR, 1 value UNION ALL
SELECT 2 Runner, 2018 YR, 1 value UNION ALL
SELECT 2 Runner, 2019 YR, 1 value UNION ALL
SELECT 3 Runner, 2015 YR, 1 value UNION ALL
SELECT 3 Runner, 2016 YR, -1 value UNION ALL
SELECT 3 Runner, 2017 YR, -1 value UNION ALL
SELECT 3 Runner, 2018 YR, -1 value UNION ALL
SELECT 3 Runner, 2019 YR, -1 value UNION ALL
SELECT 4 Runner, 2015 YR, -1 value UNION ALL
SELECT 4 Runner, 2016 YR, 1 value UNION ALL
SELECT 4 Runner, 2017 YR, 1 value UNION ALL
SELECT 4 Runner, 2018 YR, 1 value UNION ALL
SELECT 4 Runner, 2019 YR, 1 value),
CTE as (
SELECT A.Runner RUN1, B.Runner RUN2,
sum(case when A.value>0 and A.Runner = B.Runner then 1 else 0 end) as isValidYear,
count(A.YR) TotalYears,
sum(case when A.Runner=B.Runner then A.value else 0 end) as TotalValue,
/*not sure if this should be (A.Value > 0 or B.Value > 0) OR (A.Value+B.Value>0)*/
sum(case when A.runner<> B.Runner and (A.Value>0 OR B.Value > 0) then 1 end) as isValid2PersonYR
FROM BaseData A
CROSS JOIN BaseData B
WHERE A.YR=B.YR
/* AND A.Runner<=B.Runner*/
GROUP BY A.Runner, B.Runner
Order by A.Runner)
SELECT RUN1, Run2, sum(isValidYear) SumIsvalidYear, Sum(isValid2PersonYR) SumIsvalid2PersonYr, TotalYears
FROM CTE
GROUP BY Run1, Run2, TotalYears
HAVING SumisValidYear = Totalyears OR SumisValid2personYR=TotalYears
ORDER BY SumIsValid2PersonYr, Run1
Giving us:
Note the 1st line the runner 1 and 2 is the same, there is no 2nd person, so they did it on their own.
+------+------+----------------+---------------------+------------+
| RUN1 | Run2 | SumIsvalidYear | SumIsvalid2PersonYr | TotalYears |
+------+------+----------------+---------------------+------------+
| 2 | 2 | 5 | | 5 |
| 1 | 2 | 0 | 5 | 5 |
| 2 | 1 | 0 | 5 | 5 |
| 2 | 3 | 0 | 5 | 5 |
| 2 | 4 | 0 | 5 | 5 |
| 3 | 2 | 0 | 5 | 5 |
| 3 | 4 | 0 | 5 | 5 |
| 4 | 2 | 0 | 5 | 5 |
| 4 | 3 | 0 | 5 | 5 |
+------+------+----------------+---------------------+------------+
Yes the combination 4,2 and 2,4 are the same, and we could eliminate them as needed: I think this gets you close but I need to understand how to handle situation for runner 2 or if we have a situation where multiple runners would be returned as they "complete" a runner's series, how do you want those?
We could use this result set to exclude all occurrences of a 2 except that of RUN1: 2 matching Run2:2 which would give us just 3,4 and 4,3 which do infect complete each other.
You can see this is "close" but not understanding how you want to handle these situations I stopped here.
If we include the commented out line in the SQL we get:
+------+------+----------------+---------------------+------------+
| RUN1 | Run2 | SumIsvalidYear | SumIsvalid2PersonYr | TotalYears |
+------+------+----------------+---------------------+------------+
| 2 | 2 | 5 | | 5 |
| 1 | 2 | 0 | 5 | 5 |
| 2 | 3 | 0 | 5 | 5 |
| 2 | 4 | 0 | 5 | 5 |
| 3 | 4 | 0 | 5 | 5 |
+------+------+----------------+---------------------+------------+
and these results make sense to me. the questions asked at the top of this response apply.
- should (1,2),(2,3),(2,4) be in the results? (they only exist because (2) is complete by itself)
- should results just be (2,2) since we had a runner meet the condition?
- should results be (2,2) and (3,4) since you want to see all that met the conditions?