0

I am trying to find data related to the maximum data for a distinct combination of data, not the max date for all the data. I searched for similar problems but it seems my problem is different. I checked

How can I SELECT rows with MAX(Column value), DISTINCT by MULTIPLE columns in SQL Get Max Date - For Every Transaction

Here is my code that returns all data:

select top.name as [familyName], s.sportName, p.position, bottom.firstName as [firstName], a.awardAmount, a.awardDate
from familyRelation r
inner join family bottom on r.firstNameID = bottom.NameID  
inner join family top on r.familyID = top.NameID
inner join familyPosition p on p.NameID = bottom.NameID
inner join sport s on s.sportID = p.sportID 
left join awards a on a.awardID = p.familyPositionID 


familyName     sport     position   firstName   award  awardDate

Smith         tennis     umpire      mary       null     1/1/2011       
Smith         tennis     umpire      mary       100     10/20/2011       
Smith         swim       diver       mary       null     1/1/2019       
Smith         diving     diver       susan      300     1/1/2011       
Jones         rugby      player      henry      100     1/1/2016       
Jones         rugby      player      henry      150     1/1/2011       
Jones         rugby      forward     henry      190     1/1/2008       
Jones         rugby      forward     henry      100     1/1/2011
Adams         hockey     goalkeeper  grant      null     1/1/2011       
Adams         hockey     goalkeeper  grant      null     12/12/2018       
Adams         hockey     goalkeeper  grant      null     1/1/2011       
Adams         hockey     goalkeeper  grant      5750     1/1/2011

but I need to return the data related to the maximum date for the (family, sport, position, firstName) combination as shown below:

familyName    sport     position   firstName   award   awardDate
Smith         tennis     umpire      mary       100     10/20/2011       
Smith         swim       diver       mary       150     1/1/2019       
Smith         diving     diver       susan      300     1/1/2011       
Jones         rugby      player      henry      190     1/1/2016       
Jones         rugby      forward     henry      100     1/1/2011
Adams         hockey     goalkeeper  grant      null    12/12/2018 

These are the relations between the family and firstname tables familyRelation table

familyId  firstNameID
100          10
100          20
200          30
300          40

family table

NameID     name
10         mary
20         susan 
30         henry
40         grant
100        smith
200        jones
300        adams

I tried this but it uses the same max date for all rows which is not what I want because different combinations of (family, sport, position, firstName) have different max dates:

select top.name as [familyName], s.sportName, p.position, bottom.firstName as [firstName], a.awardAmount, a.awardDate
from familyRelation r
inner join family bottom on r.firstNameID = bottom.NameID  
inner join family top on r.familyID = top.NameID
inner join familyPosition p on p.NameID = bottom.NameID
inner join sport s on s.sportID = p.sportID 
left join ( select awardAmount, max(awardDate) 
            from awards
            group by awardAmount) a
            on a.aID = p.p.familyPositionID

Thank you for all your help. I can't seem to figure this out.

update: I tried to use row_number() but the code is still not returning the results I need. I need to include a.awardAmount but once I add it, there is no maximum date because award amounts are distinct

select top.name as [familyName], s.sportName, p.position, bottom.firstName as [firstName], a.awardAmount, a.awardDate
, row_number() over partition by top.name, s.sportName, p.position, bottom.firstName
order by top.name, s.sportName, p.position, bottom.firstName)  
from familyRelation r
inner join family bottom on r.firstNameID = bottom.NameID  
inner join family top on r.familyID = top.NameID
inner join familyPosition p on p.NameID = bottom.NameID
inner join sport s on s.sportID = p.sportID 
left join awards a on a.awardID = p.familyPositionID 
ithoughtso
  • 103
  • 8
  • 1
    https://stackoverflow.com/questions/176964/select-top-10-records-for-each-category – S3S Jun 07 '19 at 18:29
  • That code is from 11 years ago but I will try it. Thank you. Administrator please don;t close this question just in case the link does not provide the solution I need and other people have suggestions. Thank you – ithoughtso Jun 07 '19 at 19:03
  • Your question is "Get top N by group" from how i read it. There are a lot of examples using `row_number()` and derived tables via `max()` as you have done. – S3S Jun 07 '19 at 19:07
  • I tried the code in the link you sent me. Most of the code did not work because there were no joins and my code needs several joins. There was one block of code where there was a left join but that code is what I have been trying but there is an issue with the group by which I listed in another response below. But thank you for your suggestion – ithoughtso Jun 07 '19 at 23:54

1 Answers1

1

Sounds like your looking for a simple group by -

SELECT
      top.name as [familyName], 
      s.sportName,
      p.position, 
      bottom.firstName as [firstName], 
      a.awardAmount, 
      MAX(a.awardDate) as maxDatePerGroup
from familyRelation r
    inner join family bottom on r.firstNameID = bottom.NameID  
    inner join family top on r.familyID = top.NameID
    inner join familyPosition p on p.NameID = bottom.NameID
    inner join sport s on s.sportID = p.sportID 
    left join awards a on a.awardID = p.familyPositionID 
GROUP BY top.name, s.sportName, p.position,bottom.firstName, a.awardAmount
FizzBuzz
  • 683
  • 3
  • 8
  • thank you for your suggestion but it seems to be giving all rows the same max date. – ithoughtso Jun 07 '19 at 22:22
  • i see the problem: if I use your code but remove the a.awardAmount from the select statement, I see the max date for each combination and it is correct, but once I add a.awardAmount, all the dates appear insted of only showing the max date because the a.awardAmount is included in the group by statement. I need the award amount but not in the group by part but sql won't let me run a max() aggregate without it being there. this is the problem. – ithoughtso Jun 07 '19 at 23:41
  • I found a way to resolve this but the solution is not elegant. I store the information with all the data in one temporary table and the data with the max values in another then do a left join. I saw this: https://stackoverflow.com/questions/11991079/select-a-column-in-sql-not-in-group-by which said sql server does not allow it, you need a work around. – ithoughtso Jun 08 '19 at 03:00
  • thank you both for your contributions. Now I have a solution, inelegant as it may be but your contributions led me to the path of finding some solution. – ithoughtso Jun 08 '19 at 04:11