1

I have the below query that was created to show the summation of the "Last" values for a year, usually this is a december value, but the year could potentially end in any month so i want to add together the last values for each goalmontecarloheaderid. I have it working 99%, but there are some random duplicates in the [year] value.

WITH endBalances AS (
        SELECT ROW_NUMBER() OVER (PARTITION By GoalMonteCarloHeaderID, Year(Convert(date,MonthDate)) Order By Max(Month(Convert(date,MonthDate))) desc) n, Max(Month(Convert(date,MonthDate))) maxMonth, GrowthBucket, WithdrawalBucket, NoTaxesBucket,
        Year(MonthDate) [year]
        From GoalMonteCarloMedianResults mcmr
        full join GoalMonteCarloHeader mch on mch.ID = mcmr.GoalMonteCarloHeaderID
        full join GoalChartData gcd on gcd.ID = mch.GoalChartDataID and gcd.TypeID = 2
        inner join Goal g on g.iGoalID = gcd.GoalID
        where g.iTypeID in (1) and g.iHHID = 850802
        group by GoalMonteCarloHeaderID, MonthDate, GrowthBucket, WithdrawalBucket, NoTaxesBucket
)
SELECT [year], Sum(GrowthBucket) GrowthBucket, Sum(WithdrawalBucket) WithdrawalBucket,Sum(NoTaxesBucket) NoTaxesBucket, maxMonth
From endBalances 
where [year] is not null and n=1
Group By [year], maxMonth
order by [year] asc

Showing two random duplicates in the database result; enter image description here

you can see in the image there are two examples where the year is duplicated and displayed for more than just the 'last' month in the year. Am I doing something wrong with the group by or the PARTITION BY() in my query? I am not the most familiar with this functionality of T-SQL.

Nick G
  • 1,209
  • 8
  • 34
  • 58

1 Answers1

3

T-SQL has a lovely function for this which has no direct equivalent in MySQL.

ROW_NUMBER() OVER (PARTITION BY [year] ORDER BY MonthDate DESC) AS rn

Then anything with rn=1 will be the last entry in a year.

The answers to this question have a few ideas:
ROW_NUMBER() in MySQL

Community
  • 1
  • 1
  • the only issue with that is that i am doing a summation in my dataset, sometimes there will be 2 or more december values for different 'goals' and in that case it will only take the last goal value not the summation of all three, but thank you for the idea i am implementing that exact logic in a different area. – Nick G Oct 22 '14 at 16:28
  • 1
    Then partition by both [year] and GoalID, like you now have in your original post. That should work great. – Einstein X. Mystery Oct 22 '14 at 17:03
  • I have re-worked my query and updated the post above, I am now receiving random duplicates for the years, i have also tried partitioning by the iGoalID from dbo.Goal table and receive the same results, do you know why I might be getting these dupes? Thank you for all your help in this! – Nick G Oct 27 '14 at 13:12
  • i figured it out, it was because of my grouping by the year and maxMonth columns, removed the maxMonth grouping and everything looks good! – Nick G Oct 27 '14 at 15:15