I'm having some issues figuring out how GROUP a table in various ways SUMing the numbers of units sold, but only SUMing the most recent on hand units for each item within each GROUP.
Here's an example dataset: http://www.sqlfiddle.com/#!2/3ff18/1
I need to be able to perform GROUP BYs in such a way that the On Hand column is SUMed for only most recent item(s) within each group.
I've had some progress with a "self join" using MAX(date), but I'm not getting the desired results when using various GROUP BYs.
Here's some example outputs I'd like to see given the sqlfiddle.com dataset:
Category Sold On Hand
Electronics 500 0
Books 500 0
Other 0 100
Quarter Category Sold On Hand
Q1 Electronics 400 100
Q1 Books 400 100
Q1 Other 0 100
Q2 Electronics 100 0
Q2 Books 100 0
Q2 Other 0 100
Month Sold On Hand
January 300 800
February 100 700
March 200 500
April 200 300
May 0 300 <- This May entry isn't strickly necessary, but it would be nice
June 100 200
July 100 100 <- This 100 units On Hand is from Item 987 that hasn't been sold
One area where the MAX(date) approach is tripping me up is with GROUP BY month
. If you look in the table just above you'll note that I'd like to see 100 units On Hand in July... which is to say all units have been sold except for the item 987 which was added in January, but has not sold.
A couple of notes:
- This is using MySQL but it'd be willing to try PostgreSQL if it has windowing functions that assist with this.
- The performance of the solution is fairly vital given there are currently 1.5 million records. And will likely see millions more added.