1

I'm trying to build a sql query that will return a list of IDs that have a total sum, which is less than OR greater than a given value using the least number of items.

Here's an example of the table I'll be querying.

ID    Value 
-----------  
226   2.3   
331   3.1   
25    1.5   
28    1.5   
29    1.2   
52    5.2   
38    3.5   

Here it is sorted by Value asc.

ID   Value
----------  
29   1.2  
25   1.5  
28   1.5  
226  2.3  
331  3.1  
38   3.5  
52   5.2  

Example A :

If my value is 6, I would expect the query to return IDs 29, 25, 28 and 226. 1.2 + 1.5 + 1.5 + 2.3 = 6.5

Example B :

If my value is 19, I would expect the query to return all of the IDs (29, 25, 28, 226, 331, 38, 52).
1.2 + 1.5 + 1.5 + 2.3 + 3.1 + 3.5 + 5.2 = 18.3

I've tried the suggested answer found here: SQL select elements where sum of field is less than N

However, that's not giving me exactly what I need since it only returns IDs that add up to LESS than the set value. Also it is assuming that the ID is ascending which isn't the case when I sort by asc value.

Is this even possible within a sql statement? or would I have to do a procedure/function to accomplish this task?

Community
  • 1
  • 1
whoadiz
  • 83
  • 1
  • 5
  • Can you write out the process by which those Examples return those IDs? For Example A, IDs 52 and 29 total 6.4 which is fewer IDs than your answer and closer to your desired total. – MT0 Jan 09 '14 at 00:46
  • What criteria do you use to decide to take a total which is above your desired total compared to stopping at a total which is smaller than your desired total? – MT0 Jan 09 '14 at 00:48
  • http://www.sqlfiddle.com/#!2/61d60/36 – Saif Hamed Jan 09 '14 at 01:25
  • 1
    @SaifHamed - That is MySQL not Oracle and the OP wanted to be able to get results which potentially total greater that the desired total (although they haven't specified under what circumstances these totals are acceptable). – MT0 Jan 09 '14 at 01:41
  • @MT0 Oops! every day I hate that master more and more. So,`The Oracle does not looks like MySQL`. I should change the college. lol – Saif Hamed Jan 09 '14 at 01:59

1 Answers1

0

Assuming you are talking the values in ascending order and you want to stop when you are closest to the desired total then:

SQL Fiddle

Oracle 11g R2 Schema Setup:

CREATE TABLE data ( ID, Value ) AS
          SELECT 226, 2.3 FROM DUAL
UNION ALL SELECT 331, 3.1 FROM DUAL
UNION ALL SELECT 25, 1.5 FROM DUAL
UNION ALL SELECT 28, 1.5 FROM DUAL
UNION ALL SELECT 29, 1.2 FROM DUAL
UNION ALL SELECT 52, 5.2 FROM DUAL
UNION ALL SELECT 38, 3.5 FROM DUAL;

Query 1:

WITH differences AS (
  SELECT ID,
         Value, 
         ABS(
           SUM( Value ) OVER ( ORDER BY Value ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW )
           - 6 -- REPLACE THIS WITH :desired_total
         ) AS difference
 FROM    data
),
min_difference AS (
  SELECT MIN( Value ) KEEP ( DENSE_RANK FIRST ORDER BY difference ASC ) AS max_value,
         MIN( ID ) KEEP ( DENSE_RANK FIRST ORDER BY difference ASC ) AS max_id
  FROM differences
)
SELECT ID,
       Value
FROM   differences d
       INNER JOIN
       min_difference m
       ON (  d.value < max_value
          OR ( d.value = m.max_value AND d.id <= max_id ) )

Results:

|  ID | VALUE |
|-----|-------|
| 226 |   2.3 |
|  25 |   1.5 |
|  28 |   1.5 |
|  29 |   1.2 |

Edit - Stops when running total is just greater than or equal to desired total

SQL Fiddle

Query 1:

Calculate the running totals for each row (in order of ascending value) then select all the rows where the running total is less than the desired total and also the next row (with the minimum running total of the totals greater than or equal to the desired total).

WITH running_totals AS (
  SELECT ID,
         Value,
         SUM( Value ) OVER ( ORDER BY Value ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS running_total
 FROM    data
)
SELECT ID,
       Value
FROM   running_totals
WHERE  running_total < 6
UNION ALL
SELECT MIN( id    ) KEEP ( DENSE_RANK FIRST ORDER BY running_total ),
       MIN( value ) KEEP ( DENSE_RANK FIRST ORDER BY running_total )
FROM   running_totals
WHERE  running_total >= 6

Results:

|  ID | VALUE |
|-----|-------|
|  29 |   1.2 |
|  28 |   1.5 |
|  25 |   1.5 |
| 226 |   2.3 |

EDIT 2 - An alternative method:

WITH running_totals AS (
  SELECT ID,
         Value,
         SUM( Value ) OVER ( ORDER BY Value, ID ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS running_total,
         ROW_NUMBER() OVER ( ORDER BY Value, ID ) AS idx
  FROM   data
)
SELECT ID,
       Value
FROM   running_totals
WHERE  idx <= (SELECT MAX(idx) + 1 FROM running_totals WHERE running_total < 6 );
MT0
  • 143,790
  • 11
  • 59
  • 117
  • So I tried the SQL fiddle out.. and at first it seems to be working. However, when I switch the value to something higher. For example if I try the value 10, it returns 226, 331, 25, 28, 29 which only totals up to 9.6 and does not meet the requirements since it should return one more to make it more than 10. In our case, it should also have id 38 in the returned list. – whoadiz Jan 10 '14 at 19:46
  • I put a comment on the original question asking what the criteria was for stopping the list - you haven't answered it. If you look at the assumption at the start of this answer then I said it stops when it gets to the closest total to the answer - since 9.6 is closer to 10 than (9.6+3.5=13.1) then that is where this answer will stop. I'll try and give you an edited answer tomorrow but if you would specify what criteria you want for the algorithm then it would mean we don't have to guess. – MT0 Jan 10 '14 at 23:59
  • Whoops, I guess I forgot to hit "add comment". Anyways the criteria is that the query should return the list of IDs that add up to >= given value (n) in the order of asc value and if all of the available values when summed are less than n, show all of the IDs. – whoadiz Jan 13 '14 at 16:43