3

Say I have a table of data that looks like:

ItemNo    |    ItemCount   |    Proportion
------------------------------------------
1              3                0.15 
2              2                0.10
3              3                0.15
4              0                0.00
5              2                0.10
6              1                0.05
7              5                0.25
8              4                0.20

In other words, there are a total of 20 items, and the cumulative proportion of each ItemNo sums to 100%. The ordering of the table rows is important here.

Is it possible to perform a SQL query without loops or cursors to return the first ItemNo which exceeds a cumulative proportion?

In other words if the 'proportion' I wanted to check was 35%, the first row which exceeds that is ItemNo 3, because 0.15 + 0.10 + 0.15 = 0.40

Similarly, if I wanted to find the first row which exceeded 75%, that would be ItemNo 7, as the sum of all Proportion up until that row is less than 0.75.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Widor
  • 13,003
  • 7
  • 42
  • 64

2 Answers2

5
select top 1
  t1.ItemNo
from
  MyTable t1
where
  ((select sum(t2.Proportion) from MyTable t2 where t2.ItemNo <= t1.ItemNo) >= 0.35)
order by
  t1.ItemNo
njr101
  • 9,499
  • 7
  • 39
  • 56
3

A classic for a window function:

SELECT * 
FROM   (
    SELECT ItemNo
          ,ItemCount
          ,sum(Proportion) OVER (ORDER BY ItemNo) AS running_sum
    FROM   tbl) y
WHERE  running_sum > 0.35
LIMIT  1;

Works in PostgreSQL, among others.

Or, in tSQL notation (which you seem to use):
SELECT TOP 1 *
FROM (
SELECT ItemNo
,ItemCount
,sum(Proportion) OVER (ORDER BY ItemNo) AS running_sum
FROM tbl) y
WHERE running_sum > 0.35;

Doesn't work in tSQL as commented below.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Is there a subtle difference in the syntax for use in Sql Server? When using your second example, I get `Incorrect syntax near 'order'.` – Widor Oct 27 '11 at 12:50
  • I may be mistaken but I don't think the OVER clause can be used to generate running totals in SQL Server, only rankings or a sum over a partition. See http://stackoverflow.com/questions/860966/calculate-a-running-total-in-sqlserver/861073#861073 This is why I chose a subquery instead. – njr101 Oct 27 '11 at 13:26
  • @njreed.myopenid.com: Ah, looks like you are right, doesn't work with tSQL. Good link. I crossed out the tSQL version. Please remember to disclose your RDBMS in your nest question. :) – Erwin Brandstetter Oct 27 '11 at 13:59
  • @Erwin Thanks for clarification. Not specifying an RDBMS was a deliberate choice in order to get a generic SQL solution without the need to rely on vendor-specific functionality. – Widor Oct 27 '11 at 14:36