4

I have something like this:

 Name.....Value
 A...........10 
 B............9
 C............8

Meaning, the values are in descending order. I need to create a new table that will contain the values that make up 60% of the total values. So, this could be a pseudocode:

set Total = sum(value) 
set counter = 0 
foreach line from table OriginalTable do: 
counter = counter + value 
if counter > 0.6*Total then break
else insert line into FinalTable
end

As you can see, I'm parsing the sql lines here. I know this can be done using handlers, but I can't get it to work. So, any solution using handlers or something else creative will be great. It should also be in a reasonable time complexity - the solution how to select values that sum up to 60% of the total works, but it's slow as hell :(
Thanks!!!!

Community
  • 1
  • 1
n00b programmer
  • 2,671
  • 7
  • 41
  • 56
  • 2
    As per the question referenced, you may get better responses if you can tell people which database engine you're using and which language you're using to do the call – BugFinder May 31 '11 at 11:32

2 Answers2

1

You'll likely need to use the lead() or lag() window function, possibly with a recursive query to merge the rows together. See this related question:

merge DATE-rows if episodes are in direct succession or overlapping

And in case you're using MySQL, you can work around the lack of window functions by using something like this:

Mysql query problem

Community
  • 1
  • 1
Denis de Bernardy
  • 75,850
  • 13
  • 131
  • 154
0

I don't know which analytical functions SQL Server (which I assume you are using) supports; for Oracle, you could use something like:

select v.*,
  cumulative/overall percent_current,
  previous_cumulative/overall percent_previous from (
  select 
    id, 
    name, 
    value, 
    cumulative,
    lag(cumulative) over (order by id) as previous_cumulative,
    overall
  from (
    select 
      id, 
      name, 
      value, 
      sum(value) over (order by id) as cumulative,
      (select sum(value) from mytab) overall
    from mytab
    order by id) 
) v

Explanation:
- sum(value) over ... computes a running total for the sum
- lag() gives you the value for the previous row
- you can then combine these to find the first row where percent_current > 0.6 and percent_previous < 0.6

Frank Schmitt
  • 30,195
  • 12
  • 73
  • 107
  • do your window functions work as intended without specifying a `ROWS BETWEEN .. AND ..` clause? – Lukas Eder May 31 '11 at 11:48
  • @LukasEder AFAIK, yes. If I understand you correctly, you would suggest adding "rows between unbounded preceding and current row" for the sum() windowing function? – Frank Schmitt May 31 '11 at 12:01
  • @Frank, yes exactly. Is that the default behaviour for `xx(..) over (order by ..)` ? – Lukas Eder May 31 '11 at 12:08
  • I'm having trouble with "over (order by ID)". What is the ID in this case? The table is sorted according to "value", and the only other field is "Name". Also, I'm not sure MySql indentifies the "over" command :( – n00b programmer May 31 '11 at 12:21
  • No, MySQL does not support window functions / `OVER(...)` clauses – Lukas Eder May 31 '11 at 12:23
  • @Lukas According to the Oracle documentation http://download.oracle.com/docs/cd/B28359_01/server.111/b28286/functions078.htm, the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which should be fine here. – Frank Schmitt May 31 '11 at 13:20
  • @n00b Sorry, I've added the ID clause to my test table as a primary key. For you, it would make sense to use name instead (if you want to sort by name, that is). – Frank Schmitt May 31 '11 at 13:21