1

My dataset looks like this:

COLA  | COLB
Name1 | 218
Name2 | 157
Name3 | 134
Name4 | 121

I need this output:

COLA  | COLB| COLC
Name1 | 218 | 0.34
Name2 | 157 | 0.60
Name3 | 134 | 0.71
Name4 | 121 | 1

My SQL looks like this so far:

SELECT COLA, COLB, COLB/SUM(COLB) FROM #MyTempTable

Two problems with this SQL. One, COLC is 0 everytime and I don't understand that. Two, even if it did result in the % it's not a cumulative %.

I've seen some similar threads on StackOverflow, but I wasn't able to make the answers from those threads work in my exact scenario.

Thanks in advance for any suggestions!

Tim Lehner
  • 14,813
  • 4
  • 59
  • 76
s15199d
  • 7,261
  • 11
  • 43
  • 70

3 Answers3

4

I think you're looking for something like this, though your example calculations may be off a little:

SELECT
    COLA,
    COLB,
    ROUND(
        -- Divide the running total...
        (SELECT CAST(SUM(COLB) AS FLOAT) FROM #MyTempTable WHERE COLA <= a.COLA) /
        -- ...by the full total
        (SELECT CAST(SUM(COLB) AS FLOAT) FROM #MyTempTable),
        2
    ) AS COLC
FROM #MyTempTable AS a
ORDER BY COLA

EDIT: I've added rounding.

This gives us the following output:

COLA    COLB    COLC
Name1   218     0.35
Name2   157     0.6
Name3   134     0.81
Name4   121     1

The reason that your results are 0 (or 1) is because you are dividing ints by ints, thus giving you an int (see Datatype precedence).

UPDATE:

I should add that this uses a "triangular join" to get the running total (WHERE COLA <= a.COLA). Depending upon your SQL Server version, you may compare this to other options if performance becomes a concern.

Community
  • 1
  • 1
Tim Lehner
  • 14,813
  • 4
  • 59
  • 76
  • Tim thanks a bunch. I need to be able to ORDER BY COLB DESC. I've tried this... SELECT TOP(10) COLA, COLB, ROUND( -- Divide the running total... (SELECT CAST(SUM(COLB) AS FLOAT) FROM #MyTempTable WHERE COLB <= a.COLB) / -- ...by the full total (SELECT CAST(SUM(COLB) AS FLOAT) FROM #MyTempTable), 2 ) AS COLC FROM #MyTempTable AS a ORDER BY COLB DESC – s15199d Aug 28 '12 at 14:49
  • Nevermind...I got it... SELECT TOP(10) COLA, COLB, ROUND( -- Divide the running total... (SELECT CAST(SUM(COLB) AS FLOAT) FROM #MyTempTable WHERE COLB >= a.COLB) / -- ...by the full total (SELECT CAST(SUM(COLB) AS FLOAT) FROM #MyTempTable), 2 ) AS COLC FROM #MyTempTable AS a ORDER BY COLB DESC – s15199d Aug 28 '12 at 14:51
1

If you don't use OLAP functions, then you have to do a weird self-join on the table:

SELECT a.ColA, a.ColB, SUM(b.ColB) AS ColX
  FROM #MyTempTable AS a
  JOIN #MyTempTable AS b
    ON a.ColA <= b.ColA
 GROUP BY a.ColA, a.ColB

This gives you the raw cumulative SUM. You can definitely use that as a sub-query to get the answer, noting that to get the percentage, you need to divide the cumulative sum by the gross sum:

SELECT ColA, ColB, ColX / (SELECT SUM(ColB) FROM MyTempTable) AS ColC
  FROM (SELECT a.ColA, a.ColB, SUM(b.ColB) AS ColX
          FROM #MyTempTable AS a
          JOIN #MyTempTable AS b
            ON a.ColA <= b.ColA
         GROUP BY a.ColA, a.ColB
       ) AS X
 ORDER BY ColA

You may be able to write just:

SELECT a.ColA, a.ColB, SUM(b.ColB) / (SELECT SUM(ColB) FROM MyTempTable) AS ColC
  FROM #MyTempTable AS a
  JOIN #MyTempTable AS b
    ON a.ColA <= b.ColA
 GROUP BY a.ColA, a.ColB
 ORDER BY a.ColA

Multiply the ColC expression by 100 to get a percentage instead of a fraction.

Tested against IBM Informix 11.70.FC2 on Mac OS X 10.7.3, both the queries with division work, producing the same answer (and I note that I get 0.81 instead of 0.71 as required in the question):

Name1    218    0.34603174603174603174603174603175
Name2    157    0.5952380952380952380952380952381
Name3    134    0.80793650793650793650793650793651
Name4    121    1.0 

You might have to use a CAST to ensure the division is done using floating point instead of integer arithmetic — as you can see, that wasn't necessary with Informix (the SUM is a floating point decimal anyway, just in case the table has billions of rows in it, not just 4 of them). I could improve the presentation using ROUND(xxxx, 2) to get just 2 decimal places; a cast to DECIMAL(6,2) would achieve the same result, but the client should be responsible for the presentation, not the DBMS.

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
0

In MS SQL Server, this does it (ups, wrong subaggregation -> wrong result):

create table #MyTempTable (cola varchar(10), colb int)

insert into #MyTempTable(cola,colb)
select 'Name1',218
union all
select 'Name2',157
union all
select 'Name3',134
union all
select 'Name4',121

SELECT otab.COLA, otab.COLB,
       cast(otab.COLB as float)/(select SUM(cast(itab.colb as float))
                                 from #MyTempTable itab where itab.cola >= otab.cola) 
  from #MyTempTable otab

drop table #MyTempTable
Christian
  • 1
  • 1
  • I tried casting CAST(COLB/(SELECT SUM(COLB) FROM #MyTempTable) AS DECIMAL(3,2)) AS Pct_Total. I see that I needed to cast just the dividend. Thanks! – s15199d Aug 28 '12 at 14:25
  • Welcome to Stack Overflow. Using the MarkDown system to present an answer takes some getting used to. To get the code layout sane, indent by four spaces. When you're editing, the icon `{}` just above the edit box will indent the highlighted material four spaces if there is a character in the first (three) column(s) — it also unindents if there isn't any character in the first column. There are help buttons available. (It doesn't take long to get the hang of it all.) – Jonathan Leffler Aug 28 '12 at 14:34