0

I have the following working query but I was wondering if I could simplify it using PIVOT:

SELECT P.Date, 
       P.OI AS [Total Puts], 
       C.OI AS [Total Calls], 
       P.[Index Level]
FROM (SELECT Date, 
             SUM(OpenInterest) AS OI, 
             AVG(TheSpot) AS [Index Level]
      FROM Fullstats            
      WHERE theType = 'P'            
        and Code = 'ALSI'            
      GROUP BY Date) AS P
JOIN (SELECT Date, 
             SUM(OpenInterest) AS OI
      FROM Fullstats            
      WHERE theType = 'C'            
        and Code = 'ALSI'            
      GROUP BY Date) AS C            
ON P.Date = C.Date
ORDER BY Date 

I can get the first 3 columns easily:

SELECT [Date], 
       P AS [Total Puts],
       C AS [Total Calls]
FROM (SELECT Date, 
             OpenInterest, 
             theType 
      FROM FullStats 
      WHERE Code = 'ALSI') AS SourceTable
PIVOT
(
    SUM(OpenInterest)
    FOR theType IN (P, C)
) AS PivotTable 
ORDER BY Date;

but I wasn't sure how to get that fourth (AVG(TheSpot) AS [Index Level]) column. So I played around a bit and found that the following works:

SELECT [Date],
       P AS [Total Puts],
       C AS [Total Calls],
       theSpot       
FROM (SELECT Date, 
             OpenInterest, 
             theType, 
             theSpot 
      FROM FullStats 
      WHERE Code = 'ALSI' AND theType <> 'F') AS SourceTable
PIVOT
(
    SUM(OpenInterest)
    FOR theType IN (P, C)
) AS PivotTable
ORDER BY Date;

But I don't understand why this is giving me the average of the theSpot column? Is that a feature of PIVOT that it returns the average for any field not aggregated? Could I have gotten the SUM or MIN instead? Lastly how would I ascertain which of these two queries is more efficient?

Dan
  • 45,079
  • 17
  • 88
  • 157
  • Can you post some sample data or even create a sql fiddle with your table and data? Columns that are not aggregated get grouped by, there shouldn't be an average happening. – Taryn Aug 21 '14 at 11:22
  • @bluefeet I will if I get a chance. – Dan Aug 21 '14 at 13:17
  • Found this regarding the efficiency, it sound like the `join` version is faster (obviously very case dependent): http://stackoverflow.com/questions/7448453/sql-server-pivot-vs-multiple-join – Dan Aug 21 '14 at 13:18

1 Answers1

1

Not aggregating does not give you the average of "TheSpot" It might be that the value you take comes from a view that aggregates and somehow you lucked out on this. Check out the query below for a way to get the average.

create table pivot_demo (
    Id int identity(1,1)
    , Value1 decimal
    , Value2 decimal
    , SetKey int)

insert pivot_demo (Value1, Value2, SetKey) values 
(10,10,1),
(20,20,1),
(30,30,1),
(40,40,1),
(50,50,1),
(60,60,1),
(10,10,2),
(20,20,2),
(30,30,2),
(40,40,3),
(50,50,3),
(60,60,3)

select value2, [1], [2], [3] from (
    select value1, avg(value2) value2, setkey 
    from demo..pivot_demo
    group by Value1, SetKey) sub
pivot (sum(value1) for setkey in ([1], [2], [3])) piv
Tristan
  • 1,004
  • 7
  • 14
  • Hmmm... there is definitely no view. However it could well be (should be in fact) that the value for `theSpot` is unique for each `date` even though there are many rows for each distinct date. Maybe it's just returning any random value, which since they are all the same will be the average. I'm surprised it allows me to have `theSpot` in the `SELECT` clause in that case though for the same reason it doesn't allow you to if you `GROUP BY` not on that field and then try select it without aggregating. – Dan Aug 21 '14 at 13:11
  • BTW your suggested query form does indeed work, thanks! I'd still like to know exactly why it even allows my form though... – Dan Aug 21 '14 at 13:15
  • @Dan Mark my answer as accepted and I'll tell you ;) – Tristan Aug 21 '14 at 13:16
  • @Dan, I already gave the right way to do it with a code example right? Anyway what is happening in you example is the "hidden" group by from the pivot. Because no aggregate is specified you just get each value – Tristan Aug 21 '14 at 13:22
  • OK so `PIVOT` inherently groups by all the fields in the `FROM` except for the two inside the `PIVOT`. Since my `Date` and `theSpot` are in this case always matched, I get the same value as if I took an average. So actually in my case I should be sticking with my query and not explicitly calling for an average (well actually I should be sticking with the faster `JOIN` based example. Thanks! – Dan Aug 21 '14 at 13:29
  • Hey Dan, happy too see you got it. I want to warn you for using such inferred logic because it will give you headaches in the future. – Tristan Aug 21 '14 at 13:32