How does PIVOT in sql knows the column to distinct?

Question

Take for example this sample: http://blogs.msdn.com/b/spike/archive/2009/03/03/pivot-tables-in-sql-server-a-simple-sample.aspx

How did the query know to find the average for each vendor? The query doesn't contain any reference to the VendorId column, so I can't see how could it know to collect the IncomeAmounts per vendor.
How does it choose the group of rows on which it executes the PIVOT function?

score 5 · Accepted Answer · answered Jun 04 '14 at 06:55

A pivot request basically involves three logical processing phases, each with associated elements as :

Grouping phase
Spreading phase
and an aggregation phase with an associated aggregation element and aggregate function.

What's important here is the fact that with the PIVOT operator you do not explicitly specify the grouping elements,removing the need for GROUP BY in the query. The PIVOT operator figures out the grouping elements implicitly as all attributes from the source table (or table expression) that were not specified as either the spreading element or the aggregation element. So you must ensure that the source table for the PIVOT operator has no attributes besides the grouping, spreading, and aggregation elements,so that after specifying the spreading and aggregation elements, the only attributes left are those you intend as grouping elements.As part of best practice you achieve this by not applying the PIVOT operator to the original table directly but instead to a table expression that includes only the attributes representing the pivoting elements and no others.

So mapping these phases with requirement in given case:

spreading has to be done based on 'IncomeDay' column values with final column names as: [MON],[TUE],[WED],[THU],[FRI],[SAT],[SUN].
'IncomeAmount' values are to be aggregated to produce intersecting values for grouping and spreading elements
As the table has only 3 columns 'VendorId' qualifies for the column on which grouping will be done.

Putting these values in standard Pivot statement:

SELECT ...
FROM <source_table_or_table_expression>
PIVOT(<agg_func>(<aggregation_element>)
FOR <spreading_element>
IN (<list_of_target_columns>)) AS <result_table_alias>

The query becomes:

select * from DailyIncome
pivot (avg (IncomeAmount) 
for IncomeDay 
in ([MON],[TUE],[WED],[THU],[FRI],[SAT],[SUN])) as AvgIncomePerDay

Hope this helps!!

Wow, thank you very much, really cleared that up for me – user3688929 Jun 04 '14 at 07:04 — user3688929, Jun 04 '14 at 07:04

score 0 · Answer 2 · edited May 23 '17 at 12:20

Pretty detailed answer on pivots here:

MySQL pivot table

Some more detail:

Microsoft Books Online - PIVOT / UNPIVOT

For your specific sample, it is a nice source. However your question is slightly wrong, it isn't the average it is the max - see pivot column pivot (max(incomeamount)... for IncomeDay....)

select * from DailyIncome     -- Colums to pivot
 pivot ( max (IncomeAmount)   -- Pivot on this column
 for IncomeDay in ([MON],[TUE],[WED],[THU],[FRI],[SAT],[SUN]))   -- Make colum where IncomeDay is in one of these.
 as MaxIncomePerDay           -- Pivot table alias
 where VendorId in ('SPIKE')  -- Select only for this vendor

The vendorID is determined by the where clause, see the authors comments.

I would simply recommend following the tutorial yourself, create the tables and play with the queries so you can see what is going on.

Deepsikha nailed the real answer:

with the PIVOT operator you do not explicitly specify the grouping elements,removing the need for GROUP BY in the query

, as the column only has three columns it the vendorID qualifies for grouping –

I'm not sure you've understood my question.. Even if it's Max, and without the where clause, the questions stays as it is. the query selects for each vendor, the max incomeamount for each day. I was asking how does the query choose to collect this data.. Let's say, in the first query in that sample, with the AVG function, the query identifies all of SPIKE's, FREDS' and JOHNS' lines, and for each one of them calculating the avg income per day. I ask how did it choose to user the VendorID as the column to group by.. — user3688929, Jun 04 '14 at 07:01
Deepsikha nailed it in the statement: do not explicitly specify the grouping elements,removing the need for GROUP BY in the query, as the column only has three columns it the vendorID qualifies for grouping — n34_panda, Jun 04 '14 at 07:13

How does PIVOT in sql knows the column to distinct?

2 Answers2