11

I have a table 'propertyvalues' as follows:

ID  FileID  Property  Value
1 x Name 1.pdf
2 x Size 12567
3 x Type application/pdf
4 y Name 2.pdf
5 y Size 23576
6 y Type application/pdf
......
and so on

How to write a SQL query on the table above to fetch a result like below

 
FileID  Name     Size      Type
x 1.pdf 12567 application/pdf
y 2.pdf 23576 application/pdf
Taryn
  • 242,637
  • 56
  • 362
  • 405
Ankit Khatri
  • 253
  • 1
  • 4
  • 11
  • possible duplicate of [Get ROWS as COLUMNS (SQL Server dynamic PIVOT query)](http://stackoverflow.com/questions/12074939/get-rows-as-columns-sql-server-dynamic-pivot-query) – RichardTheKiwi May 03 '13 at 11:16

4 Answers4

11

You did not specify RDBMS, if you know the number of columns to transform then you can hard-code the values:

select FileId,
  max(case when property = 'Name' then value end) Name,
  max(case when property = 'Size' then value end) Size,
  max(case when property = 'Type' then value end) Type
from yourtable
group by FileId

This is basically a PIVOT function, some RDBMS will have a PIVOT, if you do then you can use the following, PIVOT is available in SQL Server, Oracle:

select *
from 
(
  select FileId, Property, Value
  from yourTable
) x
pivot
(
  max(value)
  for property in ([Name], [Size], [Type])
) p

If you have an unknown number of columns to transform, then you can use a dynamic PIVOT. This gets the list of columns to transform at run-time:

DECLARE @cols AS NVARCHAR(MAX),
    @query  AS NVARCHAR(MAX)

select @cols = STUFF((SELECT distinct ',' + QUOTENAME(property) 
                    from yourtable
            FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'')

set @query = 'SELECT ' + @cols + ' from 
             (
                select FileId, Property, Value
                from yourtable
            ) x
            pivot 
            (
                max(value)
                for Property in (' + @cols + ')
            ) p '

execute(@query)
Taryn
  • 242,637
  • 56
  • 362
  • 405
  • Too complicated solution for too simple problem – Anthony Sep 27 '12 at 11:04
  • 2
    @Antonio I am not sure why you say it is too complicated. This is much easier than performing multiple joins. If they had 10 fields to transform, then you would have to perform 10 joins. – Taryn Sep 27 '12 at 11:05
  • Obviously there are just 3 fields. Neither more, nor less. – Anthony Sep 27 '12 at 11:07
  • 3
    but then you still have to perform multiple joins, this method has no joins, they are not needed for this type of query. Not only that, the OP possibly only showed a partial list of fields. Your query will work, but this it not wrong an IMO does not deserve a downvote. – Taryn Sep 27 '12 at 11:08
  • 2
    +1 - obviously not over-engineering, also three answers in one. – Nikola Markovinović Sep 27 '12 at 11:10
  • 1
    @Antonio You are terribly wrong. The first solution is even simpler than yours, the others are more generalized but still not complicated. – dezso Sep 27 '12 at 11:11
  • The solution with "CASE" will ALWAYS work slower than "JOIN" solution because JOIN can use RDBMS indexes. in tables with >1M records the diference is really huge. "PIVOT" approach is good but it is not supported by all the databases and is also slow. I recommend use JOINs because of the speed. – Anthony Sep 27 '12 at 11:22
  • I like Joins but they are not always faster. It depends. – ypercubeᵀᴹ Sep 27 '12 at 11:29
  • 3
    @Antonio "The solution with "CASE" will ALWAYS work slower than "JOIN" solution because JOIN can use RDBMS indexes." this is utter nonsense. Using 'CASE' like this *may* be heavier on the CPU, and *may* be lighter on i/o, but which is 'faster' will depend on your environment. If I had to guess I'd say that using CASE will *usually* be faster. –  Sep 27 '12 at 12:24
  • @JackDouglas case always won during my (rather limited) testing; the only time `join` came close to `max(case...)` was with index on (FileID, Property) and even then it didn't win. 3M records, different indexing. – Nikola Markovinović Sep 27 '12 at 12:37
  • Devil's advocate on a similar problem: [20 LEFT JOINS lightly faster than SUM(CASE) with GROUP BY](http://stackoverflow.com/questions/6605604/mysql-pivot-query-results-with-group-by/6606126#6606126) – ypercubeᵀᴹ Sep 27 '12 at 12:48
  • my preference would be `PIVOT`, I personally think it is easier than coding for each `CASE` statement. :) – Taryn Sep 27 '12 at 12:49
  • @ypercube MySQL is MySQL - my comments refer to SQL Server. I'd never hazard a guess about anything on MySQL as it seems to have so many quirks. –  Sep 27 '12 at 12:54
  • @Jack: An SQL-Server example (tested by the OP, not me so I cannot assert on indexes or distribution): [Totalling up ballot results](http://stackoverflow.com/questions/6608353/totalling-up-ballot-results/6609220#6609220) (and not exactly the same case, just a similarity) – ypercubeᵀᴹ Sep 27 '12 at 14:20
  • 1
    @JackDouglas - If optimal indexes are present then I would favour the join version for performance though not maintainability [as per my reasoning here](http://stackoverflow.com/questions/7448453/sql-server-pivot-vs-multiple-join/7449213#7449213) – Martin Smith Sep 27 '12 at 14:49
  • @Martin - interesting. I presume the same would not apply to CASE however as that will just be a full scan and an aggregate from a plan point of view. –  Sep 27 '12 at 16:18
9

A version with joins that works regardless of missing rows:

SELECT  
    pd.FileID 
  , p1.Value  AS Name
  , p2.Value  AS Size
  , p3.Value  AS Type
FROM
        ( SELECT DISTINCT FileID
          FROM propertyvalues 
        ) AS pd
    LEFT JOIN
        propertyvalues AS p1
            ON  p1.FileID = pd.FileID 
            AND p1.Property = 'Name'
    LEFT JOIN
        propertyvalues AS p2
            ON  p2.FileID = pd.FileID 
            AND p2.Property = 'Size'
    LEFT JOIN
        propertyvalues AS p3
            ON  p3.FileID = pd.FileID
            AND p3.Property = 'Type' ;

If you have a table where FileID is the primary key, you may replace the DISTINCT subquery with that table.


Regarding efficiency, it depends on a lot of factors. Examples:

  • Do all FileIDs have rows with Name, Size and Type and no other properties (and your table has a clustered index on (FileID, Property))? Then the MAX(CASE...) version would perform quite well as the whole table would have to be scanned anyway.

  • Are there (many) more than 3 properties and a lot of FileIDs have no Name, Size and Type, then the JOIN version would work well with an index on (Property, FileID) INCLUDE (Value) as only this index data would be used for the joins.

  • Not sure how efficient is the PIVOT version.

What I suggest though is to test the various versions with your data and table sizes, in your envirorment (version, disk, memory, settings, ...) before you select which one to use.

ypercubeᵀᴹ
  • 113,259
  • 19
  • 174
  • 235
0
Create function [dbo].[AF_TableColumns](@table_name nvarchar(55))
returns nvarchar(4000) as
begin
declare @str nvarchar(4000)
    select @str = cast(rtrim(ltrim(column_name)) as nvarchar(500)) + coalesce('         ' + @str , '            ') 
    from information_schema.columns
    where table_name = @table_name
    group by table_name, column_name, ordinal_position 
    order by ordinal_position DESC
return @str
end

--select dbo.AF_TableColumns('YourTable') Select * from YourTable
Chris Stillwell
  • 10,266
  • 10
  • 67
  • 77
-1
select 
  p1.FileID as FileID,
  p1.Value as Name,
  p2.Value as Size,
  p3.Value as Type
from 
  propertyvalues as p1 join 
  propertyvalues as p2 on p1.FileID = p2.FileID join 
  propertyvalues as p3 on p1.FileID = p3.FileID
where
  p1.Property='Name' AND p2.Property='Size' AND p3.Property='Type'
Anthony
  • 12,407
  • 12
  • 64
  • 88