0

Doing this seemingly trivial task should be simple and obvious using PIVOT - but isn't.

What is the cleanest way to do the conversion, not necessarily using pivot, when limited to ONLY using "pure" SQL (see other factors, below)?

It shouldn't affect the answer, but note that a Python 3.X front end is being used to run SQL queries on a MS SQL Server 2012 backend.

Background :

I need to create CSV files by calling SQL code from Python 3.x. The CSV header line is created from the field (column) names of the SQL table that holds the results of the query.
The following SQL code extracts the field names and returns them as N rows of 1 column - but I need them as 1 row of N columns. (In the example below, the final result must be "A", "B", "C" .)

CREATE TABLE #MyTable   -- ideally the real code uses "DECLARE @MyTable TABLE"
(           
    A  varchar( 32 ),   
    B  varchar( 32 ),   
    C  varchar( 32 )    
) ;
CREATE TABLE #MetaData  -- ideally the real code uses "DECLARE @MetaData TABLE"
(
    NameOfField varchar( 32 ) not NULL
) ;
INSERT INTO #MetaData 
SELECT   name 
FROM     tempdb.sys.columns as X
WHERE   ( object_id = Object_id( 'tempdb..#MyTable' ) )     
ORDER BY column_id ;    -- generally redundant, ensures correct order if results returned in random order
/*
OK so far, the field names are returned as 3 rows of 1 column (entitled "NameOfField"). 
Pivoting them into 1 row of 3 columns should be something simple like:
*/
SELECT NameOfField
FROM #MetaData AS Source
PIVOT
(
    COUNT( [ NameOfField ] )  FOR [ NameOfField ] 
                  IN ( #MetaData )   -- I've tried "IN (SELECT NameOfField FROM #Metadata)"
) AS Destination ;

This error gets raised twice, once for the COUNT and once for the "FOR" clause of the PIVOT statement:

Msg 207, Level 16, State 1, Line 32  
Invalid column name ' NameOfField'.

How do I use the contents of #Metadata to get PIVOT to work? Or is there another simple way?

Other background factors to be aware of:

  • OBDC (Python's pyodbc package) is being used to pass the SQL queries from - and return the results (a cursor) to - a Python 3.x front end. Consequently there is no opportunity to use any type of manual intervention before the result set is returned to Python.
  • The above SQL code is intended to become standard boilerplate for every query passed to SQL. The code must dynamically "adapt" itself to the structure of #MyTable (e.g. if field B is removed while D and E are added after C, the end result must be "A", "C","D", "E"). This means that the field names of a table must never appear inside PIVOT's IN clause (the #MetaData table is intended to supply those values).

  • "Standard" SQL must be used. ALL vendor specific (e.g. Microsoft) extensions/utilities (e.g. "bcp", sqlcmd) must be avoided unless there is a very compelling reason to use them (because "it's there" doesn't count).

  • For known reasons the select clause (into #Metadata) doesn't work for temporary variables (@MyTable). Is there an equivalent Select that works for temporary variables(i.e. @MetaData)?

UPDATE: This problem is subtly different from that in SQL Server dynamic PIVOT query?. In my case I have to preserve the order of the fields, something not required by that question.

WHY I NEED TO DO THIS:

  • The python code is a GUI for non-technical people. They use the GUI to pick & chose which (or even all) SQL reports to run from a HUGE number of reports.
  • Apps like Excel are being used to view these files: to keep our users happy each CSV file must have a header line. The header line will consist of the field names from the SQL table that holds the results of the query.
  • These scripts can change at any time (e.g. add/delete a column) without any advance notice. To meet our users needs the header line must automatically "adjust itself" to make the corresponding changes. The SQL code below accomplishes this.
  • The header line gets merged (using UNION) with the query results to form the result set (a cursor) that gets passed back to Python. Python then processes the returned data and creates the CSV file (including the header line) that gets used by our customers.

In a nutshell: We have many sites, many users, many queries. By having SQL "dynmically create" the header line we remove the headache of having to manually manage/coordinate/rollout the SQL changes to all affected parties.

user1459519
  • 712
  • 9
  • 20
  • 2
    Possible duplicate of [SQL Server dynamic PIVOT query?](https://stackoverflow.com/questions/10404348/sql-server-dynamic-pivot-query) – Tab Alleman Mar 02 '18 at 20:20
  • The only way to do what you want is with dynamic SQL. – Tab Alleman Mar 02 '18 at 20:28
  • Thanks for the hint. I had run across that question (and many others) before posting. I don't use SQL often so I've no experience with that technique, The example uses EXPLICIT column names (date, amount category) which violates my 2nd bullet point. Perhaps replacing that clause with Select * will finesse that problem? – user1459519 Mar 02 '18 at 20:44
  • So you want a dynamic pivot that can do its pivot against any data structure. Good luck with that. That is not how data retrieval works at all. And why in the world are trying to do this with table variables? You have quite a laundry list of rather strange restrictions here with an extremely difficult task. – Sean Lange Mar 02 '18 at 20:51
  • You'll end up needing to build a dynamic SQL String that explicitly lists columns. You build the string by selecting from your MetaData table. – Tab Alleman Mar 02 '18 at 20:52
  • @TabAlleman, my thanks, I learned something new today. I was able to modify the answer in the linked question to solve my problem (my version is posted below). One small issue though: I'm uneasy about having to comment out the "DISTINCT" keyword in order to preserve the field order. Any Idea/hint as to why this happens? – user1459519 Mar 03 '18 at 01:20

2 Answers2

0

I am unsure what "pure" sql is. Are you refering to ANSI-92 SQL?

Anyhow, if you can use SQL variables, try this:

DECLARE @STRING VARCHAR(MAX)

SELECT  @STRING = COALESCE(@STRING + ', ' + '"' + NameOfField + '"', '"' + NameOfField + '"')
FROM    #MetaData  

SELECT   @STRING 

/*
Results:
"A", "B", "C"
*/
mikeinman
  • 31
  • 3
  • I worded it badly. I was trying to say that for compatability reasons I need to only use "industry standard" SQL and must avoid using any vendor specific "enhancements". (e.g.I don't know how portable Microsofts(?) "bcp", "sqlcmd", etc. are to Oracle, MySql, etc.) – user1459519 Mar 03 '18 at 01:30
0

To @Tab Alleman, thanks. I was able to modify the answer to SQL Server dynamic PIVOT query? to do the swap (see below) in a way that meets all my needs.

NOTE: For some reason the "DISTINCT" keyword places the fields in alphabetical order - something I don't want.
Commenting that word out (as done below) preserves the order of the fields. I'm a bit uneasy about doing this but in this case it should be safe because the values being selected into #MetaData are guaranteed to be unique.
The difference can be easily seen by swapping fields A & B in #MyTable and uncommenting the "DISTINCT" keyword

--drop table #MyTable 
--drop table #MetaData 

Create TABLE #MyTable 
(           
    A  varchar( 10 ),
    B  varchar( 10 ),
    C  varchar( 10 )
)
;
CREATE TABLE #MetaData 
(
   NameOfField  varchar( 100 ) not NULL,
   Position int
)
;

INSERT INTO #MetaData 
SELECT   name, column_id
FROM     tempdb.sys.columns as X
WHERE    ( object_id = Object_id( 'tempdb..#MyTable' ) ) 
--ORDER BY column_id    -- normally redundant, guards against results being returned in random order
;

select * from #MetaData 

DECLARE @cols AS NVARCHAR(MAX),
        @query  AS NVARCHAR(MAX);

SET @cols = STUFF( (SELECT 
--                  DISTINCT 
                    ',' + QUOTENAME( c.NameOfField ) 
            FROM #MetaData AS  c

            FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'')

--print( @cols )

set @query = 'SELECT ' + @cols + ' from 
           (
                select NameOfField
                from #MetaData
           ) AS x
            pivot 
            (
                MAX( NameOfField )
                for NameOfField in ( '+ @cols + ' )
            ) AS p 

            '
--print( @query )

execute( @query )

drop table #MyTable 
drop table #MetaData 
user1459519
  • 712
  • 9
  • 20