Doing this seemingly trivial task should be simple and obvious using PIVOT - but isn't.
What is the cleanest way to do the conversion, not necessarily using pivot, when limited to ONLY using "pure" SQL (see other factors, below)?
It shouldn't affect the answer, but note that a Python 3.X front end is being used to run SQL queries on a MS SQL Server 2012 backend.
Background :
I need to create CSV files by calling SQL code from Python 3.x. The CSV header line is created from the field (column) names of the SQL table that holds the results of the query.
The following SQL code extracts the field names and returns them as N rows of 1 column - but I need them as 1 row of N columns. (In the example below, the final result must be "A", "B", "C" .)
CREATE TABLE #MyTable -- ideally the real code uses "DECLARE @MyTable TABLE"
(
A varchar( 32 ),
B varchar( 32 ),
C varchar( 32 )
) ;
CREATE TABLE #MetaData -- ideally the real code uses "DECLARE @MetaData TABLE"
(
NameOfField varchar( 32 ) not NULL
) ;
INSERT INTO #MetaData
SELECT name
FROM tempdb.sys.columns as X
WHERE ( object_id = Object_id( 'tempdb..#MyTable' ) )
ORDER BY column_id ; -- generally redundant, ensures correct order if results returned in random order
/*
OK so far, the field names are returned as 3 rows of 1 column (entitled "NameOfField").
Pivoting them into 1 row of 3 columns should be something simple like:
*/
SELECT NameOfField
FROM #MetaData AS Source
PIVOT
(
COUNT( [ NameOfField ] ) FOR [ NameOfField ]
IN ( #MetaData ) -- I've tried "IN (SELECT NameOfField FROM #Metadata)"
) AS Destination ;
This error gets raised twice, once for the COUNT and once for the "FOR" clause of the PIVOT statement:
Msg 207, Level 16, State 1, Line 32
Invalid column name ' NameOfField'.
How do I use the contents of #Metadata to get PIVOT to work? Or is there another simple way?
Other background factors to be aware of:
- OBDC (Python's pyodbc package) is being used to pass the SQL queries from - and return the results (a cursor) to - a Python 3.x front end. Consequently there is no opportunity to use any type of manual intervention before the result set is returned to Python.
The above SQL code is intended to become standard boilerplate for every query passed to SQL. The code must dynamically "adapt" itself to the structure of #MyTable (e.g. if field B is removed while D and E are added after C, the end result must be "A", "C","D", "E"). This means that the field names of a table must never appear inside PIVOT's IN clause (the #MetaData table is intended to supply those values).
"Standard" SQL must be used. ALL vendor specific (e.g. Microsoft) extensions/utilities (e.g. "bcp", sqlcmd) must be avoided unless there is a very compelling reason to use them (because "it's there" doesn't count).
- For known reasons the select clause (into #Metadata) doesn't work for temporary variables (@MyTable). Is there an equivalent Select that works for temporary variables(i.e. @MetaData)?
UPDATE: This problem is subtly different from that in SQL Server dynamic PIVOT query?. In my case I have to preserve the order of the fields, something not required by that question.
WHY I NEED TO DO THIS:
- The python code is a GUI for non-technical people. They use the GUI to pick & chose which (or even all) SQL reports to run from a HUGE number of reports.
- Apps like Excel are being used to view these files: to keep our users happy each CSV file must have a header line. The header line will consist of the field names from the SQL table that holds the results of the query.
- These scripts can change at any time (e.g. add/delete a column) without any advance notice. To meet our users needs the header line must automatically "adjust itself" to make the corresponding changes. The SQL code below accomplishes this.
- The header line gets merged (using UNION) with the query results to form the result set (a cursor) that gets passed back to Python. Python then processes the returned data and creates the CSV file (including the header line) that gets used by our customers.
In a nutshell: We have many sites, many users, many queries. By having SQL "dynmically create" the header line we remove the headache of having to manually manage/coordinate/rollout the SQL changes to all affected parties.