0

I have a raw data like


Title    Question  Answer  AnswerRemark
----------------------------------------
ACCCode1    Q1       Y        NULL
ACCCode1    Q2       N        6
ACCCode1    Q3       Y        Workout

As you can see the field "AnswerRemark" is free text for "Answer", some answer is not require remark.

I can simply pivot the question and answer like:

Title         Q1   Q2   Q3
AccessCode1   Y    N    Y

My desired Result will be

Title         Q1   R1   Q2   R2   Q3   R3
AccessCode1   Y   NULL  N    6     Y   Workout 

Is that possible? I can not figure it out how to achieve this, pivot the Answer is not good idea as it have many combinations.

Any suggestion?

Alan LI
  • 3
  • 1
  • HOw many questions could there be? Only 3? – Thom A Dec 29 '17 at 09:52
  • Thank you for your reply. No, lot of questions, it depends on user request – Alan LI Dec 29 '17 at 09:59
  • If you could have an unlimited number of Questions, and Answers, this is going to get ugly, quickly. you're going to need to use Dynamic SQL to achieve this. I'll see if i get a chance to do this later for you, but most likely any solution is going to perform poorly at best. Things like this are normally better being done on the presentation layer. – Thom A Dec 29 '17 at 10:19
  • Thanks Larnu. Yes I agree, it must be dynamic sql in this case; actually I solved this problem by a stupid workaround, append the remark behind the answer, of course it is not the ideal one, it just a quick fix. I done this in sql server because user need the result in excel pivot table, directly read from DB. Seems like the second reply solve my problem, I’ll try next week. – Alan LI Dec 29 '17 at 10:34

3 Answers3

2

Use Conditional Aggregation :

SELECT Title,
    MAX(CASE WHEN Question='Q1' THEN Answer  END) as Q1  ,
    MAX(CASE WHEN Question='Q1' THEN AnswerRemark  END) as R1 ,
    MAX(CASE WHEN Question='Q2' THEN Answer  END) as Q2  ,
    MAX(CASE WHEN Question='Q2' THEN AnswerRemark  END) as R2 ,
    MAX(CASE WHEN Question='Q3' THEN Answer  END) as Q3  ,
    MAX(CASE WHEN Question='Q3' THEN AnswerRemark  END) as R3 
FROM [tablename]
GROUP BY Title
  • This is the right idea with a determinable amount of questions, and answers. The OP, however, has advised this isn't the case. – Thom A Dec 29 '17 at 10:21
1

Using Pivot we get the result

 ;With cte(Title, Question,Answer,AnswerRemark)
AS
(
SELECT 'ACCCode1','Q1','Y',NULL      UNION ALL
SELECT 'ACCCode1','Q2','N','6'       UNION ALL
SELECT 'ACCCode1','Q3','Y','Workout' UNION ALL
SELECT 'ACCCode1','Q2','N','7'       UNION ALL
SELECT 'ACCCode1','Q1','Y',NULL      UNION ALL
SELECT 'ACCCode1','Q3','N','9'       UNION ALL
SELECT 'ACCCode1','Q1','N','4'       UNION ALL
SELECT 'ACCCode1','Q2','N','Workout' UNION ALL
SELECT 'ACCCode1','Q4','N','2'       UNION ALL
SELECT 'ACCCode1','Q3','Y','Workout' UNION ALL
SELECT 'ACCCode1','Q1','N','1'       UNION ALL
SELECT 'ACCCode1','Q4','Y',NULL
)

SELECT *,'Remark'+CAST(ROW_NUMBER()OVER(ORDER BY (SELECT 1))AS varchar(10)) AS Question2 
, ROW_NUMBER()OVER(PArtition  by Question Order by Question ) AS Seq  
INTO #t FROM cte

Using Dynamic Sql where the columns are not static

DECLARE @DyColumn1 Nvarchar(max),
        @DyColumn2 Nvarchar(max),
        @Sql Nvarchar(max),
        @MAxDyColumn1 Nvarchar(max),
        @MAxDyColumn2 Nvarchar(max),
        @CombineColumn Nvarchar(max)


SELECT @DyColumn1=STUFF((SELECT DISTINCT ', '+QUOTENAME(Question) FROM #t FOR XML PATH ('')),1,1,'')
SELECT @DyColumn2=STUFF((SELECT ', '+QUOTENAME(Question2) FROM #t FOR XML PATH ('')),1,1,'')

SELECT @MAxDyColumn1=STUFF((SELECT DISTINCT ', '+'MAX('+QUOTENAME(Question)+') AS '+QUOTENAME(Question) FROM #t FOR XML PATH ('')),1,1,'')
SELECT @MAxDyColumn2=STUFF((SELECT ', '+'MAX('+QUOTENAME(Question2)+') AS '+QUOTENAME(Question2) FROM #t FOR XML PATH ('')),1,1,'')

SELECT @CombineColumn=STUFF((SELECT DISTINCT ', '+QUOTENAME(Question)+','+QUOTENAME(Question2) FROM #t FOR XML PATH ('')),1,1,'')



SET @Sql='SELECT Title,'+@CombineColumn+' From
(
SELECT Title,'+@MAxDyColumn1+','+@MAxDyColumn2+' FRom
(
SELECT * FROM #t
)AS SRC
    PIVOT
    (
    MAX(Answer) FOR Question IN('+@DyColumn1+')
    ) AS Pvt1

    PIVOT
    (
    MAX(AnswerRemark) FOR Question2 IN('+@DyColumn2+')
    ) AS Pvt2
    GROUP BY Title
    )dt
'
PRINT @Sql
EXEC(@Sql)

Result

Title   Q1  Remark1 Q1  Remark2 Q1  Remark3 Q1  Remark4 Q2  Remark5 Q2  Remark6 Q2  Remark7 Q3  Remark8 Q3  Remark9 Q3  Remark10    Q4  Remark11    Q4  Remark12
ACCCode1    Y   NULL    Y   1   Y   4   Y   NULL    N   6   N   Workout N   7   Y   Workout Y   Workout Y   9   Y   NULL    Y   2
Sreenu131
  • 2,476
  • 1
  • 7
  • 18
  • Thanks Sreenu131. I’ll try your answer next week. Actually I did see the similar sample like yours but I was failed to simulate, maybe I’m not quite understand the logic. Thanks again. Will let you know. – Alan LI Dec 29 '17 at 10:43
  • That's fine bro,using dynamic sql we can avoid static code ,it generates the column dynamically – Sreenu131 Dec 29 '17 at 11:00
  • Thank you Sreenu131, your answer work perfectly. Just a little amendment to work for me: The dynamic sql need to combine the table declaration, and I changed "SELECT * INTO #t FROM cte" to "SELECT *,'Remark'+CAST(ROW_NUMBER()OVER(ORDER BY (SELECT 1))AS varchar(10)) AS Question2 INTO #t FROM cte", because it need to work with "Question2" column. thanks again! – Alan LI Jan 02 '18 at 04:16
  • Happy to help bro :) – Sreenu131 Jan 02 '18 at 07:39
  • Nice to meet you again, your answer is great, but I have a minor issue that those dynamic question columns is not in sequence, how do I sort them? The Question columns are joint from a table and they are have a seq field. – Alan LI Jan 02 '18 at 07:50
  • bro give me some time i will let you know – Sreenu131 Jan 02 '18 at 08:02
  • @AlanLI Bro check the code changes which i done ,check the result i think that would be your expecting – Sreenu131 Jan 02 '18 at 09:40
  • Thanks for your work. When I run the result, the order like Q1 Q10 Q12 Q2 etc.... Is the new added Seq column in #t no need to use? – Alan LI Jan 02 '18 at 09:53
  • I’m thinking is that possible that looping all columns in the desired order and add to @DyColumn1 and 2, that may solve the order problem – Alan LI Jan 02 '18 at 11:04
0

I don't know how big your data is, or how many questions are possible. A more generic Q&A structure done at the presentation layer would be far better, but for your specific request a more correct design would be a 3NF table. This will allow you to create a primary key that is highly optimised and create a secondary index by question type id. All your keys are now IDs which are far faster to search and match than strings:

Account Codes
AccID - AccName - columns for other data related to accounts

Stores each account you have.

Questions
QuestionID - QuestionName

List of possible questions, one row for every question you have, Q1, Q2 etc. You could add question categories here to exploit any commonality you have, e.g. if you have different surveys with the same set of questions, you could put them in one category and easily then query the below.

Results
AccId, QuestionID, Result, Result Remark

Contains one row for every question asked.

Query for your result still uses pivot, but now you can select the list of columns to use from a variable or dynamic SQL syntax, which means you can control it somewhat better and hte query itself should be better.

With that said, if you have any knowledge about your data whatsoever you can use it to make a static query which can then be indexed. Examples are here of this query: SQL Server 2005 Pivot on Unknown Number of Columns. You can then set the column names if required using the AS syntax, which unfortunately would require dynamic sql again (Change column name while using PIVOT SQL Server 2008).

By the way, what you are trying to do is specifically dealing with denormalised data, which is what nosql is good for, SQL Server gives you great help but you have to have some structure to your data.

If you aren't working for survey monkey and dealing with millions of variations, I'd seriously look at whether you can just make a table specific to each round of questions you get, and then simply denormalise it and add an explicit columns for each question and then make your entire logic just a select * from surveyxyztable where accountid = abc.

Spence
  • 28,526
  • 15
  • 68
  • 103
  • Thanks Spence. My table structures are like your saids, more or less I guess. I have a survey table, a question table, a choices table and a choice supplement table. 1 survey contains many questions, 1 question contain many choices, and 1 choice can have or null supplement input. I can join those tables like your saids, and pivot them using the method like the answer I marked. It done 99% for me and now just sorting problem. The project is not large but I don’t want to specify one table to one survey as each time it need to modify it requires dB changes and rebuild the project. – Alan LI Jan 02 '18 at 10:54