0

I want to do Data Analysis on some Stack Overflow Posts and need to get a query output in the right format. My goal is to input Post ID's and get my answers in the following format:

ID|Title|Question|Answer1|Answer2|Answer3|Answer4|Answer5|Answer...
__________________________________________________________________
1 |Tit 1|Quest 1 |1.Answ |2.Answ |3.Answ |4.Answ |5.Answ |Answer...
2 |Tit 2|Quest 2 |1.Answ |2.Answ |3.Answ |       |       |
3 |Tit 3|Quest 3 |1.Answ |2.Answ |3.Answ |4.Answ |       |

I am not familiar with writing queries on StackExchange but i managed to write a query to get almost the right output. My results is like this:

ID|Title|Question|Answer|
_________________________ 
1 |Tit 1|Quest 1 |1.Answ |
1 |Tit 1|Quest 1 |2.Answ |
1 |Tit 1|Quest 1 |3.Answ |
2 |Tit 2|Quest 2 |2.Answ |
2 |Tit 2|Quest 2 |2.Answ |
2 |Tit 2|Quest 2 |2.Answ |

As you can see i duplicate the Id,Title and Question for each answer. And the answers are in a column and not side by side. This is the query i managed to write. Can somebody help me with that or point me in the right direction?

select
    p.Id, p.Title, p.Body, k.Body
from
    Posts as p inner join
    Posts as k on
p.id = k.parentid
where
p.Id in (##id##) and k.posttypeid=2
Laurel
  • 5,965
  • 14
  • 31
  • 57
SK55
  • 1
  • If i understand it correctly my main problem is, that in the StackExchange Database Schema questions and answers are stored in the same table and the column is called body. The only thing that is different, that the questions have the posttypeid = 1 and the answers posttypeid = 2. – SK55 Oct 03 '15 at 18:15

1 Answers1

0

You'll want to PIVOT your table to turn the rows into columns.

Check out this article about Pivots. The downside is that you need to hard code each possible answer which you don't know how many there will be (Answer1, Answer2, ...).

Using STUFF to put it in one column - something like:

SELECT Id, Title, Q, 
STUFF(
(
    SELECT '|'+ Body FROM POSTS WHERE Id = t.Id FOR XML path('') ), 1, 1, '') 
FROM (
SELECT p.Id, p.Title, p.Body AS Q, k.Body AS ANS
FROM Posts as p 
INNER JOIN Posts AS k ON
p.id = k.parentid
WHERE p.Id in (##id##) AND k.posttypeid=2 ) t
Hannover Fist
  • 10,393
  • 1
  • 18
  • 39
  • unfortunately i have about 15.000 post ids that i want to query. So i cant know how many answers each question has. Do you have any other ideas? Does it get easier/is it possible if i try to get all answers (Answer1,Answer2,..AnswerN) in one cell. Thx for your answer – SK55 Oct 01 '15 at 17:30
  • If you want all answers in one column, you can use STUFF - http://stackoverflow.com/questions/21623593/what-is-the-meaning-of-select-for-xml-path-1-1 . – Hannover Fist Oct 01 '15 at 17:43
  • Thx for your answer. I tried to work with it, but didn't fix my problem yet. Using your query i get this error message: The column 'Body' was specified multiple times for 't'. The multi-part identifier "p.Id" could not be bound. The multi-part identifier "p.Title" could not be bound. The multi-part identifier "p.Body" could not be bound. The multi-part identifier "k.Body" could not be bound. – SK55 Oct 03 '15 at 17:36
  • Sorry - I wrote that off the top of my head and I forgot to alias the columns (the 2 body columns were causing the conflict). – Hannover Fist Oct 05 '15 at 15:28
  • Thx again, for your response. The Error is shorter now: The multi-part identifier "k.Body" could not be bound. i will try to get fixed it on my own, but chances are bad :) – SK55 Oct 05 '15 at 20:46
  • Ahh - just remove the **k.** from `select '|'+ k.Body`. I'll update the answer. – Hannover Fist Oct 05 '15 at 20:51