2

I want to line up multiple series so that all milestone dates are set to month zero, allowing me to measure the before-and-after effect of the milestone. I'm hoping to be able to do this using SQL server.

You can see an approximation of what I'm starting with at this data.stackexchange.com query. This sample query returns a table that basically looks like this:

+------------+-------------+---------+---------+---------+---------+---------+
|  UserID    |  BadgeDate  | 2014-01 | 2014-02 | 2014-03 | 2014-04 | 2014-05 |
+------------+-------------+---------+---------+---------+---------+---------+
|    7       | 2014-01-02  | 232     | 22      | 19      | 77      | 11      |
+------------+-------------+---------+---------+---------+---------+---------+
|    89      | 2014-04-02  | 345     | 45      | 564     | 13      | 122     |
+------------+-------------+---------+---------+---------+---------+---------+
|    678     | 2014-03-11  | 55      | 14      | 17      | 222     | 109     |
+------------+-------------+---------+---------+---------+---------+---------+
|    897     | 2014-03-07  | 234     | 56      | 201     | 19      | 55      |
+------------+-------------+---------+---------+---------+---------+---------+
|    789     | 2014-02-22  | 331     | 33      | 67      | 108     | 111     |
+------------+-------------+---------+---------+---------+---------+---------+
|    989     | 2014-01-09  | 12      | 89      | 97      | 125     | 323     |
+------------+-------------+---------+---------+---------+---------+---------+

This is not what I'm ultimately looking for. Values in month columns are counts of answers per month. What I want is a table with counts under relative month numbers as defined by BadgeDate (with BadgeDate month set to month 0 for each user, earlier months set to negative relative month #s, and later months set to positive relative month #s).

Is this possible in SQL? Or is there a way to do it in Excel with the above table?

After generating this table I plan on averaging relative month totals to plot a line graph that will hopefully show a noticeable inflection point at relative month zero. If there's no apparent bend, I can probably assume the milestone has a negligible effect on the Y-axis metric. (I'm not even quite sure what this kind of chart is called. I think Google might have been more helpful if I knew the proper terms for what I'm talking about.)

Any ideas?

Community
  • 1
  • 1
samthebrand
  • 3,020
  • 7
  • 41
  • 56

1 Answers1

2

This is precisely what the aggregate functions and case when ... then ... else ... end construct are for:

select
     UserID
    ,BadgeDate
    ,sum(case when AnswerDate = '2014-01' then 1 else 0 end) as '2014-01'
    -- etc.
group by
     userid
    ,BadgeDate

The PIVOT clause is also available in some flavours and versions of SQL, but is less flexible in general so the traditional mechanism is worth understanding.

Likewise, the PIVOT TABLE construct in EXCEL can produce the same report, but there is value in maximally aggregating the data on the server in bandwidth competitive environments.

Pieter Geerkens
  • 11,775
  • 2
  • 32
  • 52
  • Sorry I may not have been clear. The table provided above is what I already have, not what I'm hoping to return. I've edited the question to emphasize this fact. – samthebrand Sep 24 '14 at 04:20
  • 1
    @SamtheBrand: Okay. You can UNPIVOT, perform the date-difference by month, and then pivot out again in successive sub-queries. Aaron Bertrand has a script here (http://www.mssqltips.com/sqlservertip/2783/script-to-create-dynamic-pivot-queries-in-sql-server/) for dynamic pivot query generation. Itzik Ben-Gan has published an elegant UNPIVOT using CROSS APPLY but I cannt find the reference just now. – Pieter Geerkens Sep 24 '14 at 04:31
  • @SamtheBrand: Source for Itzik's CROSS APPLY UNPIVOT can be found here: http://www.sql.co.il/books/source_code/Advanced%20T-SQL.txt. Search for the string `-- *** Unpivoting` to locate the exampls SQL. – Pieter Geerkens Sep 24 '14 at 04:36