2

I am currently working on a project that needs to utilize data stored in a SQL db in the format shown below.

What I need to do, and am having difficulty finding a tool to help with, is to take this data and create a multi column single record keyed off of the ID and Event Date. While I have found some info on the net about event driven handlers, the problem is that although my labels say "Event" what is really stored is individual pieces of data called "Events" that are really individual entries that often, but not always, are entered at the same time and contribute to a whole event entry. I am sorry if this seems a bit confusing but I am working with a system I have no control over to change I just need to make useful sense of the data.

Here is a running example of what is stored in the db that I need to convert to a single record;

ID      Event           Event Result  Event Result Value            Event Date      Entered By
909909  Status Change       No          no                          10/25/12 23:10  Doe, John
909909  Disposition         Remuni      Remains in place            10/25/12 23:10  Doe, John
909909  Event duration      15mins      15mins                      10/25/12 23:10  Doe, John
909909  Event end time      23:25       23:25                       10/25/12 23:10  Doe, John
909909  Event start time    23:10       23:10                       10/25/12 23:10  Doe, John
909909  Event type          Iniqrs      Initial System activation   10/25/12 23:10  Doe, John

Into the following single record;

ID      Event Start Event End   Moved               Action Taken                Entered By
909909  23:10       23:25       Remains in place    Initial System activation   Doe, John

Thank you in advance for any help you can give.

Chad
  • 7,279
  • 2
  • 24
  • 34
Mark
  • 21
  • 1
  • What if multiple users enter parts of the same "Event"? How do you want to handle that? – Iskar Jarak Oct 31 '12 at 20:22
  • John, in the past for a short version of events possible I did a pretty complex SSIS that used conditional branching and the OLE DB command component to update the record column by column. However I was hoping to create a more dynamic solution that would allow me to use a config/xml/table to reference for the columns I want to build and the events I want to process. Does that make sense? In fact since writing this I have been thinking that a custom ssis component that would be a destination component would be a good solution, albeit a hard one to build initially as I have never done one. – Mark Nov 01 '12 at 18:45

2 Answers2

1

While I agree with Bailey S. that Postgres is 'the best database engine on earth', I also see that you are using SQL Server.

The MSSQL function you need is called a 'Pivot' function and you can find the documentation here.

It can be a bit tricky to set up initially, but it is totally doable and will give you the results you are looking for.


Alternatively, there is a technique called a 'self join' and it looks like this:

SELECT
    [T1].[ID],
    [StatusChange].[EventResultValue] AS 'StatusChange',
    [Disposition].[EventResultValue] AS 'Disposition',
    ...
FROM
    (
        SELECT DISTINCT
            [EventsTable].[ID]
        FROM
            [EventsTable]
    ) AS [T1] LEFT OUTER JOIN
    [EventsTable] AS [StatusChange] ON [T1].[ID] = [StatusChange].[ID] AND [StatusChange].[Event] = 'Status Change' LEFT OUTER JOIN
    [EventsTable] AS [Disposition] ON [T1].[ID] = [Disposition].[ID] AND [Disposition].[Event] = 'Disposition' LEFT OUTER JOIN
    ...

A bit of a warning:
If you use this technique it will call the table each time it is listed in the FROM clause. There could be performance penalties it you need several columns or if the table has a large number of records.


Another option is to use a group & case statement together:

SELECT
    [ID],
    MAX(CASE WHEN [Event] = 'Status Change' THEN [EventResult] ELSE NULL END) AS [Event Start],
    MAX(CASE WHEN [Event] = 'Disposition' THEN [EventResult] ELSE NULL END) AS [Event End]
    ...
FROM
    [EventsTable]
GROUP BY
    [ID]

A bigger bit of warning:
If you use this technique it is guaranteed to return only one record per [ID]. If you have multiple Events of the same type (i.e. the Event start time could appear twice for any given ID, then do not use this as some of those records will not be returned.

losthorse
  • 1,530
  • 1
  • 13
  • 33
  • +1 for the options. One nitpick with your final warning though - the records will all be considered, and the "maximum" one (from a string comparison perspective) will be represented in the results. – Chad Oct 31 '12 at 21:03
  • Would this mean that if someone entered 2 start time events it would choke or one would be dropped? – Mark Nov 01 '12 at 18:39
  • @Mark - in the group/case query as in my answer, it would pick the `MAX` of the two start time events if they had the same `ID` and `Event Date`. losthorse's self join would give you a duplicate result row and I'm not sure how the `pivot` would behave. – Chad Nov 01 '12 at 19:13
  • @ChadHenderson - thanks for the info, I do have a concern about that issue. Right now I am looking into the feasibility of building an SSIS destination component that would allow me to configure what events to process and turn into columns and how to handle duplicates. Any thoughts on going this route? – Mark Nov 01 '12 at 19:18
  • @Mark - unfortunately I'm not familiar with SSIS at all. Best of luck though! – Chad Nov 01 '12 at 19:24
0

There's a trick I like to break out for scenarios like this. The idea is that you want to group your data down to one id. If you use MAX() in conjuction with a CASE you can then basically pick a value from a desired row in that group and turn it into a column in the result set.
Here's the query:

select
  id,
  max(case when Event = 'Event start time' then [Event Result Value]
    else null end) as [Event Start],
  max(case when Event = 'Event end time' then [Event Result Value]
    else null end) as [Event End],
  max(case when Event = 'Disposition' then [Event Result Value]
    else null end) as [Moved],
  max(case when Event = 'Event type' then [Event Result Value]
    else null end) as [Action Taken],
  max([Entered By]) as [Entered By]
from Table1
group by id, [Event Date]

And here you can see it working in action.

Keep in mind that this assumes certain constraints. Namely, that for each id, [Event Date] combination there is only:

  • one unique [Entered By] value
  • one row with Event = 'Event start time'
  • one row with Event = 'Event end time'
  • one row with Event = 'Disposition'
  • one row with Event = 'Event Type'

Note: This query is grouping by Event Date. This means that a record with the same date but different time will be considered as part of a separate group. This will work as desired if you have two sets of records with timestamps of the same day but two different times. On the flip side, you might get burned by having a record excluded from its intended group if its timestamp is slightly off. You just need to make sure that your groups have consistent timestamps.

Chad
  • 7,279
  • 2
  • 24
  • 34
  • Thanks for the info. I am concerned that the issue could occur that there could be more than one if they created a new one at a different time but same date. I haven't worked all that out yet. As I mentioned above I am now leaning towards building something that would allow me to dynamically specify the events/columns I want to process so I am thinking about building a custom destination component. – Mark Nov 01 '12 at 18:48
  • @Mark, I added a note addressing your first concern. – Chad Nov 01 '12 at 19:10
  • One additional question regarding the above technique, is there a way to make the cases pull from a table? What I mean is that I would like to have the events set up in a more dynamic way so that if I have to change the events I am pulling for I don't have to rewrite the query. Thanks. – Mark Nov 02 '12 at 17:54
  • @Mark, that's a bit tricky. You could take a look at something like http://stackoverflow.com/questions/12210692/t-sql-dynamic-pivot for a starting point. – Chad Nov 02 '12 at 18:22