2

I apologize in advance - I'm not a DB expert, so there may be a very obvious answer to this that I've missed.

I have a query that counts number of consecutive check-ins by users of a particular status. There are 5 check-in statuses, and we need to count the consecutive check-ins of status 'U' (there are 5 possible statuses: 'U', 'O', 'D', 'F', 'M'). If someone checks in with any other status, the streak stops. After much learning, testing, and failing, I came up with the following query to return the longest streak for any user based primarily on this stackoverflow thread and this stack overflow thread (thanks to authors of those posts/solutions!).

SELECT tmi, static_streak, MAX(count) AS consec
FROM (
    SELECT @prev_tmi := @tmi AS prev_tmi,
            @prev_s := @status AS prev_s,
            @tmi := rtmc.team_member_id AS tmi,
            @status := rtmc.status AS status,
            @count := if(@prev_s = @status AND @status = 'U', @count + 1, 1) AS count
    FROM readiness_team_member_checkins rtmc
    CROSS JOIN (SELECT @count:=0) var_init
    LEFT JOIN readiness_sessions rs ON rtmc.readiness_sessions_id = rs.readiness_sessions_id 
    WHERE rtmc.team_member_id = 83
    ORDER BY rtmc.readiness_sessions_id, tmi
)
AS sub 
GROUP BY tmi

Here are the tables/test data:

CREATE TABLE IF NOT EXISTS `readiness_sessions` (
  `readiness_sessions_id` int(11) NOT NULL AUTO_INCREMENT,
  `session_date` date NOT NULL,
  `session_level` varchar(3) DEFAULT NULL,
  PRIMARY KEY (`readiness_sessions_id`,`session_date`)
);

INSERT INTO `readiness_sessions` (`readiness_sessions_id`,
                                  `session_date`, 
                                  `session_level`
                                  ) VALUES
(1, '2015-09-02', '4'),
(2, '2015-09-03', '4'),
(6, '2015-09-04', '4'),
(7, '2015-09-05', '4'),
(8, '2015-09-06', '4'),
(10, '2015-09-07', '4'),
(11, '2015-09-07', '3'),
(12, '2015-09-08', '4'),
(13, '2015-09-09', '4'),
(14, '2015-09-29', '4'),
(15, '2015-09-30', '4'),
(16, '2015-10-01', '3'),
(17, '2015-10-02', '4'),
(18, '2015-10-06', '4'),
(19, '2015-10-20', '4');

CREATE TABLE IF NOT EXISTS `readiness_team_member_checkins` (
  `readiness_team_member_checkins_id` int(11),
  `readiness_sessions_id` int(11),
  `readiness_team_checkins_id` int(11),
  `team_member_id` int(11),
  `status` enum('U','D','O','M','F','X')

);

INSERT INTO `readiness_team_member_checkins` 
  (`readiness_team_member_checkins_id`, 
   `readiness_sessions_id`, 
   `team_member_id`, 
   `status`) VALUES
(1, 1, 83, 'U'),
(2, 2, 83, 'O'),
(3, 6, 83, 'U'),
(4, 7, 83, 'U'),
(5, 8, 83, 'U'),
(6, 10, 83, 'M'),
(7, 11, 83, 'U'),
(8, 12, 83, 'U'),
(9, 13, 83, 'U'),
(10, 14, 83, 'U'),
(11, 15, 83, 'U'),
(12, 16, 83, 'D'),
(13, 17, 83, 'U'),
(14, 18, 83, 'U'),
(15, 19, 83, 'U');

I now have an added feature that requires a second status to be considered in certain circumstances.

The requirement for the new process basically states that a check-in status of 'U' or 'D' will not break the consecutive counting but only 'U' will increment the count - 'D' will be ignored but not break the consecutive counting. I've tried massaging the above query with additional user-defined variables, but I'm rather unfamiliar with this level of mysql.

The included data should return 5 for the current implementation, and would return 8 for the additional requirement.

I have two questions:

  1. The current query usually returns the correct data, but sometimes it will return an incorrect number and I need to reload the query and will get the right number. I hypothesize that this is due to the user-defined variables, which I basically learned to use last week. Any hints on this? I've been unable to reproduce the problem consistently...

  2. Any hints on how I can adapt the above cleanly to get the longest consecutive streak of 'U' not interrupted by 'D' as described in the secondary requirement?

SQL Fiddle here

Community
  • 1
  • 1
hcexile
  • 446
  • 7
  • 22

3 Answers3

1

I update your sqlfiddle with the data in your question

@count := if(condition1, true1, false1)

condition1 = ( @prev_s = 'U' OR @prev_s = 'D') <- mean i will continue chain
true1 = if(condition2, true2, false2)
false1 = 1    <- start new chain

condition2 = (@status = 'U')
true2  = @count + 1   <- increase count
false2 = if(condition3, true3, false3)

condition3 = (@status = 'D')
true3 = @count   <- keep the chain count
false3 = 1       <- not 'U' or 'D' reset count

SQL FIDDLE DEMO

SELECT tmi, MAX(count) AS consec
    FROM (
      SELECT @prev_tmi := @tmi AS prev_tmi,
          @prev_s := @status AS prev_s,
          @tmi := rtmc.team_member_id AS tmi,
          @status := rtmc.status AS status,
          @count := if(@prev_s = 'U' OR @prev_s = 'D',
                       if(@status = 'U',
                          @count + 1, 
                          if(@status = 'D', @count, 1)
                         ),
                       1
                      ) AS count
      FROM readiness_team_member_checkins rtmc
      CROSS JOIN (SELECT @count:=0) var_init
      LEFT JOIN readiness_sessions rs ON rtmc.readiness_sessions_id = rs.readiness_sessions_id 
      WHERE rtmc.team_member_id = 83 AND rs.session_level in (3,4)
      ORDER BY rtmc.readiness_sessions_id, tmi
    )
AS sub 
GROUP BY tmi

partial output

| prev_tmi | prev_s | tmi | status | count |
|----------|--------|-----|--------|-------|
|       83 | (null) |  83 |      U |     1 |
|       83 |      U |  83 |      O |     1 |
|       83 |      O |  83 |      U |     1 |
|       83 |      U |  83 |      U |     2 |
|       83 |      U |  83 |      U |     3 |
|       83 |      U |  83 |      M |     1 |
|       83 |      M |  83 |      U |     1 |
|       83 |      U |  83 |      U |     2 |
|       83 |      U |  83 |      U |     3 |
|       83 |      U |  83 |      U |     4 |
|       83 |      U |  83 |      U |     5 |
|       83 |      U |  83 |      D |     5 |  <- skip count / no break chain
|       83 |      D |  83 |      U |     6 |
|       83 |      U |  83 |      U |     7 |
|       83 |      U |  83 |      U |     8 |
Juan Carlos Oropeza
  • 47,252
  • 12
  • 78
  • 118
  • This looks like it worked well. I added: `SELECT max(consec) as maxConsec FROM ( ... ) b;` to only return the highest, which is all I needed. Currently validating on a much bigger dataset! – hcexile Oct 29 '15 at 00:03
1

The following query counts all the sequences for a given member:

SELECT team_member_id, grp, count(*) AS consec
FROM (SELECT rtmc.*,
             (@grp := if(@ms = concat_ws(':', rtmc.team_member_id, rtmc.status), @grp,
                         if(@ms := concat_ws(':', rtmc.team_member_id, rtmc.status), @grp + 1, @grp + 1)
                         )
             ) as grp
      FROM readiness_team_member_checkins rtmc CROSS JOIN
           (SELECT @grp := 0, @ms := '') params  
      WHERE rtmc.team_member_id = 83
      ORDER BY rtmc.readiness_sessions_id, team_member_id
     ) s
GROUP BY team_member_id, grp;

You can add order by count(*) desc limit 1 if you only want the longest.

Notes about using variables:

  • Don't assign variables and use them in different expressions in the same select. The MySQL documentation explicitly warns against this; this is probably why your query works sometimes but not always.
  • It is a good practice to initialize all the variables in the SQL statement. I do so in a subquery called params.
  • The table readiness_sessions doesn't seem to be used in the query so I removed it.
  • To get only one status, add where status = 'U' in the outer query.
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Hello Gordon. Your query return lenght of chains. But OP say want the lenght of `U` chain, but when 'D' doesnt break the chain. Of course if you add the 'U' variable this will solve the original question better. But not the new ones. – Juan Carlos Oropeza Oct 27 '15 at 21:57
  • nevermind, I just realize you solve first question but still need add the 'U' – Juan Carlos Oropeza Oct 27 '15 at 21:58
  • @Xmy I just have to read this http://www.xaprb.com/blog/2006/12/15/advanced-mysql-user-variable-techniques/ to understand a litle more about variables. – Juan Carlos Oropeza Oct 27 '15 at 22:41
1

after reading Gordon answer for part 1, I got another version for part 2

I create a dummy column to join 'D', 'U' as a single group

And replace a COUNT() for a SUM()

SqlFiddleDemo

SELECT team_member_id, grp, status, SUM(CASE WHEN status = 'U' THEN 1 ELSE 0 END ) AS consec
FROM (SELECT rtmc.*,
             @ms,
             (@grp := if(@ms = concat_ws(':', rtmc.team_member_id, rtmc.dummy_status), @grp,
                         if(@ms := concat_ws(':', rtmc.team_member_id, rtmc.dummy_status), @grp + 1, @grp + 1)
                         )
             ) as grp
      FROM (SELECT rtmc.*, CASE 
                               WHEN status = 'D' THEN 'U'
                               ELSE status
                           END as dummy_status
            FROM 
            readiness_team_member_checkins rtmc 
            ) rtmc
      CROSS JOIN
           (SELECT @grp := 0, @ms := '') params  
      WHERE rtmc.team_member_id = 83
      ORDER BY rtmc.readiness_sessions_id, team_member_id
     ) s
GROUP BY team_member_id, grp, status;

OUTPUT

| team_member_id | grp | status | consec |
|----------------|-----|--------|--------|
|             83 |   1 |      U |      1 |
|             83 |   2 |      O |      0 |
|             83 |   3 |      U |      3 |
|             83 |   4 |      M |      0 |
|             83 |   5 |      U |      8 |
|             83 |   5 |      D |      0 |
Juan Carlos Oropeza
  • 47,252
  • 12
  • 78
  • 118