3

I have a table which stores attendances of students for each day. I need to get students who are consecutively absent for 3 days. However, the dates when attendance is taken is not in a order, some days like nonattendance, holidays, weekends are excluded. The dates when students attended are the dates where records exist in that table .

The data is like

StudentId       Date           Attendance
-----------------------------------------
178234          1/1/2017          P
178234          5/1/2107          A
178234          6/1/2107          A
178234          11/1/2107         A
178432          1/1/2107          P
178432          5/1/2107          A
178432          6/1/2107          P
178432          11/1/2107         A

In the above case the result should be

StudentId        AbsenceStartDate     AbsenceEndDate     ConsecutiveAbsences
----------------------------------------------------------------------------
178234           5/1/2017             11/1/2017           3

I have tried to implement this solution Calculating Consecutive Absences in SQL However that only worked for dates in order only. Any suggestions will be great, thanks

Community
  • 1
  • 1
user1375481
  • 309
  • 1
  • 8
  • 19
  • [xkcd PSA ISO 8601](https://xkcd.com/1179/) – SqlZim Feb 25 '17 at 14:01
  • The dates when students attended are the dates where records exist in that table – user1375481 Feb 25 '17 at 14:01
  • So to confirm, all dates when a student should have attended are recorded in the table, and it is the `Attendance` column which tells you whether they were present (P) or absent (A) on that day? – Skippy Feb 25 '17 at 14:05
  • this might help http://stackoverflow.com/questions/14953294/how-to-get-running-sum-of-a-column-in-sql-server – Randy Feb 25 '17 at 14:15

2 Answers2

7

Oh, you have both absences and presents in the table. You can use the difference of row_numbers() approach:

select studentid, min(date), max(date)
from (select a.*,
             row_number() over (partition by studentid order by date) as seqnum,
             row_number() over (partition by studentid, attendance order by date) as seqnum_a
      from attendance a
     ) a
where attendance = 'A'
group by studentid, (seqnum - seqnum_a)
having count(*) >= 3;

The difference of row numbers gets consecutive values that are the same. This is a little tricky to understand, but if you run the subquery, you should see how the difference is constant for consecutive absences or presents. You only care about absences, hence the where in the outer query.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
1

try this:

declare @t table (sid int, d date, att char(1))
insert @t (sid,d, att) values
(178234, '1/1/2017','P'),
(178234, '5/1/2017','A'),
(178234, '6/1/2017','A'),
(178234, '11/1/2017','A'),
(178432, '1/1/2017','P'),
(178432, '5/1/2017','A'),
(178432, '6/1/2017','P'),
(178432, '11/1/2017','A')

Select s.sid, Min(s.d) startDt, Max(e.d) endDt, s.att, e.att, count(*)
from @t s join @t e on e.d <=  
   (select max(d) from @t m
    Where sid = s.sid
       and d > s.d
       and att = 'A'
       and not exists 
          (Select * from @t
           where sid = s.sid
              and d between s.d and m.d
              and att = 'P'))
Where s.att = 'A' 
   and s.d = (Select Min(d) from @t
              Where sid = s.sid
                and d < e.d
                and att = 'A')
group by s.sid, s.d, s.att, e.att

this is also tricky to explain: basically, it joins the table to itself using aliases s (for start) and e (for end), where the s-row is the first row in a set of contiguous absences, and the e. rows are all following absences that are before the next date where the stud is present. This will generate a set of all the 'A' that do not have a P row within them. Then the sql groups by the appropriate values to return the earliest and latest date, and the count of rows, in each group.
The last where clause ensures that the s row is the first row in the group.

Charles Bretana
  • 143,358
  • 22
  • 150
  • 216