Count criteria: A "true" count, by month-year, of the number of students (indexed by a unique Student ID) at each School by School Service Type.
I am currently using MS-Access. I need to be able to extract counts for a database with similar data as the sample data below. Each row is a payment observation.
Sample data:
Student ID | School | School Service Type | PaymentStartDate | PaymentEndDate |
001 ABC ED 01/02/2010 02/04/2012
001 ABC ED 01/02/2010 01/05/2010
001 ABC ED 04/02/2010 05/05/2010
001 DEF EZ 01/02/2010 02/04/2012
001 RR 02/02/2012 02/03/2012
002 ABC ED 02/02/2010 02/03/2011
002 ABC EZ 02/02/2010 06/03/2010
002 GHI ED 02/04/2011 02/04/2012
003 ABC ED 02/02/2011 02/03/2012
003 DEF ED 01/02/2010 08/03/2010
003 RR 02/02/2011 02/03/2011
004 RR 02/02/2011 02/03/2011
005 GHI ED 08/02/2010 02/04/2011
006 GHI ED 08/02/2010 08/02/2010
What I want my count to look like:
Month | Year | School | ED | EZ | RR |
01 2010 ABC 1 0 0
01 2010 DEF 1 1 0
01 2010 GHI 0 0 0
02 2010 ABC 2 1 0
03 2010 ABC 2 1 0
Example
Desire: I want a true count of students, across schools, by service type, for January 2010
Biggest hurdle - Duplicates
Please take a look at the observations for Student ID 001
. Between January 2010 and February 2012, School ABC
received 3 payments for the exact same Service Type (ED)
for Student 001.
I want my January 2010 count of students that received ED
services at school ABC
to return a value of 1, since only one student (Student 001
) received services for ED
at that school for that time.
However, when I conduct a crosstab to retrieve my counts with this data as it is, it returns a value of 2 (for two students). The reason being that both payment #1 and payment #2 of Student 001
meet my month-year criteria of January 2010.
Payment #1 meets the criteria because January 2010 lies within the payment date range of 01/02/2010 - 02/04/2012*.
Payment #2 also meets the criteria because January 2010 lies within the payment date range of 01/02/2010 - 01/05/2010.
Payment #3 does not meet the criteria because January 2010 is not within the date range for that row (04/02/2010 - 05/05/2010).
*You get these dates from the PaymentStartDate
and PaymentEndDate
respectively.
I prepared an Excel version of the sample data here:Link to Excel file
Please keep in mind that:
The time elapsed between the PaymentStartDate and PaymentEndDate is extremely variable across the board, and ranges from 0 days to 122 days.
Many times, there are truly unique payment observations where the time lapsed between PaymentStartDate and PaymentEndDate is 0 days (take a look at the payment row for Student ID 006 in the data above). Therefore, getting rid of rows that do not meet a specified "time lapsed between PaymentStartDate and PaymentEndDate" criteria is not an option, since many times they are not the duplicates I am trying to get rid of for my counts.
Yes, for certain Service Types, there is no School value.
As always, any helpful ideas of advice on how I can resolve this duplicates issue, and retrieve my true count values within MS-Access are greatly appreciated. Thank you for your time.
Edit (02/10/2014): Changed the count output above to reflect the sample data I provided in my post. My apologies for not doing so before.