I'm having trouble writing a check that will find missing records.
My tables are Provider, ProviderRelationship, Agreement, and AgreementProvider. I need to find missing records where the Provider has a ProviderRelationship, but no record for AgreementProvider for that specific time frame.
Both ProviderRelationship and AgreementProvider have StartDate and EndDate fields. Providers can only have 1 current ProviderRelationship, but they can have multiple ProviderRelationship entries. For example, ProviderA has ProviderRelationship1 on 3/17/2019, then ProviderRelationship2 from 3/18/2019 to 6/30/2020 and AgreementProvider2 for the same dates.
The closest I've come is this, which gets Providers that have never had any AgreementProvider entries at all. When I add in the date validation, it also includes the previous ProviderRelationships.
CREATE TABLE dbo.Provider
ProviderID INT IDENTITY(1,1) NOT NULL,
Name VARCHAR(200) NULL
ProviderID Name
9003055 ABC
6102 DEF
2743 Desired
9999 Ideal
CREATE TABLE dbo.ProviderRelationship
ProviderRelationshipID INT IDENTITY(1,1) NOT NULL,
ParentProviderID INT NOT NULL,
ProviderID INT NOT NULL,
StartDate DATE NOT NULL,
EndDate DATE NOT NULL
ProviderRelationshipID ParentProviderID ProviderID StartDate EndDate
1 9003055 9003055 2017-03-01 2017-03-27 --providers who are the parent provider should not appear
2 1021 9003055 2017-03-27 2017-03-27
3 1021 9003055 2017-03-28 2100-01-01 --should not appear
4 184 6102 2015-07-01 2015-07-01
5 6102 6102 2015-07-02 2100-01-01
6 244 2743 2015-07-01 2100-01-01 --there is no AgreementProviderID record for this one
7 1234 9999 2018-08-01 2019-09-01
8 4321 9999 2019-10-01 2100-01-01 --this is the ideal result, since there is a gap in AgreementProvider records
CREATE TABLE dbo.Agreement
AgreementID INT IDENTITY(1,1) NOT NULL
ProviderID INT NULL, --this is the ParentProviderID
RefRegionID INT NULL
AgreementID ProviderID
1 1021
2 1021
3 184
4 184
5 184
6 184
7 244
8 244
9 244
10 1234
11 4321
CREATE TABLE dbo.AgreementProvider
AgreementProviderID INT IDENTITY(1,1) NOT NULL
AgreementID INT NULL,
ProviderID INT NULL,
StartDate DATE NULL,
EndDate DATE NULL
AgreementProviderID AgreementID ProviderID StartDate EndDate
1 1 9003055 2017-03-28 2020-06-30
2 2 9003055 2017-03-28 2020-06-30
3 10 1234 2018-08-01 2020-06-30
SELECT DISTINCT o.*
FROM Provider p
JOIN dbo.ProviderRelationship prel ON prel.ProviderID = p.ProviderID
JOIN dbo.Agreement a ON a.ProviderID = prel.ParentProviderId
JOIN dbo.AgreementProvider ap ON ap.AgreementID = a.AgreementID
WHERE prel.ProviderID <> prel.ParentProviderId
AND p.ProviderID NOT IN (SELECT ap2.ProviderID FROM dbo.AgreementProvider ap2
JOIN dbo.ProviderRelationship prel2 ON prel2.ProviderID = ap2.ProviderID
WHERE prel.ParentProviderId = prel2.ParentProviderId)
It should not compare the date to a previous date, so ProviderID 9003055 should not appear. It should also not consider ProviderID when the ParentProviderID matches. It should get both Desired and Ideal ProviderIDs: Desired does not have a record, and Ideal does not have a current record for the current ProviderRelationship.
How would I rewrite this to include the specific time frame for each ProviderRelationship compared to the AgreementProvider information for those dates?
Ideal output would be this:
ProviderID ParentProviderID RelationshipStartDate RelationshipEndDate
2743 244 2015-07-01 2100-01-01
9999 4321 2019-10-01 2100-01-01
Maybe it's already doing what I want, because there are invalid entries for the ProviderIDs that I don't want included. Maybe this is good enough, I will have to confirm.