29

I've a simple table in sql server 2005 with 3 columns: DateStart, DateEnd and Value. I tried to set a "table check constraint" to avoid inserting overlapping records. For instance if in such table there is a record with DateStart = 2012-01-01 (first January) and DateEnd 2012-01-15 (15th January) than Check constraint must avoid inserting a record with DateStart=2012-01-10 (no care DateEnd), a record with DateEnd=2012-01-10 (no care DateStart) or a record with DateStart 2011-12-10 and DateEnd 2012-02-01.

I defined a UDF in such way:

CREATE FUNCTION [dbo].[ufn_checkOverlappingDateRange]
(
    @DateStart AS DATETIME
    ,@DateEnd AS DATETIME
)
RETURNS BIT 
AS
BEGIN
  DECLARE @retval BIT
  /* date range at least one day */
  IF (DATEDIFF(day,@DateStart,@DateEnd) < 1)
    BEGIN
      SET @retval=0
    END
  ELSE
    BEGIN
      IF EXISTS
        (
          SELECT
              *
            FROM [dbo].[myTable]
            WHERE
            ((DateStart <= @DateStart) AND (DateEnd > @DateStart))
            OR
            ((@DateStart <= DateStart) AND (@DateEnd > DateStart))
        )
        BEGIN
          SET @retval=0
        END
    ELSE
      BEGIN
            SET @retval=1
          END
        END
  RETURN @retval
END

Then thought check could be this:

ALTER TABLE [dbo].[myTable]  WITH CHECK ADD  CONSTRAINT [CK_OverlappingDateRange] CHECK  ([dbo].[ufn_checkOverlappingDateRange]([DateStart],[DateEnd])<>(0))

But even with [myTable] empty EXISTS Operator returns true when i insert first record. Where i'm wrog ? Is it possible to set a constraint like this ?

BTW I consider DateStart includes in range and DateEnd excludes from range.

Antonio Manello
  • 464
  • 1
  • 6
  • 12

3 Answers3

38

The CHECK is being executed after the row has been inserted, so the range overlaps with itself.

You'll need to amend your WHERE to include something like: @MyTableId <> MyTableId.


BTW, your WHERE expression can be simplified.

Ranges don't overlap if:

  • end of the one range is before the start of the other
  • or start of the one range is after the end of the other.

Which could be written in SQL like:

WHERE @DateEnd < DateStart OR DateEnd < @DateStart

Negate that to get the ranges that do overlap...

WHERE NOT (@DateEnd < DateStart OR DateEnd < @DateStart)

...which according to De Morgan's laws is the same as...

WHERE NOT (@DateEnd < DateStart) AND NOT (DateEnd < @DateStart)

...which is the same as:

WHERE @DateEnd >= DateStart AND DateEnd >= @DateStart

So your final WHERE should be:

WHERE
    @MyTableId <> MyTableId
    AND @DateEnd >= DateStart
    AND DateEnd >= @DateStart

[SQL Fiddle]

NOTE: to allow ranges to "touch", use <= in the starting expression, which would produce > in the final expression.

Branko Dimitrijevic
  • 50,809
  • 10
  • 93
  • 167
  • Thank you Branko for your answer (also with fiddle :-)) with very very clear explanation. Do you think a could use DateStart as primary key instead of MyTableId or it is not a good idea ? – Antonio Manello Aug 21 '12 at 07:42
  • @AntonioManello If two `DateStart`s are equal, the corresponding ranges would overlap, which is forbidden by your business rules. So, `DateStart` must be unique, making it a key. Whether it should be **primary** key is a different matter: Do ranges ever change and do you have FKs towards other tables? If yes, then the surrogate PK would cut the ON UPDATE CASCADE. If no, you might just as well make `DateStart` the primary key. In fact, this might be beneficial for [clustering](http://use-the-index-luke.com/sql/clustering/index-organized-clustered-index). – Branko Dimitrijevic Aug 21 '12 at 09:21
  • 1
    Thinking about avoid interval overlapping *only* for same id ( not a key but a value from other column) I mean, not to avoid overlap at table level but just for some entries that belongs to same ID.. is it possible? – Hernán Eche Dec 14 '12 at 16:09
  • @HernánEche I don't know what "ID" means in the context above, but I don't see any particular reason why you shouldn't be able to add whatever restriction you need to the WHERE condition above. You'd probably need to pass that ID as the function parameter, similarly to what is already done with `MyTableId`, `DateStart` and `DateEnd`. – Branko Dimitrijevic Dec 14 '12 at 16:18
  • you are right if ID is not a key, it's just to put WHERE `@MyTableId = MyTableId`.. thanks – Hernán Eche Dec 14 '12 at 17:10
  • 3
    this approach won't be consistent when using READ COMMITTED SNAPSHOT isolation though? Since each transaction get their own snapshot at the start of the transaction which may lead to both transactions seeing they've ended up with a valid state (ie transaction B won't see that transaction A has already added a row that will cause overlap) – neo112 Jul 18 '16 at 11:35
  • 2
    @neo112 Yes. And race conditions exist for non-SNAPSHOT isolation too. This can be solved by locking the entire table (which has obvious scalability consequences), or *very* carefully locking individual rows (not in the original table but in a special table of predefined date ranges) . Thanks for pointing the problem out - I probably should have mentioned these complications in my answer... – Branko Dimitrijevic Jul 20 '16 at 10:08
  • Hey @BrankoDimitrijevic, in the first sentence in your answer "The CHECK is being executed after the row has been inserted", where did you happen to read this, or how were you able to check that this is really what is happening? I'm reading all of the documentation I can about CHECK constraints, and I am unable to find any mention of it. – Leonardo Lopez Oct 12 '19 at 23:14
  • 1
    @LeonardoLopez This is not a documented behavior, as far as I'm aware, so you should not rely on it (where possible, of course) - it's best to write CHECKs to be agnostic of when they are actually evaluated. That being said, it seems this has been a long-standing behavior in SQL Server, and you can see it directly in the execution plan: the `Asset` node will always be above nodes such as `Clustered Index Insert` or `Clustered Index Update`, meaning it is executed later. – Branko Dimitrijevic Oct 14 '19 at 12:37
  • 1
    @LeonardoLopez You can also see it from a debugger: break in the function under the CHECK, and then use SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED from another session and query the table - the new data will already be there. – Branko Dimitrijevic Oct 14 '19 at 12:37
1

CREATE TABLE [dbo].[TEMPLATE] (
    [ID]             BIGINT       IDENTITY (1, 1) NOT NULL,
    [DATE_START]     DATETIME     NOT NULL,
    [DATE_END]       DATETIME     NOT NULL,
    [TEMPLATE_NAME]  VARCHAR (50) NOT NULL,
    CONSTRAINT [PK_TEMPLATE] PRIMARY KEY CLUSTERED ([ID] ASC),
    CONSTRAINT [FK_current_start_and_end_dates_in_sequence] CHECK ([DATE_START]<=[DATE_END])
);

go


CREATE FUNCTION [dbo].[Check_Period]
(
    @start DateTime,
    @end DateTime
)
RETURNS INT
AS
BEGIN
    declare @result INT = 0 ;
    set @result = (Select count(*) from [dbo].[TEMPLATE] F where F.DATE_START <= @start and F.DATE_END >= @start );

    set @result = @result +  
    (
         Select count(*) from [dbo].[TEMPLATE] F where F.DATE_START <= @end and F.DATE_END >= @end 
    ) 
    RETURN @result
END

go

ALTER TABLE [dbo].[TEMPLATE]
    ADD CONSTRAINT [FK_overlap_period_t]
    CHECK ([dbo].[Check_Period]([DATE_START],[DATE_END])=(2));

go

Insert Into [dbo].[TEMPLATE] (DATE_START, DATE_END, TEMPLATE_NAME) values ('2020-01-01','2020-12-31', 'Test1');
-- (1 row(s) affected)
Insert Into [dbo].[TEMPLATE] (DATE_START, DATE_END, TEMPLATE_NAME) values ('2021-01-01','2022-12-31', 'Test2');
-- (1 row(s) affected)
Insert Into [dbo].[TEMPLATE] (DATE_START, DATE_END, TEMPLATE_NAME) values ('2020-01-01','2020-12-31', 'Test3');
-- The INSERT statement conflicted with the CHECK constraint "FK_overlap_period_t".
user1785960
  • 565
  • 5
  • 17
0

I'd just like to add onto the Answer of Branko Dimitrijevic the case where the DateEnd is null since I currently have such a scenario.

This can occur when you're keeping punch in logs and the user is still logged in.

WHERE
    @MyTableId <> MyTableId
    AND @DateEnd >= DateStart
    AND DateEnd >= @DateStart
    OR @DateEnd >= DateStart
    AND DateEnd is null
    OR @DateStart >= DateStart
    AND DateEnd is null

I don't know how well this query is performance wise, I'm sure there are ways to optimize it.

Community
  • 1
  • 1
effy
  • 410
  • 8
  • 20