35

So I recently learned about ASSERTION in my databases course, and my prof noted that major databases don't support it, even though it is in the SQL-92 standard. I tried googling to find out why, but there doesn't seem to be any discussion on the topic.

So, why isn't ASSERTION supported by the vast majority of relational database packages? Is it soley a performance issue or is there something intrinsically hard about it?


If you can, please note any database packages that implement it as well (example: if there's an academic/teaching DB). Also, why is there so little discussion on the issue; it's not even mentioned in the Wikipedia page for SQL or SQL-92) But answer the main question first, or answer in comments.

I'm not looking for how to implement it with triggers or anything else.

forivall
  • 9,504
  • 2
  • 33
  • 58
  • 2
    I think it is purely a performance issue - column CHECK constraints only have to deal with rows that are "hot" for a particular query. FOREIGN KEY constraints usually require indexes on the referenced table to enable them to be efficiently implemented. ASSERTION is just too open ended, IMO. – Damien_The_Unbeliever Jun 16 '11 at 07:18
  • 1
    perhaps better on dba.se or programmers.se? or both...? – gbn Jun 16 '11 at 07:42
  • dba.se seems to be more focused on specific topics, this is quite broad. maybe programmers, I didn't think of putting it there; an admin can move it if needed, I don't know if it's a big deal. – forivall Jun 17 '11 at 20:27

7 Answers7

33

There are four levels of constraint: column-level, row-level, table-level and schema-level.

A table-level could, for example, involve a target table other than the source table on which it was declared but only gets checked when the source table changes. In theory a schema-level constraint would be checked for every change in every table in the schema but in practise the optimizer would be able to detect changes in a more granular way; consequently, if your DBMS had support for schema-level constraint then you wouldn't find much use for table-level constraints in practise.

No current SQL product supports schema-level constraints i.e. CREATE ASSERTION. Apparently Rdb did support it when it was looked after by DEC but that is no longer the case. -- UPDATE: in a private message I was advised that Sybase's SQL Anywhere supports CREATE ASSERTION but with serious errors that allow such constraints sometimes to be violated!

The only SQL-like product I've used that currently supports subqueries in CHECK constraints, which enables table-level constraints, is the Access database engine (ACE, Jet, whatever). It has issues, though. First, there is not support for SQL-92 functionality (or equivalent) to defer constraint checking. Second, table-level constraints are checked for each row affected, rather than when the statement completes as required by the SQL-92 Standard. Needless to say, the workaround is very clunky e.g. drop the constraint and in doing so lock the table, execute the update, recreate the constraint. Schema-level constraints, arguably achievable by adding the same constraint to all the tables it involves, is virtually unworkable.

Possibly for these reasons, the Access Team have never publicized its CHECK constraint functionality at all beyond the initial announcements for Jet 4.0 (it remains missing from the Access Help, for example). All that said, for intra-table constraints (e.g. a sequenced key in a valid-state 'history' temporal table) the functionality works well, especially when you consider that Access only got trigger-like functionality (not SQL based, though) last year.

SQL of course has UNIQUE constraints and referential integrity constraints that are of course table-level but these are special cases. Therefore, all constraints you will encounter 'in the wild' will be either colum- or row-level.

Do be aware with MySQL that, although using CHECK() in SQL DDL will parse without error, it will have no effect. How users can tolerate a SQL product with no CHECK constraints at all is beyond me! PostgreSQL has a excellent constraints model, hint hint :)

So why are inter-table constraints so rarelt supported? One reason must be due to historical circumstances. As @gbn correctly identifies (under the title Concurrency), the Sybase/SQL Server family of SQL implementations is based on a model that cannot cope with inter-table constraint checking and that's not something that is likely to ever change.

Consider looking at this the other way around: if you were creating a SQL product today, would you include CREATE ASSERTION? If you did, you would certainly have to also implement DEFERRABLE constraints (even though multiple assignment is arguably the better model). But you would be able to draw on a lot more research and experience if you went down the route of building a 'traditional' optimizer. And perhaps you'd find there is no commercial demand for schema-level constraints (if MySQL can get anyway without CHECK constraints...) If PostgreSQL doesn't do it, I don't think anyone ever will.

I think the real show stopper is that most industrial-strength products have already developed trigger functionality that allows users to write 'constraints' of arbitrary complexity (plus can a lot more e.g. send an email to tell something happened). Sure, they are procedural rather than declarative, the coder has to do a lot of extra work that the system would take care of with true constraints, and the performance tends to be not so great. But the fact is they existing in real products today and do provide a 'get out of jail free card' card for vendors. Why should they bother implementing worthy features if customers are not banging the table for them?

As regards academic/teaching langauges, as @Damien_The_Unbeliever correctly identifies, a Tutorial D CONSTRAINT are always 'schema'-level, hence allow for global constraints of arbitrary conplexity by definition. If you are looking to design your own DBMS(!!) with this kind of functionality, you should consider implementing the D specification while using an existing SQL DBMS for storage, as Dataphor have done.


A question has been bothering me: given that existing 'industrial strength' SQL DBMSs support triggers, why don't they simply map declarative CREATE ASSERTION to a trigger under the covers? I've long suspected the answer is because they know that performance would be appalling given their legacy technology.

A more satisfying answer is provided in Applied Mathematics for Database Professionals  By Lex de Haan, Toon Koppelaars, chapter 11. They define various execution models to use when using triggers to enforce multi-tuple constraints. The most sophisticated (though still highly doable) model, which they call EM6, involves the following steps:

  1. Translate the formal specification into a constraint validation query.
  2. Develop code to maintain transition effects.
  3. Devise transition effect (TE) queries that ensure the constraint validation query is only run when necessary [e.g. Can I limit the check to only the updated rows? Can a DELETE ever violate this constraint? Are there only specific columns that an UPDATE must involve to require the constraint to be checked? etc]
  4. Discover a means to optimize the constraint validation query by having the TE query provide values that can be used in the validation query.
  5. Devise and add a serialization strategy to the data integrity (DI) code. [i.e. solve the concurrecy problem where a transaction cannot read the 'bad' data another transaction is writing].

They then assert (no pun intended!):

Because we believe that it is not possible for a DBMS vendor to program an algorithm that accepts an arbitrarily complex predicate and then computes efficient transition effect (TE) queries, a minimal validation query, and optimal serialization code to implement execution model EM6, we should not expect full support for multi-tuple constraints—in a practical, usable and acceptable way—from these vendors in the future. The best we can hope for is that database researchers first come up with more common classes of constraints and develop convenient shorthands for these. The DBMS vendors, in their turn, should then provide us with new declarative constructs, consistent with these shorthands, to state these common classes of constraints easily to the DBMS. Given such a common class declaration, the DBMS vendor should be able to program an algorithm that provides us with an EM6-like execution model under the covers to implement the constraint.

One such common class of database constraint is a foreign key, which is already widely implemented, of course.

Dai
  • 141,631
  • 28
  • 261
  • 374
onedaywhen
  • 55,269
  • 12
  • 100
  • 138
  • "The only SQL-like product I know of that currently supports subqueries in CHECK constraints, which enables table-level constraints, is the Access database engine (ACE, Jet, whatever)." In SQL Server you can implement a `CHECK` constraint against the output of a UDF. The UDF can encapsulate almost any kind of query you want. This technique is not recommended in SQL Server because it has [many pitfalls](http://dba.stackexchange.com/a/4796/2660). – Nick Chammas Mar 29 '12 at 01:02
  • @NickChammas: thanks, I've been aware of the [trouble with `CHECK` constraints](http://consultingblogs.emc.com/davidportas/archive/2007/02/19/Trouble-with-CHECK-Constraints.aspx) implemented in this fashion for a while. Note I've changed the text you quoted because I have since been made aware of [current SQL products that support `CREATE ASSERTION`](http://stackoverflow.com/questions/4130039/does-sql-server-2008-support-the-create-assertion-syntax/4131672#4131672), though I have yet to try them out. – onedaywhen Mar 29 '12 at 07:44
  • 1
    You can also implement a CHECK constraint in Postgres based on a function which in turn does a SELECT. And I'm pretty sure you can do this in Oracle as well. –  Feb 25 '13 at 08:27
  • @a_horse_with_no_name: Q1) does the ability to defer constraints mean that these products do not suffer the same problems as SQL Server (where constraints are always checked at the row-level i.e. too enthusiastic)? Q2) Does the optimizer cope well in the context of constraint checking? (I suspect not, otherwise the aforementioned Toon Koppelaars wouldn't make a living out of his product www.rulegen.com ;) – onedaywhen Feb 25 '13 at 10:27
  • @a_horse_with_no_name embedding a query into function called from a `CHECK` sacrifices transaction safety. Some discussion from http://stackoverflow.com/questions/10179121/sql-sub-queries-in-check-constraint – beldaz Mar 06 '15 at 21:38
  • Great answer. We built Dataphor to support DB-wide constraints, but we only took the first step as far as optimizing them, which is to determine the dependencies and attach table-level constraint checks (of the entire assertion) to each. The next step would be to, for a subclass of expressions, attempt to decompose the expression to a perspective dependent row level constraint where possible. Key relationships would be a good start. With that, referential integrity constraints could be implemented as mere DB-wide constraints. We haven't done that yet, so send us a pull-request. ;-) – N8allan Dec 08 '16 at 02:31
  • 2
    MySQL 8.0.16 finally implemented CHECK constraints: https://dev.mysql.com/doc/refman/8.0/en/create-table-check-constraints.html – iwis Feb 22 '21 at 00:39
  • @iwis: Thanks, I hadn't noticed. FWIW what the docs refer to as a 'table' constraint is actually a row-level constraint. – onedaywhen Mar 09 '21 at 09:16
4

My 2 pennies:

  • Concurrency: If you take the "CHECK constraint using a scalar udf with table access" solution for SQL Server, it simply isn't safe. An assertion would probably be the same in Sybase/SQL Server type engines

Edit: what I mean is described in these links: Scalar UDFs wrapped in CHECK constraints are very slow and may fail for multirow updates and from Tony Rogerson

  • Performance: A 10k row insert would require 10k udf-style executions each with table access. Ouch. Given the ASSERTION is per row, then this is how it has to operate. If it could work for "all rows in the INSERT", then it's simpler as a trigger, no?

  • Design: There are patterns (superkey, subtype tables etc) for maintaining data integrity on related tables using simpler constraints. If you need to check some random table for data integrity I'd say you have something wrong...

gbn
  • 422,506
  • 82
  • 585
  • 676
  • You don't think a `CHECK field IN (1, 2, 4, 9, 16)` constraint is something good? Maybe at insertion time, there is a little overhead, but if the query optimiser knows there can be only 5 values, it can devise an entirely different execution plan, which might be a lot faster, than without the `CHECK` clause... – Lukas Eder Jun 16 '11 at 07:42
  • @Lukas Eder: I don't mean a simple constraint like this. Any reasonable performance hit is acceptable to maintain data integrity. I mean scalar udfs in a CHECK. See update with links please – gbn Jun 16 '11 at 07:46
  • The link for the UDF wrap is producing an error. Did Tony Rogerson remove the article? – Jamie Dec 15 '19 at 15:47
3

(Non-SQL, generally considered academic, not called ASSERTION) D (aka. Tutorial D) has CONSTRAINT which can be an arbitrary constraint on the database. One implementation is called Rel

Damien_The_Unbeliever
  • 234,701
  • 27
  • 340
  • 448
  • 1
    Plus Tutorial D supports multiple assignment, which helps in practise :) p.s. D is a language spec, Tutorial D is a language based on the spec, and Rel is a implementation of (and supports nearly all the features of) the Tutorial D language. – onedaywhen Jun 16 '11 at 10:18
2

There is some rudimentary support in Oracle 10g:

http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28419/d_assert.htm

I'm sure, other RDBMS with stored procedure support ship with similar built-in procedures... Clearly, that's not part of any SQL standard, though. The standard does specify:

<assertion definition> ::=
          CREATE ASSERTION <constraint name> <assertion check>
          [ <constraint attributes> ]

<assertion check> ::=
          CHECK <left paren> <search condition> <right paren>

See chapter 11.34 of http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt

I'm not sure about the CREATE ASSERTION statement. I haven't encountered it in any RDBMS so far, and I don't think many RDBMS actually implement that. On the other hand, a CHECK clause on a single table column can be considered an assertion, too.

Lukas Eder
  • 211,314
  • 129
  • 689
  • 1,509
1

"If you can, please note any database packages that implement it as well"

SIRA_PRISE supports this feature in full. Disclosure : I am the author.

(As for the actual question "why aren't DBMS vendors offering this" : you'd really have to ask the vendors, but (a) it is an extremely difficult problem to solve even in a truly relational system, (b) that complexity is severely exacerbated by the fact that SQL is nowhere near truly relational and (c) it is highly probable that most programmers believe themselves to be competent enough to achieve the same effects as declarative ASSERTIONS using triggers, and this at a not so much higher development cost.)

Erwin Smout
  • 18,113
  • 4
  • 33
  • 52
1

Yeah, I asked this question to my instructor in the class a few months ago. It may be in standards but many DBMS vendors do not obey standards at all.

Probably, the reason behind DBMSes to not to support assertions is it is a very expensive operation and nobody is going to use it for this reason.

In order to provide a convenient way, DBMSes implemented TRIGGERs and CHECKs. By this way, nobody needs assertions indeed.

ahmet alp balkan
  • 42,679
  • 38
  • 138
  • 214
  • I agree. If you check out the assertion definition (see my answer), it can be seen that any type of check can be globally defined, for the scope of a schema/database/user, etc. It is almost impossible to find an efficient way to provide so much functionality... – Lukas Eder Jun 16 '11 at 07:36
  • 5
    Nobody *needs* `CHECK`,`FOREIGN KEY`, `UNIQUE` or `PRIMARY KEY` constraints - these could all be implemented by triggers. But having them available makes for more readable (and, importantly, more optimizable) databases. – Damien_The_Unbeliever Jun 16 '11 at 07:36
  • 1
    @Damien, that's not entirely correct. In Oracle, formal `CHECK`, `FOREIGN KEY`, `UNIQUE` or `PRIMARY KEY` specifications are heavily used by the query optimiser's query transformation facilities. They are much more expressive than triggers... – Lukas Eder Jun 16 '11 at 07:38
  • 1
    @Lukas - I did say that having them was important for optimization purposes (and not just for Oracle). But purely for their data integrity purposes, they can all be implemented as triggers. – Damien_The_Unbeliever Jun 16 '11 at 07:40
  • Well actually let's say you have a cascading option that disallows deletion of a referenced row (with foreign key). I'm not sure that `TRIGGER`s are able to disallow deletion of referenced rows etc. – ahmet alp balkan Jun 16 '11 at 15:30
0

Oracle is considering this as possible enhancement for the next version after 12c Release 2, see this post from May 2016 by Toon Koppelaars.

If you are interested in the feature, I suggest upvoting it.

Blama
  • 254
  • 3
  • 9