3

This is a follow-up to the comment raised by heximal on one of my other posts. I want to have a deleted_on column to detect deleted records to which heximal suggested that this column is redundant.

Here is his comment:

You're going to use deleted_xx fields to determine that the record is deleted?

imho, the best and most nice way is to add to record some activity attribute of boolean datatype (e.g. the field named ACTIVE). So in order to "delete" record, we must update the value of ACTIVE field and update_date, updated_by in single UPDATE query. In order to select all active records we just have to make query like:

SELECT * FROM MyTable WHERE ACTIVE=1

I know that Oracle Applications use such approach, and I agree

I also read the following posts:

My question is: How to set the unique constraint on a table which has a isActive flag as suggested by heximal above. I do have surrogate keys in all my tables. But I want to ensure the natural key columns (what we call business key columns) have a unique constraint.

If I have a deleted_on field to track deletes, then I can include this column as part of the natural key constraint. So, it allows more than one deleted record with the same business key combination, differing only in the deleted_on date field.

If I have isActive field and use last_updated_on column to track the deletion date, I have to 2 options on the natural key constraint

  1. I can include isActive as part of my natural key constraint. But this would allow max only one deleted record with the same business key combination.
  2. I can include isActive plus last_updated_on as part of natural key constraint. But I see having an extra column deleted_on makes it easier.

Any thoughts? Am I missing something here?

Community
  • 1
  • 1
Siraj Samsudeen
  • 1,624
  • 7
  • 26
  • 35

3 Answers3

1

"My question is how to set the unique constraint on a table which has a isActive flag as suggested by heximal above. I do have surrogate keys in all my tables. But I want to ensure the natural key columns (what we call business key columns) have a unique constraint.

If I have a deleted_on field to track deletes, then I can include this column as part of the natural key constraint. So, it allows more than one deleted record with the same business key combination, differing only in the deleted_on date field."

Even with a deleted_on DATE field as part of your natural key, you still couldn't delete-reinsert-delete all on the same day. May seem pathological, but can you really be certain that the pathological case will NEVER occur ?

If your database needs to reflect the fact that some of the content is "active" in the sense that it is highly relevant for current business, and some other content is "inactive", e.g. in the sense that the only reason for keeping it some time is for archival purposes, then design your database to reflect that fact by defining two tables: one with the "active" content and one with the "archived" content. Perhaps you can use a trigger to automatically achieve the "move-to-archive" whenever a delete occurs.

Doing so, you can have your natural key enforced by the DBMS using a unique constraint on the "active" table, and you can include a deletion-timestamp in the "archive" table, and you may not even need to define any key at all on that table.

Erwin Smout
  • 18,113
  • 4
  • 33
  • 52
  • Thanks for your thoughts. Deleted_on will have a timestamp - so, can handle the pathological case :). Doing 2 tables for every table will pose more problems - what if the deleted and archived is still referred by another table? – Siraj Samsudeen Apr 19 '11 at 14:25
  • Then at any rate, you have a RI problem that SQL is unable to handle. If SQL were as powerful as it should be, then you could define a view that is the union of 'active' and 'archive', and have your FK reference the view. Alas, not doable in any SQL product I know. But hey, what would you do with those referencing rows if you altered a boolean flag or set a delete_date in your main table ? Would you still be joining in those pseudo-deleted rows in queries that involve a join, or would you actively skip those rows, even if the referencing row is itself still active ? Neither look correct. – Erwin Smout Apr 19 '11 at 16:24
0

What about adding a "VERSION" (Integer) instead of an ACTIVE column?

  • Current (Active, non deleted) version is 0
  • Whenever you replace the current version with a new one you UPDATE all the existing rows replacing VERSION with VERSION-1 (so current version becomes -1, previously current version becomes -2 and so on) then INSERT the new current record with VERSION=0.

This Version field can easily be part of the Natural Key, of course.

p.marino
  • 6,244
  • 3
  • 25
  • 36
  • this would lead lot of updates each time there is a new deletion and i would not want it unless there are huge benefits. In the case of a natural need for versioning, it makes sense. But here, I don't see a justification. – Siraj Samsudeen Apr 19 '11 at 09:39
0

I was about to write a big essay when I remembered I already wrote it in
Are there problems with this 'Soft Delete' solution using EAV tables?
If what you want is a simple way of undoing a delete, or keeping a track record of the deletes for audit/troubleshooting purposes, using a mirror table is an easy solution.

and Soft delete best practices (PHP/MySQL)
You need to think about what "delete" really means. It seems like you want to be able to reconstruct history in which case the term "delete" causes confusion. It really isn't a delete. You want to be using some kind of temporal data model instead and introduce the concept of effective dates.

Or if you only care about the data volumes, a simple batch process to DELETE or move rows older than X days/years is also a really simple way.

Community
  • 1
  • 1
Ronnis
  • 12,593
  • 2
  • 32
  • 52