Initial Response
What is the best way to achieve this?
This is a simple and common Data Modelling Problem, and the answer (at least in the Relational Database context) is simple. I would say, every database has at least a few of these. Unfortunately, because the authors who write books about the Relational Model, are in fact completely ignorant of it, they do not write about this sort of simple straight-forward issue, or the simple solution.
What you are looking for is an OR
gate. In this instance, because the Item
is in a Location XOR
it is InTransit,
you need an XOR
gate.
In Relational terms, this is a Basetype::Subtype structure. If it is implemented properly, it provides full integrity, and eliminates Nulls.
As far as I know, it is the only Relational method. Beware, the methods provided by famous writers are non-relational, monstrous, massively inefficient, and they don't work.
###Record ID
But first ... I would not be serving you if I didn't mention that your files have no integrity right now, you have a Record Filing System. This is probably not your fault, in that the famous writers know only pre-1970's Record Filing Systems, so that is all that they can teach, but the problem is, they badge it "relational", and that is untrue. They also have various myths about the RM, such as it doesn't support hierarchies, etc.
By starting with an ID
stamped on every table, the data modelling process is crippled
You have no Row Uniqueness, as is required for RDBS.
an ID
is not a Key.
If you do not understand that, please read this answer.
I have partially corrected those errors:
In Item
, I have given a more useful PK. I have never heard any user discuss an Item RecordId
, they always uses Codes.
Often those codes are made up of components, if so, you need to record those components in separate columns (otherwise you break 1NF).
Item
needs an Alternate Key on Name,
otherwise you will allow duplicate Names.
In Location,
I have proposed a Key, which identifies an unique physical location. Please modify to suit.
If Location
has a Name,
that needs to be an AK.
I have not given you the Predicates. These are very important, for many reasons. The main reason here, is that it will prove the insanity of Record IDs. If you want them, please ask.
If you would like more information on Predicates, visit this Answer, scroll down (way down!) to Predicate, and read that section. Also check the ERD for them.
###Solution
What changes [do] I need to apply to the above schema?
Try this:
Item History Data Model
(Obsolete, refer below for the updated mode, in the context of the progression)
If you are not used to the Notation, please be advised that every little tick, notch, and mark, the solid vs dashed lines, the square vs round corners, means something very specific. Refer to the IDEF1X Notation for a full explanation, or Model Anatomy.
If you have not encountered Subtypes implemented properly before, please read this Subtype Overview
That is a self-contained document, with links to code examples
There is also an SO discussion re How to implement referential integrity in subtypes.
When contemplating a Subtype cluster, consider each Basetype::Subtype pair as a single unit, do not perceive them as two fragments, or two halves. Each pair in one fact.
ItemHistory
is an event (a fact) in the history of an Item.
Each ItemHistory
fact is either a Location
fact XOR
an InTransit
fact.
Each of those facts has different attributes.
Notice that the model represents the simple, honest, truth about the real world that you are engaging. In addition to the integrity, etc, as discussed above, the result is simple straight-forward code: every other "solution" makes the code complex, in order to handle exception cases. And some "solutions" are more horrendous than others.
Dr E F Codd gave this to us in 1970. It was implemented it as a modelling method in 1984, named IDEF1X. That became the standard for Relational Databases in 1993. I have used it exclusively since 1987.
But the authors who write books, allegedly on the Relational Model, have no knowledge whatsoever, about any of these items. They know only pre-1970's ISAM Record Filing Systems. They do not even know that they do not have the Integrity, Power, or Speed of Relational Databases, let alone why they don't have it.
Date, Darwen, Fagin, Zaniolo, Ambler, Fowler, Kimball, are all promoting an incorrect view of the RM.
Response to Comments
1) ItemHistory, contains Discriminator column 'InTransit'.
Correct. And all the connotations that got with that: it is a control element; its values better be constrained; etc.
Shall it be enum with the value Y / N?
First, understand that the value-stored has meaning. That meaning can be expressed any way you like. In English it means {Location|InTransit}.
For the storage, I know it is the values for the proposition InTransit
are {True|False},
...
In SQL (if you want the real article, which is portable), I intended it as a BIT
or BOOLEAN.
Think about what you want to show up in the reports. In this case it is a control element, so it won't be present in the user reports. There I would stick to InTransit={0|1}.
But if you prefer {Y|N},
that is fine. Just keep that consistent across the database (do not use {0|1}
in one place and {Y|N}
in another).
For values that do show up in reports, or columns such as EventType, I would use {InTransit|Location}.
In SQL, for implementation, if it BOOLEAN
, the domain (range-of-values) is already constrained. nothing further is required.
If the column were other BOOLEAN
,` you have two choices:
CHECK
Constraint
CHECK @InTransit IN ( "Y", "N" )
Reference or Lookup Table
Implement a table that contains only the valid domain. The requirement is a single column, the Code itself. And you can add a column for short Descriptor that shows up in reports. CHAR(12)
works nicely for me.
ENUM
There is no ENUM
in SQL. Some of the non-SQL databases have it. Basically it implements option [2] above, with a Lookup table, under the covers. It doesn't realise that the rows are unique, and so it Enumerates the rows, hence the name, but it adds a column for the number, which is of course an ID
replete with AUTOINCREMENT,
so MySQL falls into the category of Stupid Thing to Do as described in this answer (scroll down to the Lookup Table section).
So no, do not use ENUM
unless you wish to be glued at the hip to a home-grown, stupid, non-SQL platform, and suffer a rewrite when the database is ported to a real SQL platform. The platform might be stupid, but that is not a good reason to go down the same path. Even if MySQL is all you have, use one of the two SQL facilities given above, do not use ENUM.
2) Why is'ItemHistoryTransit' needed as 'Date' column
(DATETIME,
not DATE,
but I don't think that matters.)
[It] is there in ItemHistory?
The standard method of constraining (everything in the database is constrained) the nature of teh Basetype::Subtype relationship is, to implement the exact same PK of the Basetype in the Subtype. The Basetype PK is(ItemCode, DateTime).
[Why] will only Discriminator not work?
It is wrong, because it doesn't follow the standard requirement, and thus allows weird and wonderful values. I can't think of an instance where that could be justified, even if a replacement constraint was provided.
Second, there can well be more than two occs of ItemEvents
that are InTransit
per ItemCode,`which that does not allow.
Third, it does not match the Basetype PK value.
Solution
Actually, a better name for the table would be ItemEvent.
Labels are keys to understanding.
I have given the Predicates, please review carefully.
Data model updated.
Item Event Data Model