In our database model we have a Beneficiary entity. A beneficiary can be a physical person or a corporate beneficiary; a phisical beneficiary has a number of attributes such as name, surname, sex, etc.; in addition, a beneficiary (either corporate or physical person) can either be foreign or not; this further distinction translate into different domain values for a "common" set of attributes (for example in Italy, where I live, tax ids may have a different data format than UK's tax ids).
We are now re-engineering our Beneficiary table, since the developer who initially worked on DB analysis & modeling did a (IMO) short-sighted choice. He put the primary key constraint on attribute BeneficiaryName, wich has been used to store either the Corporate name (e.g. "Microsoft Corporation") in case of Corporate beneficiary or the surname (e.g. Smith) for the physical beneficiary. This way we have the (unacceptable) constraint that we CAN'T have more than 1 beneficiary with surname "Smith" (or a corporate named "Smith") in our DB.
My approach for this "re-factoring" would introduce a generalization for the Beneficiary entity; I would
- Clean Beneficiary table, keeping only common data;
- Add a surrogate primary key to Beneficiary table, let's call it BeneficiaryID;
- Split Beneficiary table, creating two sub-entityes (CorporateBeneficiary & PhysicalBeneficiary, discriminated by a flag in master Beneficiary table), with a 1..1 association to Beneficiary table (a foreign key will reference BeneficiaryID)
- Find (significative) primary keys for CorporateBeneficiary & PhysicalBeneficiary;
This should address the aforementioned problem of uniqueness on BeneficiaryName. Seems ok so far?
The real problem I have is: how can/should I handle the further complication added by "foreign" attribute in this model? Should I leave Foreign as it tis, i.e. a flag attribute in Beneficiary? If so, how can I handle the need for different attributes' for a conceptually similar piece of information (i.e. zipcode, tax id) withoud duplicating the attributes (zipcode_foreign, zipcode, taxid_foreign, taxid etc.)? Should I really strive to accomodate different domain values into one field?
Any suggestion would be welcome...