3

I have a laboratory analysis database and I'm working on the bast data layout. I've seen some suggestions based on similar requirements for using a "Shared Primary Key", but I don't see the advantages over just foreign keys. I'm using PostgreSQL:tables listed below

Sample
___________
sample_id (PK)
sample_type (where in the process the sample came from)
sample_timestamp (when was the sample taken)


Analysis
___________
analysis_id (PK)
sample_id (FK references sample)
method (what analytical method was performed)
analysis_timestamp (when did the analysis take place)
analysis_notes

gc
____________
analysis_id (shared Primary key)
gc_concentration_meoh (methanol concentration)
gc_concentration_benzene (benzene concentration)

spectrophotometer
_____________
analysis_id
spectro_nm (wavelength used)
spectro_abs (absorbance measured)

I could use this design, or I could move the fields from the analysis table into both the gc and spectrophotometer tables, and just use foreign keys between sample, gc, and spectrophotometer tables. The only advantage I see of this design is in cases where I would just want information on how many or what types of analyses were performed, without having to join in the actual results. However, the additional rules to ensure referential integrity between the shared primary keys, and managing extra joins and triggers (on delete cascade, etc) appears to make it more of a headache than the minor advantages. I'm not a DBA, but a scientist, so please let me know what I'm missing.

UPDATE: A shared primary key (as I understand it) is like a one-to-one foreign key with the additional constraint that each value in the parent tables(analysis) must appear in one of the child tables once, and no more than once.

Kevin Nowaczyk
  • 227
  • 1
  • 15
  • 4
    *Shared primary key* is not a common term among database designers. It might be helpful to edit your question, and paste a link to the definition. (It looks like a foreign key to me.) – Mike Sherrill 'Cat Recall' Nov 30 '12 at 13:15
  • And if you add data from few more instruments .., and if some analysis requires only one, but not all instruments? – Damir Sudarevic Nov 30 '12 at 14:51
  • This is `one to 0 or 1` relationship, a child table can have 0 or 1 records for each analysis. – Damir Sudarevic Nov 30 '12 at 14:53
  • 5
    This is what is sometimes called supertype/subtype relationships I think. I think you'd get a good response on dba.se if you were willing to get it migrated. –  Nov 30 '12 at 14:56
  • Damir: In either scenario if a new method is used, a new table will have to be created. The analysis table represents the execution of a particular analytical method. This method would use either the GC, the spec, an Atomic Absorption unit, a pH probe, etc. Each sample can have zero to infinite analyses performed on it. From a class model, I think of each sample containing an array of analyses, where each specific analysis type (gc, spec) inherits the fields from the parent, analysis, class. – Kevin Nowaczyk Nov 30 '12 at 17:31
  • @KevinNowaczyk see this then http://stackoverflow.com/search?tab=votes&q=user%3a196713%20subtype – Damir Sudarevic Nov 30 '12 at 18:05
  • @DamirSudarevic, Thanks! That diagram is very similar to what I was describing above and the differences tell me that that is not the way I should design my tables. In that diagram, customers, vendors and distributors are generalized into organizations, so one foreign key to the organization table can represent any of the three children data types. In my case there is no primary key in the parent table which will be used as a foreign key anywhere besides the children. So it looks like in my case there is no advantage to using this design. Regular boring foreign keys it is! – Kevin Nowaczyk Nov 30 '12 at 19:51

2 Answers2

1

I've seen some suggestions based on similar requirements for using a "Shared Primary Key", but I don't see the advantages over just foreign keys.

If I've understood your comments above, the advantage is that only the first implements the requirement that each row in the parent match a row in one child, and only in one child. Here's one way to do that.

create table analytical_methods (
  method_id integer primary key,
  method_name varchar(25) not null unique
);
insert into analytical_methods values
(1, 'gc'),(2, 'spec'), (3, 'Atomic Absorption'), (4, 'pH probe');

create table analysis (
  analysis_id integer primary key,
  sample_id integer not null, --references samples, not shown
  method_id integer not null references analytical_methods (method_id),
  analysis_timestamp timestamp not null,
  analysis_notes varchar(255),
  -- This unique constraint lets the pair of columns be the target of
  -- foreign key constraints from other tables.
  unique (analysis_id, method_id)
);

-- The combination of a) the default value and the check() constraint on 
-- method_id, and b) the foreign key constraint on the paired columns 
-- analysis_id and method_id guarantee that rows in this table match a 
-- gc row in the analysis table. 
--
-- If all the child tables have similar constraints, a row in analysis 
-- can match a row in one and only one child table.
create table gc (
  analysis_id integer primary key,
  method_id integer not null 
    default 1 
    check (method_id = 1),
  foreign key (analysis_id, method_id) 
    references analysis (analysis_id, method_id),
  gc_concentration_meoh integer not null,
  gc_concentration_benzene integer not null
);
Mike Sherrill 'Cat Recall'
  • 91,602
  • 17
  • 122
  • 185
  • I was asking about the merits of the two designs and for advice on which to use in my case. Since what I posted is a dramatic oversimplification of what the table layout will really look like, code is not necessary. There is no requirement for sharing the primary key across different tables, only if that design is the one that is best for my case. – Kevin Nowaczyk Dec 01 '12 at 01:10
  • It will be worth your while to stop talking about sharing primary keys, because that's what *all* foreign key references do. The structure I posted guarantees that each row in analysis can match one row in one and only one "child" table. The structure in Damir's link allows each row in analysis to match one row in *every* "child" table. Your update said "each value in the parent tables(analysis) must appear in one of the child tables once". If that's really a requirement, the structure I posted will implement it, and the linked structure won't. – Mike Sherrill 'Cat Recall' Dec 01 '12 at 05:54
  • Whatever...I'm not the expert, just using the terminology I've seen elsewhere. – Kevin Nowaczyk Dec 02 '12 at 18:05
0

It looks like in my case this supertype/subtype model in not the best choice. Instead, I should move the fields from the analysis table into all the child tables, and make a series of simple foreign key relationships. The advantage of the supertype/subtype model is when using the primary key of the supertype as a foreign key in another table. Since I am not doing this, the extra layer of complexity will not add anything.

Kevin Nowaczyk
  • 227
  • 1
  • 15