Should we combine Surrogate Key with Natural Key(Unique Index Mean) for readability and easy query selection?

Question

Edit: currently I will keep question opened. to get an very rough answer.

Shortly, Here's a schema

Asked this question because From this topic https://www.sisense.com/blog/when-and-how-to-use-surrogate-keys/

Combining Natural and Surrogate Keys Certain business scenarios might require keeping the natural key intact as a means for users to interact with the database. In these cases …

If a natural key is recommended, use a surrogate key field as the primary key, and a natural key as a foreign key. While users may interact with the natural key, the database can still have surrogate keys outside of the users’ view, with no interruption to user experience.

If a natural key must be used without an additional surrogate key, be sure to combine it with a surrogate key element. In our financial database example, Expense Reports (ER-123) have a natural key is used in conjunction with a surrogate sequential key. This format prevents many of the natural key side effects listed above.

I really found that maybe solve my problem. As am afraid from surrogate key. But XPO and ORM providers support Surrogate key by (100%). unlike Composite Key. Which described in many articles as a bad in new databases design. Surrogate Key vs Natural Key for EF Surrogate vs. natural/business keys

And many articles about that. This is not case here. Am speak about combine or use both surrogate key and also add natural key (as indexes or unique indexes only not PK) Now back to schema above and topic above that speak about combine both. I have a Branch which have surrogate key Oid, all tables have Oid as surrogate key. Also I have natural key for example (BranchID),

Branch Table have (Oid[surrogate], BranchID[Natural Key - Unique Index not PK]
InvTransHed have (Oid[Surrogate], itd_brn, itd_type, ith_num) as Natural Key (Unique Index I mean)
InvTransDet have (Oid[Surrogate], itd_brm, itd_typ, itd_num, itd_lne as Natural Key [Unique Indexs])

What I actually make is linking all Surrogate keys Oid with its Foreign Key side. for example: Branch.Oid Linked with InvTransHed.BranchSKey [Skey = surrogate key]

Why I need to combine Surrgoate Key with Natural Key(as Unique Indexes)?

Easy create reporting with JOINs (Branch with InvTransDet directly without moving to InvTransHed).
Readability for technical support. easy to make any join without care about surrogate key. or Linking to parent tables till reach what we need.
Easy to understand and ORM Providers (friendly by 100% for sure)

Here's a questions that blown my head:

Should I linking Surrogate Keys only with their another side Foreign Key. Or I must link also Natural Indexes to their Natural Keys. Branch.BranchID => InvTransHed.ith_brn?
About naming opposite side FK. Branch.Oid surrogate linking with InvTransHed.BranchSkey. Is naming important here to be same for readability?
Guide me please for that (Title = Question).

Avoid composite keys. If you use natural (autonumber) keys, don't need to link surrogates and don't carry surrogate into dependent tables. If you use surrogate keys then please specify as primary and link to its corresponding foreign and the autonumber is irrelevant. I have used each approach in different databases. Yes, naming is important for readability and reduce confusion. — June7, Feb 19 '21 at 20:15
OK what about Master-Detail tables. with surrogate keys. when delete Header I need also delete detail rows. what you will do in that way — deveton, Feb 19 '21 at 20:22
I don't know about SQLServer but in Access can set Enforce Relational Integrity relationship to prevent orphans in dependent detail table. Doesn't matter if key is natural or surrogate. Otherwise, manage data integrity with code. I had concept of natural and surrogate backwards. Surrogate is autonumber, natural is based on real world identifier such as SSN, VIN, etc. — June7, Feb 19 '21 at 22:31
Are you only asking if the index for the surrogate key column should include the FK or if it is enough to only cover it? — Chris Schaller, Feb 20 '21 at 01:46
Does this answer your question? [Unique Constraint vs Unique Index](https://stackoverflow.com/questions/3296230/unique-constraint-vs-unique-index) — Chris Schaller, Feb 20 '21 at 01:48

Chris Schaller · Answer 1 · 2021-02-20T02:28:17.887

1

Constraints like PKs and FKs are our primary mechanisms for documenting and enforcing the referential integrity of the database.

We can't rely on documentation and data schema naming conventions alone to explain to developers and users how to interpret the data. The whole point around using an RDBM is to maintain referential integrity. Referential and any check constraints form the basis of "truth" about the rules that must be adhered to when inserting data into the database.

You have rightly pointed out that you want to enforce uniquness of a combination of an FK and a Natural Surrogate key, this is therfore primarily a constraint.

If the existence of the Constraint is not in question then we only need to consider the Index.

Unique Constraint vs Unique Index

It is not necessary for a Unique Constraint to also be paired with a Unique Index however for discovery by developers it is simply more documentation or at least more discoverable if you also define it as an Index.

When or If performance becomes an issue, you might change the uniqueness of the index or remove it entirely, however you would always want to maintain the unique constraint in this scenario to enfore the integrity of the data.

While not necessary, a consistent naming convention helps with the readability of the data schema and to explain the intent of your structure, consistency is the key though, the first time you violate your convention the reader must now doubt or second guess all the previous assumptions.

For this reason, if conventions might be questioned, existence of constraints will always provide a definitive answer.

The actual naming convention you use becomes personal preference of the designer. I use Id as the PK and {tablename}Id as the FK. When the link is ambiguous, like when a Table has FKs to itself, or multiple FKs to the same table, then I prefix the column with the noun or verb that describes the relationship: CreatedBy_UserId vs ModifiedBy_UserId

I feel that this convention lends itself to readable and natural SQL and C# syntax.

SELECT ID, Description, Modified, ModifiedBy.Name
FROM Product
INNER JOIN User [ModifiedBy] ON Product.ModifiedBy_UserId = ModifiedBy.Id

As a convention, I know that the PK is always Id, So where ever it is used you can assume that it is the principal end of the relationship.

edited Feb 20 '21 at 02:28

answered Feb 20 '21 at 02:17

Chris Schaller

13,704
3
43
81

thanks for these tips. But how to deal with Master-Detail. How to insert Master and make sure that Detail (surrogate key of master column) is ok? In details table it will be MasterId. alright. How to make sure that MasterId in details table are equal Id(PK) of master table? – deveton Feb 20 '21 at 03:01
Because in insert statement you can't know the surrogate keys what it will be... – deveton Feb 20 '21 at 03:01
In reality you _always_ know the FK values, You never try to blindly insert a child record, the `MasterId` must be provided, that is why we make this a constraint, we force the client process to deal with this. The PK of the child record is generated for us, so we don't need to know what it might be until after the insert. – Chris Schaller Feb 20 '21 at 05:36
I think you might have your terminology back to front a bit, which confuses this discussion. With the surrogate key, the constraint again forces the client to at least not provide an invalid number. It is usually either a business domain value, like a check number of a reference to an external system, the database can _rarely_ provide that, but if it is a _sequence_ number like a _line number_, then you can use a trigger to manage that if you need to. – Chris Schaller Feb 20 '21 at 05:38
The database schema shouldn't be concerned that the number used matches a specific sequence, it cares only that it is unique. It is the business domain that might want to enforce a specific sequence or a specific value, so leave the specific value to the business logic layer, do not try to solve it in the database schema. - Triggers and Stored Procedures are your programming surface within the database if you need to use them to enforce business constraints. – Chris Schaller Feb 20 '21 at 05:40
I was think that combine columns into surrogate int value. (surrogate = col1 + col2 + col3) That will generate an integer math number (arbitrary for user) -> This allow master-detail insertion to be 100% sure what is done. Correct me or what behavior you can got. I don't think Identity(1,1) is a useful way to provide a data. sometimes you need to add old transactions from other databases. Creating temp tables, etc is a trash things to do in real-life or technical support overhead. – deveton Feb 20 '21 at 05:41
Am wrong about surrogate = (col1 + col2+ col3). It can be duplicated during 1 +2 +3 = 6 and maybe 2 + 1+ 3 = 6.... I really hate identity records. Wish for help.. – deveton Feb 20 '21 at 05:49
Oh my, please don't do that! The generally accepted pattern is to us an _Identity_ column, or in Oracle a _Sequence_ constraint. This is part of your table Create script. After `INSERT` you have to query to get the Id of the inserted row, or use output clauses in your insert statements. The point of an arbitrary key is that it does not _EVER_ need to match an external system. – Chris Schaller Feb 20 '21 at 05:51
Don't hate on the Identity column, it's there to make relational issues simpler. But if you must use your surrogate value as the PK, then that's up to you, but you will have to manage that entirely in program logic – Chris Schaller Feb 20 '21 at 05:54
what about last answer here... https://stackoverflow.com/questions/54572044/how-to-get-value-greater-than-multiple-columns-value-rather-than-using-composite/54572865 I don't know why Identity key is very bad approach. what about copying a data from same application. from one client database to current database in same application. You can't sure 100% if Master table Id is already exists or not. IF its not exist. Insertion will be OK. but what about Details table? how you get a new Master.Ids that need to inserted to Details table? In non-identity surrogate key. You can easily know keys – deveton Feb 20 '21 at 07:03
In most cases in many companies I see. they need to copy date from old EX:- copy old users table (100+ rows) from other company use same DB to new company use same DB. Also excel sheets will be happy with non-identity PK's when you insert Items to SQL. with identity column. You can copy header table easily. but in Details table how to know Master.Ids which are new and already inserted? any mistake here can't be rolled-back. But if header.Id is generate surrogate. It can generated easily in Details table (DetailsTable.HeaderId). But yes wish for more hope. guides. many days and confusing. :( – deveton Feb 20 '21 at 07:06
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/228986/discussion-between-chris-schaller-and-deveton). – Chris Schaller Feb 20 '21 at 07:46

SQLpro · Answer 2 · 2021-02-22T11:42:56.630

You should never be oblige to have a surrogate key and a primary key, except when the natural key is required. Because the natural key is subject to :

a higher length rather than a technical PK (INT IDENTITY as an example)
some updates because of user entry errors
a dispersion of values that is erratic
and the most terrific : datatype can change in the life of the database (recently Europe standardized the vehicle registration so every european countries has to change the datatype for it !)

The longer a key, the poorer the performance and costly maintenance.

Updating a PK is a nightmare especially when there is many Foreign Keys relying them!

When the values of a key as an erratic dispersion, the stats collected are less acurate so it will potentially results less good execution plans

Changing the datatype of a natural key will block many tables for a long time...

But don't forget that you can have Foreign Keys relying on Unique keys. A FK constraint is a contraint based on another UNIQUE or PK constraint...

Should we combine Surrogate Key with Natural Key(Unique Index Mean) for readability and easy query selection?

2 Answers2