I did some research but couldn't find a clear answer.
I have a simple many-to-many relational table, the user_id
will be foreign and group_id
will be foreign, is there a reason to add a PRIMARY KEY?
I did some research but couldn't find a clear answer.
I have a simple many-to-many relational table, the user_id
will be foreign and group_id
will be foreign, is there a reason to add a PRIMARY KEY?
I'm going to post a dissenting opinion from the accepted answer. You should not add an auto-increment primary key to join (or cross-reference) tables. Here's why.
First, you're not going to use it. The only reason this table exists is to track the many-to-many relationship between your users and groups. Adding the auto-increment field just adds another field that you're not going to read. In other words, the only time you're going to query this table is because you want to see which users belong to a group, or which groups a user is a member of. In both cases, this extra primary key table will not be used.
Secondly, if you add additional data to this table, the primary key would still be useless. Consider: If we add a field to indicate that a group membership is "primary" for a given user, we still need to know both the user and the group for this "primary-ness" to be meaningful.
Technically you never need an auto-incremented, "surrogate" key for any table. You just need a field or set of fields that you know to be unique (thus putting us in at least 2nd Normal Form). The trouble is that most of the time, we're very bad at predicting what will be sufficiently unique, so we add this extra key to protect ourselves from that problem. Cross-references are not one of the cases we're bad at predicting. It will never be the case that a user will be a member of the same group twice, and it will never be the case that a group will contain the same user twice. We know that combination to be unique, so the additional surrogate key adds nothing.
However, note that if you're adding fields to a table like this (other than the two foreign keys), you may actually be modeling something that's not really a many-to-many. You may actually be modeling one-to-many-to-one. For instance, if you're adding fields to the mapping of users to groups, you may actually be modeling a "Membership" entity. That might have earned a surrogate key, because it might start to make sense to have one user with different memberships to the same group at different times. But again, that's modeling something else than a many-to-many relationship.
What if you ever want to do something more interesting with that table (let's call it table A), such as make a satellite off of A or have another many-to-many table connecting A and some other table B?
Might there be a time where you want a table to reference an occurrence of a certain user_id
with a certain group_id
?
Basically, if you ever want to be referencing table A, or table A is a parent of some other table, you need a primary key on table A. And in my mind, in a relational DB, it's very likely you'll want to be referencing it at some point.
So I say add an auto-increment PK (assuming you're okay with duplicate user_id, group_id
rows, like @Leslie said in a comment.
Also, note that if you give it an auto-increment PK, it would still be worthwhile to put an index on user_id
and an index on group_id
for joining.
Other similar SO questions: