How do I link two tables, People
and Messages
, when there is no single column in the Messages
table that corresponds to the primary key of the People
table?
I have two tables: one containing email messages called Messages
and one of people called People
.
People
| person_id | email_address |
| -------- | ---------------- |
| 001 | `billy@gmail.com` |
| 002 | `bobby@gmail.com` |
| 003 | `gregg@gmail.com` |
| 004 | `janet@gmail.com` |
| 005 | `boss1@gmail.com` |
Messages
| msg_id | from_addr | to_addr | cc_addr | bcc_addr |
| ------ | ----------------- | ------------------------------------ | ----------------- | ----------- |
| 001 | `billy@gmail.com` | `gregg@gmail.com`, `david@gmail.com` | `boss1@gmail.com` | NULL |
| 002 | `bobby@gmail.com` | `gregg@gmail.com` | NULL | NULL |
| 003 | `janet@gmail.com` | `boss1@gmail.com` | `bobby@gmail.com` | `gregg@gmail.com` |
I will be programmatically passed a person_id
(say 005
for boss1@gmail.com
) and want to be able to query the Messages
dataset to get all "involved" emails--where a boss1@gmail.com
was sent, received, was cc'd or bcc'd in the email message. So with the dataset above, the query ought to return the rows with msg_id
001
and 003
.
I don't know to map the relation between the People
and Messages
tables.
Say the primary key of of the People
table is person_id
. There is no single column that I can link that to in the Messages
table to act as a foreign key.
I cannot determine how to represent this data correctly.
Am I doing this wrong? What is a better way to represent these two tables with more tables structured differently?
After normalising to FNF the data can be represented as follows.
Messages_FNF
| msg_id | from_addr | to_addr | cc_addr | bcc_addr |
| ------ | ----------------- | ----------------- | ----------------- | ---------------- |
| 001 | `billy@gmail.com` | `gregg@gmail.com` | `boss1@gmail.com` | NULL |
| 001 | `billy@gmail.com` | `david@gmail.com` | `boss1@gmail.com` | NULL |
| 002 | `bobby@gmail.com` | `gregg@gmail.com` | NULL | NULL |
| 003 | `janet@gmail.com` | `boss1@gmail.com` | `bobby@gmail.com` | `gregg@gmail.com` |
However, the primary key (that uniquely identifies each row in the table) for Messages_FNF
would have to be a composite key combining msg_id
, from_addr
, to_addr
, cc_addr
, bcc_addr
.
So how do I link the People
and Messages
tables together using Primary and Foreign keys?
Or must I further normalise or partition these tables such that the People
table and Messages_FNF
have a meaningful matching field - allowing for easy joins? If so, how?
I've read that the cross join lets you join tables that do not have a common field but I worry that it will result in a VERY large table with unnecessary rows and be computationally expensive.
Is linking the tables with a PK->FK the right way of structuring the data in this problem?