I don't know what database you're using, I'll use Postgres because it's free, follows the SQL standard, has good documentation, and is very powerful.
As Gordon said, you seem to have three things: users, meals, and locations. Three things means one table for each. This avoids storing redundant data. The whole topic is database normalization.
create table users (
id bigserial primary key,
username text not null unique,
email text not null unique,
first_name text not null,
last_name text not null
);
create table meals (
id bigserial primary key,
type text not null unique,
-- If calories vary by user, move this into user_meals.
calories integer not null
);
create table locations (
id bigserial primary key,
-- The specific information you have about a location will vary,
-- but this is a good start. I've allowed nulls because often
-- people don't have full information about their location.
name text,
address text,
city text,
province text,
country text,
postal_code text
);
You asked about compound keys. Don't bother. There's too many potential problems. Use a simple, unique, auto-incrementing big integer on every table.
Primary keys must be unique and unchanging. Usernames, names, email addresses... these can all change. Even if you think they won't, why bake the risk into your schema?
Foreign keys will be repeated all over the database and indexes many times. You want them to be small and simple and fixed size to not use up any more storage than necessary. Integers are small and simple and fixed size. Text is not.
Compound primary keys potentially leak information. A primary keys refer to a row, and those often show up in URLs and the like. If we were to use the user's email address or name that risks leaking personally identifiable information.
I've chosen bigserial
for the primary key. serial
types are auto-incrementing so the database will take care of assigning each new row a primary key. bigserial
uses a 64 bit integer. A regular integer can only hold about 2 to 4 billion entries. That sounds like a lot, but it isn't, and you do not want to run out of primary keys. bigserial
can handle 9 quintillion. An extra 32 bits per row is worth it.
Some more notes.
- Everything is
not null
unless we have a good reason otherwise. This will catch data entry mistakes. It makes the database easier to work with because you know the data will be there.
- Similarly, anything which is supposed to be unique is declared
unique
so it is guaranteed.
- There are no arbitrary limits on column size. You might see other examples like
email varchar(64)
or the like. This doesn't actually save you any space, it just puts limits on what is allowed in the database. There's no fundamental limit on how long a username or email address can be, that's a business rule. Business rules change all the time; they should not be hard coded into the schema. Enforcing business rules is the job of the thing inputting the data, or possibly triggers.
Now that we have these three tables, we can use them to record people's meals. To do that we need a fourth table to record a user having a meal at a location: a join table. A join table is what allows us to keep information about users, meals, and locations each in one canonical place. The users, locations, and meals are referred to by their IDs known as a foreign key.
create table user_meals (
id bigserial primary key,
user_id bigint not null references users(id),
meal_id bigint not null references meals(id),
location_id bigint not null references locations(id),
-- Using a time range is more flexible than start + duration.
taken_at tstzrange not null,
-- This can store up to 99999.99 which seems reasonable.
cost numeric(7, 2) not null,
notes text,
-- When was this entry created?
created_at timestamp not null default current_timestamp,
-- When was it last updated?
updated_at timestamp not null default current_timestamp
);
- Because we're using
bigserial
for everything's primary key, referring to other tables is simple: they're all bigint
.
- As before, everything is
not null
unless we have a good reason otherwise. Not every meal will have notes
, for example.
- Fairly standard
created_at
and updated_at
fields are used to record when the entry was created or updated.
- Imprecise numeric types such as
float
or double
are avoided, especially for money. Instead the arbitrary precision type numeric
is used to store money precisely.
numeric
requires we give it a precision and scale; we must choose a limit, so I picked something unreasonably high in my currency. Your currency may vary.
- Rather than storing a start time and meal duration, we take advantage of Postgres's range types to store a time range from when the meal started to when it ended. This allows us to use range functions and operators to make queries about the meal time much simpler. For example,
where taken_at @> '2020-02-15 20:00'
finds all meals which were being taken at 8pm on Feb 15th.
There's more to do, such as adding indexes for performance, hopefully this will get you started. If there's one take away it's this: don't try to cram everything into one table.
Try it out.