108

In which situations you should use inherited tables? I tried to use them very briefly and inheritance didn't seem like in OOP world.

I thought it worked like this:

Table users has all fields required for all user levels. Tables like moderators, admins, bloggers, etc but fields are not checked from parent. For example users has email field and inherited bloggers has it now too but it's not unique for both users and bloggers at the same time. ie. same as I add email field to both tables.

The only usage I could think of is fields that are usually used, like row_is_deleted, created_at, modified_at. Is this the only usage for inherited tables?

BinaryButterfly
  • 18,137
  • 13
  • 50
  • 91
raspi
  • 5,962
  • 3
  • 34
  • 51

8 Answers8

125

There are some major reasons for using table inheritance in postgres.

Let's say, we have some tables needed for statistics, which are created and filled each month:

statistics
    - statistics_2010_04 (inherits statistics)
    - statistics_2010_05 (inherits statistics)

In this sample, we have 2.000.000 rows in each table. Each table has a CHECK constraint to make sure only data for the matching month gets stored in it.

So what makes the inheritance a cool feature - why is it cool to split the data?

  • PERFORMANCE: When selecting data, we SELECT * FROM statistics WHERE date BETWEEN x and Y, and Postgres only uses the tables, where it makes sense. Eg. SELECT * FROM statistics WHERE date BETWEEN '2010-04-01' AND '2010-04-15' only scans the table statistics_2010_04, all other tables won't get touched - fast!
  • Index size: We have no big fat table with a big fat index on column date. We have small tables per month, with small indexes - faster reads.
  • Maintenance: We can run vacuum full, reindex, cluster on each month table without locking all other data

For the correct use of table inheritance as a performance booster, look at the postgresql manual. You need to set CHECK constraints on each table to tell the database, on which key your data gets split (partitioned).

I make heavy use of table inheritance, especially when it comes to storing log data grouped by month. Hint: If you store data, which will never change (log data), create or indexes with CREATE INDEX ON () WITH(fillfactor=100); This means no space for updates will be reserved in the index - index is smaller on disk.

UPDATE: fillfactor default is 100, from http://www.postgresql.org/docs/9.1/static/sql-createtable.html:

The fillfactor for a table is a percentage between 10 and 100. 100 (complete packing) is the default

BinaryButterfly
  • 18,137
  • 13
  • 50
  • 91
S38
  • 1,332
  • 1
  • 8
  • 4
  • 23
    Another example of partioning – Frank Heikens Jun 19 '10 at 11:20
  • 5
    In your item 1, how does Postgres understand which of the tables is needed to search in? You select from the parent table, and dates range is only a convenient example of splitting. Parent table can't know this logic. Or I'm wrong? – Alexander Palamarchuk Sep 28 '12 at 10:45
  • 4
    Performing a query on the parent table is effectively the same as performing a query on a UNION ALL across every descendant table on common rows. The query planner is aware of the check constraints that define each partition, and as long as they don't overlap partitions uses them to determine that it can skip checking tables for which CHECKs indicate no rows would be returned. [Postgres docs on this](http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html) – zxq9 Apr 23 '13 at 12:23
  • 1
    @avesus heh... The code above taken by itself is worthy of such sarcasm. It is typical to wrap this sort of thing up into a maintenance routine of some sort. This can be as simple as a stored procedure that takes care of it on some condition, a cron job, or whatever. It is common to partition by date, but I've found myself partitioning by tablespace allocation from time to time as well, and that requires some external information -- the 30 minutes it takes to write a partition babysitter is well worth it for the control it gives you. – zxq9 Jan 05 '15 at 13:14
  • Hmm. Are you sure it doesn't block? I have a similar setup, but when I run the CLUSTER command on a single partition, a SELECT statement on data held by another partition blocks! – E. van Putten Sep 03 '20 at 10:23
47

"Table inheritance" means something different than "class inheritance" and they serve different purposes.

Postgres is all about data definitions. Sometimes really complex data definitions. OOP (in the common Java-colored sense of things) is about subordinating behaviors to data definitions in a single atomic structure. The purpose and meaning of the word "inheritance" is significantly different here.

In OOP land I might define (being very loose with syntax and semantics here):

import life

class Animal(life.Autonomous):
  metabolism = biofunc(alive=True)

  def die(self):
    self.metabolism = False

class Mammal(Animal):
  hair_color = color(foo=bar)

  def gray(self, mate):
    self.hair_color = age_effect('hair', self.age)

class Human(Mammal):
  alcoholic = vice_boolean(baz=balls)

The tables for this might look like:

CREATE TABLE animal
  (name       varchar(20) PRIMARY KEY,
   metabolism boolean NOT NULL);

CREATE TABLE mammal
  (hair_color  varchar(20) REFERENCES hair_color(code) NOT NULL,
   PRIMARY KEY (name))
  INHERITS (animal);

CREATE TABLE human
  (alcoholic  boolean NOT NULL,
   FOREIGN KEY (hair_color) REFERENCES hair_color(code),
   PRIMARY KEY (name))
  INHERITS (mammal);

But where are the behaviors? They don't fit anywhere. This is not the purpose of "objects" as they are discussed in the database world, because databases are concerned with data, not procedural code. You could write functions in the database to do calculations for you (often a very good idea, but not really something that fits this case) but functions are not the same thing as methods -- methods as understood in the form of OOP you are talking about are deliberately less flexible.

There is one more thing to point out about inheritance as a schematic device: As of Postgres 9.2 there is no way to reference a foreign key constraint across all of the partitions/table family members at once. You can write checks to do this or get around it another way, but its not a built-in feature (it comes down to issues with complex indexing, really, and nobody has written the bits necessary to make that automatic). Instead of using table inheritance for this purpose, often a better match in the database for object inheritance is to make schematic extensions to tables. Something like this:

CREATE TABLE animal
  (name       varchar(20) PRIMARY KEY,
   ilk        varchar(20) REFERENCES animal_ilk NOT NULL,
   metabolism boolean NOT NULL);

CREATE TABLE mammal
  (animal      varchar(20) REFERENCES animal PRIMARY KEY,
   ilk         varchar(20) REFERENCES mammal_ilk NOT NULL,
   hair_color  varchar(20) REFERENCES hair_color(code) NOT NULL);


CREATE TABLE human
  (mammal     varchar(20) REFERENCES mammal PRIMARY KEY,
   alcoholic  boolean NOT NULL);

Now we have a canonical reference for the instance of the animal that we can reliably use as a foreign key reference, and we have an "ilk" column that references a table of xxx_ilk definitions which points to the "next" table of extended data (or indicates there is none if the ilk is the generic type itself). Writing table functions, views, etc. against this sort of schema is so easy that most ORM frameworks do exactly this sort of thing in the background when you resort to OOP-style class inheritance to create families of object types.

zxq9
  • 13,020
  • 1
  • 43
  • 60
  • What if you were adding every known mamal? Would you inherit from mammal or have a foreign key like you did here? The problem I have with foreign keys is that you end up having to do so many joins. – puk Jul 31 '15 at 22:02
  • 2
    @puk You would first need to decide *why* you were adding every known mammal. The shape of the data is going to be determined by the way that data will be used (it probably isn't necessary to have a table per animal in this case -- consider databases for game bestiaries where you really do have every type of mob). In the case above I would normally add a view that is the most-common case of `mammal JOIN human`, just because writing a join every time is annoying. But *do not avoid joins*. Joins are what puts the R in RDBMS. If you don't like joins you should use a different db type. – zxq9 Aug 01 '15 at 02:28
  • @zxq9: I'm guessing that massive, inefficient joins due to large tables are where materialized views come into play? (I've not been using Postgres for that long) – Mark K Cowan Dec 14 '16 at 18:33
  • 3
    @MarkKCowan Joins are not inefficient. What is inefficient is trying to join on non-indexed, non-unique fields (because the schema isn't anywhere close to being normalized) due to sloppy design. In those cases a materialized view can be helpful. Materialized views are also helpful in the case you need normalized data as your schematic foundation (often true), but also need several working, denormalized representations that are easier to work with either for processing efficiency (front-load the computation) or cognitive efficiency. If you write more than read, it is a pessimization, though. – zxq9 Dec 14 '16 at 23:43
  • @zxq9: Sorry, I should have used the word "slow" rather than "inefficient" there – Mark K Cowan Dec 15 '16 at 00:28
  • 2
    @MarkKCowan "Slow" is a relative term. In large business systems and game servers where we can accept ~50ms to return a query, 20 table joins have never been a problem (in Postgres 8+, anyway) in my experience. But in cases where management wants <1ms responses to >10b row joins across 5+ tables on unindexed data (or derived values!)... no system in the world will feel "fast" other than making this join last month and stashing it in a fast K/V store (which is essentially what a materialized view can act as in special circumstances). Can't escape a tradeoff at either write or read time. – zxq9 Dec 15 '16 at 08:56
8

Inheritance can be used in an OOP paradigm as long as you do not need to create foreign keys on the parent table. By example, if you have an abstract class vehicle stored in a vehicle table and a table car that inherits from it, all cars will be visible in the vehicle table but a foreign key from a driver table on the vehicle table won't match theses records.

Inheritance can be also used as a partitionning tool. This is especially usefull when you have tables meant to be growing forever (log tables etc).

  • 1
    Table constraints are not inherited, so it's more than just foreign keys. You can apply the table constraints on the child table(s) as they are created in your DDL, or you can write triggers to effect the same constraints. – Wexxor Jul 27 '12 at 18:23
3

Main use of inheritance is for partitioning, but sometimes it's useful in other situations. In my database there are many tables differing only in a foreign key. My "abstract class" table "image" contains an "ID" (primary key for it must be in every table) and PostGIS 2.0 raster. Inherited tables such as "site_map" or "artifact_drawing" have a foreign key column ("site_name" text column for "site_map", "artifact_id" integer column for the "artifact_drawing" table etc.) and primary and foreign key constraints; the rest is inherited from the the "image" table. I suspect I might have to add a "description" column to all the image tables in the future, so this might save me quite a lot of work without making real issues (well, the database might run little slower).

EDIT: another good use: with two-table handling of unregistered users, other RDBMSs have problems with handling the two tables, but in PostgreSQL it is easy - just add ONLY when you are not interrested in data in the inherited "unregistered user" table.

Community
  • 1
  • 1
Pavel V.
  • 2,653
  • 10
  • 43
  • 74
2

The only experience I have with inherited tables is in partitioning. It works fine, but it's not the most sophisticated and easy to use part of PostgreSQL.

Last week we were looking the same OOP issue, but we had too many problems with Hibernate - we didn't like our setup, so we didn't use inheritance in PostgreSQL.

Utsav Shah
  • 35
  • 4
Frank Heikens
  • 117,544
  • 24
  • 142
  • 135
1

I use inheritance when I have more than 1 on 1 relationships between tables.

Example: suppose you want to store object map locations with attributes x, y, rotation, scale.

Now suppose you have several different kinds of objects to display on the map and each object has its own map location parameters, and map parameters are never reused.

In these cases table inheritance would be quite useful to avoid having to maintain unnormalised tables or having to create location id’s and cross referencing it to other tables.

Maarten
  • 11
  • 1
0

I tried some operations on it, I will not point out if is there any actual use case for database inheritance, but I will give you some detail for making your decision. Here is an example of PostgresQL: https://www.postgresql.org/docs/15/tutorial-inheritance.html You can try below SQL script.

CREATE TABLE IF NOT EXISTS cities (
  name       text,
  population real,
  elevation  int     -- (in ft)
);

CREATE TABLE IF NOT EXISTS capitals (
  state      char(2) UNIQUE NOT NULL
) INHERITS (cities);

ALTER TABLE cities
ADD test_id varchar(255); -- Both table would contains test col
DROP TABLE cities; -- Cannot drop because capitals depends on it

ALTER TABLE cities
ADD CONSTRAINT fk_test FOREIGN KEY (test_id) REFERENCES sometable (id);

As you can see my comments, let me summarize:

  • When you add/delete/update fields -> the inheritance table would also be affected.
  • Cannot drop the parent table.
  • Foreign keys would not be inherited.

From my perspective, in growing applications, we cannot easily predict the changes in the future, for me I would avoid applying this to early database developing.

When features are stable as well and we want to create some database model which much likely the same as the existing one, we can consider that use case.

Justin Dang
  • 234
  • 3
  • 7
-4

Use it as little as possible. And that usually means never, it boiling down to a way of creating structures that violate the relational model, for instance by breaking the information principle and by creating bags instead of relations.

Instead, use table partitioning combined with proper relational modelling, including further normal forms.

Leandro
  • 169
  • 11
  • 4
    It is not true that PostgreSQLs inheritance feature violates the relational model by breaking the information principle. The information principle says, that all data in a relational database is represented by data values in relations and all query results are again represented as a relation.(https://en.wikipedia.org/wiki/Relational_model) This is always the case, since all tables, that inherit another table, are simple tables again. For that reason there is also no such thing as a "bag", whatever that means. – Roland Jan 17 '17 at 10:34
  • 2
    Well, Wikipedia is hardly a reference as regards the relational model; it refuses to recognise SQL violates the relational model. A bag is a table without a key, because potentially it has duplicates, thus not being a relation; a relation must be a set. – Leandro Jan 18 '17 at 21:17
  • That is not a problem of the feature itself, but how it is used. If you work with uuids as identifiers, you will have unique keys over all sub-tables. – Roland Jan 19 '17 at 09:34
  • You have a point, but the issue here is that inheritance leads the modeller to ignore the relational model. UUIDs are not real keys, but surrogate ones. One still has to declare natural keys. – Leandro Sep 12 '18 at 11:48
  • 1
    Indeed, inheritance is currently too limited to be in any way useful. Not being able to index across the whole hierarchy is a no-go. – alecov Mar 09 '23 at 19:03