46

Where I work we use a postgres database (8.3 soon to migrate to 8.4). There is a small debate currently on the use of enums in the database. Personally I do not like the db enum type. Among other things it puts application logic in the database and creates a posibility for a mismatch between code and data.

I was wondering what exactly are the advantages of postgres enums (besides readablity) and what are the disadvatages?

Rodrigo
  • 483
  • 8
  • 14
Jasper Floor
  • 4,632
  • 6
  • 26
  • 21
  • 1
    Why is it bad to have business (/application) logic in DBMS? The benefits are easy: if you ever need to work with your data directly via JDBC connection or use another Front-End, while your business logic is in DBMS - you don't have to do any changes, you will not have to get wet (break DRY). – Mihail Gershkovich Apr 20 '21 at 20:27

7 Answers7

68

The advantages of enums are:

  • Performance is better. You can just display what you get out of the core table instead of either having a separate lookup table that translates a code to a value or having app logic that translates a code to a value. This can be especially useful in datawarehouse applications.
  • Ad hoc SQL is easier to write

The disadvantages are:

  • Encoding display values into your database ddl is bad form. If you translate the enum value into a different display value in your app code, then you lost a lot of the advantages of using enums.
  • Adding values requires DDL changes
  • Makes language localization difficult
  • Database portability is decreased
Mark
  • 5,499
  • 34
  • 29
  • 15
    If you're defining values that control will modify the behaviour of your application, then surely "Adding values requires DDL changes" is an advantage? Otherwise you're increasing the risk of apparently innocuous database operations breaking your application. –  Jul 15 '11 at 02:57
  • I wish I would be able to implement a multi-lang ENUM.... but I'm not a C[++/#, whatsoever] programmer and barely understand the source code of PostgreSQL. However if there would be such a thing - I would totally love it! And there is one more benefit: you can use an ENUM for the PK key in a lookup table and respective FKs. If you ever have to change the value of the lookup key, you just change the ENUM and DBMS does not need to cascade the update to all FK fields. – Mihail Gershkovich Apr 20 '21 at 20:17
  • 1
    It may be worth noting that the performance is better versus a JOIN on another table to get the display value. It's not better versus a text field. (See my answer) – Jeff Aug 18 '21 at 15:10
  • 1
    @Jeff, the performance is the same in comparison to a text field, if you the amount of pages is same and you can leverage index. If you need to perform a full table search, or if you fetch lots of entries it might have a huge impact. It might be a lower effect for the final transaction (between DBMS and client), but pretty high effect for processes inside DBMS. – Mihail Gershkovich Sep 30 '21 at 13:19
11

Enums combine the advantages of ints with the advantages of strings: they are small and fast like ints, readable like strings, and have the additional advantage of being safe (you can't mis-spell an enum).

However, if you don't care about readability, an int is as good as an enum.

Denise
  • 1,947
  • 2
  • 17
  • 29
10

In PostgreSQL 13, btree indexes now support deduplication. If we take the following real-world example of using an ENUM to represent HTTP methods in a log table with 100 million rows:

public | test_http_enum_idx | index | postgres | test | permanent   | 789 MB  | 
public | test_http_test_idx | index | postgres | test | permanent   | 789 MB  | 

We can see that the index size for both are the same. For a denormalized table, saving a few bytes per row doesn't really make up for the disadvantages.

Rule of thumb for PG 13+: Use ENUMs to constrain a column to a fixed/static set of values; do not to use them to save on disk space.

Possible exception: If an ENUM of static values will help you avoid a costly JOIN or a FK --- go for it; just make sure to avoid premature optimization and measure your results in production.

When making your decision, consider that popular BI tools like Metabase did not support filtering on an ENUM, however, they'll work just fine on a TEXT column. @solaris: Reports that Metabase 0.42.1 and higher support filtering on ENUM values.

Jeff
  • 458
  • 6
  • 12
  • But aren't the ENUM fields processed like text fields in BI tools? And what has the index size to do with real table size? – Mihail Gershkovich Apr 20 '21 at 20:20
  • @MihailGershkovich: Indexes and tables both occupy real disk space but are reported separately. You may see a representation of a table's total size which may include the size of the indexes depending on the underlying query. Every BI tool is different, some will treat an ENUM differently and provide a "drop down" instead of a text field when it comes to search and filtering, others may not handle it at all like Metabase. – Jeff Aug 18 '21 at 15:08
  • We have some tables which decreased their real size from ~64 GB to <21, by using ENUMS, and we killed some lookup tables. The final table is more normalized now, and the queries became much faster (due to decreased size of tables). If you look at queries which use index only, it will not matter. We have a mixed OLTP+OLAP environment, and even for OLTP we often fetch hundreds of lines at once. ENUMS created real difference for us. – Mihail Gershkovich Sep 30 '21 at 13:15
  • @MihailGershkovich: Interesting. What version of PostgreSQL was this with? Had you dropped/recreated indexes to take advantage of the new BTREE deduplication? – Jeff Jan 06 '22 at 19:21
  • 1
    I just successfully filtered by an ENUM column in Metabase (v0.42.1). It provided me a multi-select dropdown for filtering so seems to have proper value discovery too. – solaris May 03 '22 at 01:24
  • @solaris: thanks I updated my answer to reflect this. – Jeff May 06 '22 at 14:46
5

As advantage you have also DB checking, that nothing else enum value couldn't be recorded in column. The big disadvantage for me was, that enum could be modified only by adding value to the end, but since Postgres 9.1 it is in the past: https://stackoverflow.com/a/7834949/548473

Community
  • 1
  • 1
Grigory Kislin
  • 16,647
  • 10
  • 125
  • 197
1

Advantages

  1. Reduce Storage: Postgres uses only 1 byte per tuple when 255 or less ENUM elements are defined or 2 bytes for 256~65535 elements. This is because, rather that storing the constant literal of the value, Postgres stores the index in the ordered set of that value. For very large tables, this might prove to be a significant storage space save.

  2. Arbitrary sorting:

CREATE TABLE opening_hours(
    week_day ENUM ('Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday'),
    opening_time TIME,
    closing_time TIME
);

If you sort by week_day it will sort in the order you specified which is convenient in the above case.

  1. Cheap constraints: Instead of doing checking in your application code or some complicated db constraint, enums check that only certain values are added in a cheap way.

Disadvantages

  • The list of options CANNOT be controlled by end-users since ENUM is part of the schema
  • An additional query is required to see the list of options
  • String operations and functions do not work on ENUM This is due to ENUM being a separate data type from the built-in data types like NUMERIC or TEXT. This can be overcome by casting ENUM values to TEXT when being operated. However, it can be a pain when using ORM.
villy393
  • 2,985
  • 20
  • 28
  • If you understand web-app or GUI users by end-users - where is the hassle to create a simple applet, which can manage the ENUMs? If you are talking about DBMS roles, where is the issue to grant them access to do so? The only excuse is lack of knowledge... but if that's the issue, probably one should not work with data at all. – Mihail Gershkovich Apr 20 '21 at 20:25
  • 3
    The docs for Postgres 14 says the enum type takes 4 bytes of disk space: https://www.postgresql.org/docs/14/datatype-enum.html – Morteza Milani Dec 15 '21 at 07:39
  • The advantages portion of this answer appears to be out of date as of PG 13. There's some good points in the Disadvantages. – Jeff Sep 22 '22 at 14:22
0

I prefer a text field with a foreign key to either an ENUM field or a text field without a foreign key.

Advantages of a text field with a foreign key:

  • Lookups return a text value without needing a join.
  • Inserts enforce referential integrity.
  • The table defining the possible values for your foreign key can be easily modified.

Disadvantages:

  • A text field requires additional storage compared to an ENUM field. In most cases, this is unlikely to be an issue.
  • An insert requires an additional lookup to enforce referential integrity. In most cases, this is unlikely to be an issue.

Example:

create table example_table_example_type (
  example_type text primary key
);

create table example_table (
  example_type text not null references example_table_example_type(example_type),
  ...other fields...
);
Monica For CEO
  • 499
  • 6
  • 20
-4

Point is, if applications are allowed to do DDL, they are more likely to cause blocking or conflict. DDL is best done offline i.e. in single-user mode.

  • 2
    This response has nothing to do with the real question here. And even less with PostgreSQL. PostgreSQL is quite good in running DDL while running simple transactions. And actually many other databases aswell. You just have to know how to do it. – Mihail Gershkovich Sep 30 '21 at 13:20