144

What is the best way to save enums into a database?

I know Java provides name() and valueOf() methods to convert enum values into a String and back. But are there any other (flexible) options to store these values?

Is there a smart way to make enums into unique numbers (ordinal() is not safe to use)?

Update

Thanks for all awesome and fast answers! It was as I suspected.

However, a note to toolkit: That is one way. The problem is that I would have to add the same methods to each enum type that I create. That's a lot of duplicated code and, at the moment, Java does not support any solutions for this (a Java enum cannot extend other classes).

Adil Hussain
  • 30,049
  • 21
  • 112
  • 147
user20298
  • 1,795
  • 3
  • 16
  • 10

11 Answers11

190

We never store enumerations as numerical ordinal values anymore; it makes debugging and support way too difficult. We store the actual enumeration value converted to string:

public enum Suit { Spade, Heart, Diamond, Club }

Suit theSuit = Suit.Heart;

szQuery = "INSERT INTO Customers (Name, Suit) " +
          "VALUES ('Ian Boyd', %s)".format(theSuit.name());

and then read back with:

Suit theSuit = Suit.valueOf(reader["Suit"]);

The problem was in the past staring at Enterprise Manager and trying to decipher:

Name          Suit
------------  ----
Kylie Guénin  2
Ian Boyd      1

verses

Name          Suit
------------  -------
Kylie Guénin  Diamond
Ian Boyd      Heart

the latter is much easier. The former required getting at the source code and finding the numerical values that were assigned to the enumeration members.

Yes it takes more space, but the enumeration member names are short, and hard drives are cheap, and it is much more worth it to help when you're having a problem.

Additionally, if you use numerical values, you are tied to them. You cannot nicely insert or rearrange the members without having to force the old numerical values. For example, changing the Suit enumeration to:

public enum Suit { Unknown, Heart, Club, Diamond, Spade }

would have to become :

public enum Suit { 
      Unknown = 4,
      Heart = 1,
      Club = 3,
      Diamond = 2,
      Spade = 0 }

in order to maintain the legacy numerical values stored in the database.

How to sort them in the database

The question comes up: lets say i wanted to order the values. Some people may want to sort them by the enum's ordinal value. Of course, ordering the cards by the numerical value of the enumeration is meaningless:

SELECT Suit FROM Cards
ORDER BY SuitID; --where SuitID is integer value(4,1,3,2,0)

Suit
------
Spade
Heart
Diamond
Club
Unknown

That's not the order we want - we want them in enumeration order:

SELECT Suit FROM Cards
ORDER BY CASE SuitID OF
    WHEN 4 THEN 0 --Unknown first
    WHEN 1 THEN 1 --Heart
    WHEN 3 THEN 2 --Club
    WHEN 2 THEN 3 --Diamond
    WHEN 0 THEN 4 --Spade
    ELSE 999 END

The same work that is required if you save integer values is required if you save strings:

SELECT Suit FROM Cards
ORDER BY Suit; --where Suit is an enum name

Suit
-------
Club
Diamond
Heart
Spade
Unknown

But that's not the order we want - we want them in enumeration order:

SELECT Suit FROM Cards
ORDER BY CASE Suit OF
    WHEN 'Unknown' THEN 0
    WHEN 'Heart'   THEN 1
    WHEN 'Club'    THEN 2
    WHEN 'Diamond' THEN 3
    WHEN 'Space'   THEN 4
    ELSE 999 END

My opinion is that this kind of ranking belongs in the user interface. If you are sorting items based on their enumeration value: you're doing something wrong.

But if you wanted to really do that, i would create a Suits dimension table:

Suit SuitID Rank Color
Unknown 4 0 NULL
Heart 1 1 Red
Club 3 2 Black
Diamond 2 3 Red
Spade 0 4 Black

This way, when you want to change your cards to use Kissing Kings New Deck Order you can change it for display purposes without throwing away all your data:

Suit SuitID Rank Color CardOrder
Unknown 4 0 NULL NULL
Spade 0 1 Black 1
Diamond 2 2 Red 1
Club 3 3 Black -1
Heart 1 4 Red -1

Now we are separating an internal programming detail (enumeration name, enumeration value) with a display setting meant for users:

SELECT Cards.Suit 
FROM Cards
   INNER JOIN Suits ON Cards.Suit = Suits.Suit
ORDER BY Suits.Rank, 
   Card.Rank*Suits.CardOrder
    
Ian Boyd
  • 246,734
  • 253
  • 869
  • 1,219
  • 23
    toString is often overriden to provide display value. name() is a better choice as it's by definition the counterpart of valueOf() – ddimitrov Oct 23 '08 at 14:40
  • Worht noting that enum classes can be extended and you can add fields / methods. I've used this to make enums which values which correspond to bit fields (1,2,4,8...) so they can be OR'd together. – basszero Oct 23 '08 at 18:06
  • What is this "name()" method you speak of? What is this "valueOf()" method you speak of? The only way i can find to convert an enumeration member to a string is with the .ToString() method of the enumeration variable. – Ian Boyd Oct 23 '08 at 20:22
  • @anonymousstackoverflowuser.openid.org: See http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Enum.html#name() – flicken Oct 23 '08 at 21:45
  • Excellent implementation and answer. Many thanks for the advice – Helios Dec 28 '08 at 08:03
  • 10
    I strongly disagree with this, if enum persistence is required then should not persist names. as far as reading it back goes it is even simpler with value instead of name can just typecast it as SomeEnum enum1 = (SomeEnum)2; – mamu Sep 03 '09 at 03:49
  • 3
    mamu: What happens when the numeric equivalents change? – Ian Boyd Sep 04 '09 at 15:34
  • @basszero If in .Net, for flags you want to use the [Flags] attribute: http://msdn.microsoft.com/en-us/library/system.flagsattribute(VS.71).aspx – xanadont Mar 05 '10 at 22:01
  • Another vote for names, and we were able to run delta scripts to convert old numerical values to the newer name-based (using MySQL, so that helped). – Nick Spacek Dec 01 '11 at 13:26
  • 4
    I would discourage anyone using this approach. Tying yourself to string representation limits code flexibility and refactoring. You should better use unique ids. Also storing strings wastes storage space. – Tautvydas Apr 18 '14 at 18:57
  • and if need to query the enum you can use Enum.valueOf(Suit .class, c.getString(c.getColumnIndex("Suit"))) – Omid Aminiva Nov 24 '15 at 18:13
  • What about sorting? If I want to sort results in the query by the stored enum, when choosing the String-representation it would be implicitly defined by the names of the enumeration, whereas if we implement an int-value for each enumeration value, the ordering is left to the programmer. – SebastianRiemer Aug 23 '18 at 06:45
  • @SebastianRiemer The ordering isn't left to the programmer. If the programmer wanted to insert a new enumeration value in the middle: i can't ; not without breaking all existing values. But say the programmer did mistakenly save the int values of the enum, and how you wanted to do sorting. If i want to sort the results in the query, but i only have access to the corresponding int value: how would i do it? Simple: `ORDER BY CASE WidgetID WHEN 1 THEN 1 WHEN 2 THEN 3 WHEN 3 THEN 2 END`. Same for by name. – Ian Boyd Aug 23 '18 at 18:13
  • To clarify consider this example, enumeration with these values: apple, banana, grapefruit, orange; when using the name we can either sort ASC or DESC; ASC: apple, banana, grapefruit, orange; DESC: orange, grapefruit, banana, apple; whereas if i choose to give them a dedicated integer-value (by constructor and member field of type int) like this: grapefruit(1), banana(2), orange(3), apple(4) i am free to choose which ordering I want to apply; inserting new values isn't regarded in my comment and this example – SebastianRiemer Aug 24 '18 at 06:44
  • @SebastianRiemer Yes, if you insert strings: you cannot sort them as integers. – Ian Boyd Aug 24 '18 at 19:18
  • In your example Suit theSuit = Suit.valueOf(reader["Suit"]); you probably meant Suit theSuit = Suit.valueOf(reader["Spade"]); or smth? – Andrey M. Stepanov Jan 16 '19 at 13:37
  • @AndreyM.Stepanov The database doesn't contain a column named `Spade` – Ian Boyd Jan 16 '19 at 15:33
  • I agree with this solution because it improves readability. I also agree that database space isn't an issue here. However, I am surprised nobody discussed the fact that filtering by an INT is quicker than filtering a varchar(50). This is for me the only weak spot of this solution. – Luis Gouveia Oct 21 '19 at 10:53
  • @LuisGouveia I'd be curious to see the database system that has anything more than theoretical performance issues filtering on `varchar(50)` as opposed to `int`. – Ian Boyd Oct 21 '19 at 14:05
  • @IanBoyd, don't take it personally. Like I told you, your solution is the best one. However, if you doubt my comment, you can see the following accepted answer: https://stackoverflow.com/questions/2346920/sql-select-speed-int-vs-varchar – Luis Gouveia Oct 21 '19 at 14:24
  • @LuisGouveia I wasn't doubting a *theoretical* difference. I was looking at a *practical* difference. In other words: it's a premature micro-optimization whose extra time is 'in the noise' as we say. See the other answer - the one with actual benchmarks (https://stackoverflow.com/a/39650204/12597). – Ian Boyd Oct 21 '19 at 14:52
  • @IanBoyd, I agree with you, the difference is small, but only negligeable if the size of the columns are similar, which is the case in the example you're showing me. However, a code can easily use a varchar(50) - 54 bytes, which is not comparable with a bigint - 8 bytes. I believe the time spent would double (or more) if in the example given we would use a 50 char string, but please tell me if you believe I'm wrong. – Luis Gouveia Oct 22 '19 at 14:23
  • 3
    @LuisGouveia I agree with you that the time could double. Causing a query that takes `12.37 ms` to instead take `12.3702 ms`. That's what i mean by *"in the noise"*. You run the query again and it takes `13.29 ms`, or `11.36 ms`. In other words, the randomness of the thread scheduler will drastically swamp any micro optimization you theoretically have that is in no way visible to anyone in any way ever. – Ian Boyd Oct 22 '19 at 14:31
  • @IanBoyd, I see your point. I thought the difference was small but not negligeable. Your point is clear: the difference is in fact negligeable. Thank you! I think the whole point of our discussion was resumed in this thread and its accepted answer: https://stackoverflow.com/questions/183201/should-a-developer-aim-for-readability-or-performance-first – Luis Gouveia Oct 22 '19 at 14:36
  • Anyone ever think of what a junior could do in this code without knowing? Changing the order of enums seems benign and wouldn't really be caught in code review. Changing an existing name of an ENUM that is known to be stored in the db is something that will be blatant in code review and will be addressed. Have to think more on the side of maintainability in this case, because the downside of getting the ENUM type wrong could be disastrous. Especially if the update is released, and new data (with new ordinals) is mixed with old data (with previous ordinals), good luck fixing that! – Zoidberg Dec 13 '19 at 17:09
  • I'd also like to add that ORMs typically won't even LOAD existing data that has a mismatched enum name and you'll know immediately should the name of an enum change when it goes to production, where as an ordinal change can hum along silently causing untold damage to your data. – Zoidberg Dec 13 '19 at 17:12
48

Unless you have specific performance reasons to avoid it, I would recommend using a separate table for the enumeration. Use foreign key integrity unless the extra lookup really kills you.

Suits table:

suit_id suit_name
1       Clubs
2       Hearts
3       Spades
4       Diamonds

Players table

player_name suit_id
Ian Boyd           4
Shelby Lake        2
  1. If you ever refactor your enumeration to be classes with behavior (such as priority), your database already models it correctly
  2. Your DBA is happy because your schema is normalized (storing a single integer per player, instead of an entire string, which may or may not have typos).
  3. Your database values (suit_id) are independent from your enumeration value, which helps you work on the data from other languages as well.
Ian Boyd
  • 246,734
  • 253
  • 869
  • 1,219
Tom
  • 10,689
  • 4
  • 41
  • 50
  • 17
    While I agree it is nice to have it normalized, and constrained in the DB, this does cause updates in two places to add a new value (code and db), which might cause more overhead. Also, spelling mistakes should be nonexistent if all updates are done programatically from the Enum name. – Jason Oct 13 '09 at 18:23
  • 3
    I agree with the comment above. An alternative enforcement mechanism at the database level would be to write a constraint trigger, which would reject inserts or updates that try to use an invalid value. – Steve Perkins Jun 21 '12 at 16:11
  • 1
    Why would I want to declare the same information in two places? Both in CODE `public enum foo {bar}` and `CREATE TABLE foo (name varchar);` that can easily get out of sync. – ebyrob Nov 11 '16 at 16:17
  • If we take the accepted answer at face value, that is that the enum names are only used for manual investigations, then this answer is indeed the best option. Also, if you go on changing enumeration order or values or names, you will always have much more problems than maintaining this extra table. Especially when you only need it (and may choose to create only temporarily) for debugging and support. – Tadas S Jun 01 '17 at 17:51
8

I have faced the same issue where my objective is to persist Enum String value into database instead of Ordinal value.

To over come this issue, I have used @Enumerated(EnumType.STRING) and my objective got resolved.

For Example, you have an Enum Class:

public enum FurthitMethod {

    Apple,
    Orange,
    Lemon
}

In the entity class, define @Enumerated(EnumType.STRING):

@Enumerated(EnumType.STRING)
@Column(name = "Fruits")
public FurthitMethod getFuritMethod() {
    return fruitMethod;
}

public void setFruitMethod(FurthitMethod authenticationMethod) {
    this.fruitMethod= fruitMethod;
}

While you try to set your value to Database, String value will be persisted into Database as "APPLE", "ORANGE" or "LEMON".

Unheilig
  • 16,196
  • 193
  • 68
  • 98
SaravanaC
  • 81
  • 1
  • 3
6

As you say, ordinal is a bit risky. Consider for example:

public enum Boolean {
    TRUE, FALSE
}

public class BooleanTest {
    @Test
    public void testEnum() {
        assertEquals(0, Boolean.TRUE.ordinal());
        assertEquals(1, Boolean.FALSE.ordinal());
    }
}

If you stored this as ordinals, you might have rows like:

> SELECT STATEMENT, TRUTH FROM CALL_MY_BLUFF

"Alice is a boy"      1
"Graham is a boy"     0

But what happens if you updated Boolean?

public enum Boolean {
    TRUE, FILE_NOT_FOUND, FALSE
}

This means all your lies will become misinterpreted as 'file-not-found'

Better to just use a string representation

toolkit
  • 49,809
  • 17
  • 109
  • 135
5

I would argue that the only safe mechanism here is to use the String name() value. When writing to the DB, you could use a sproc to insert the value and when reading, use a View. In this manner, if the enums change, there is a level of indirection in the sproc/view to be able to present the data as the enum value without "imposing" this on the DB.

oxbow_lakes
  • 133,303
  • 56
  • 317
  • 449
5

We just store the enum name itself. It's more readable.

We did mess around with adding an additional property to the enum where the enum has a limited set of values. For example, in the following enum, we use a char property to represent the enum value in the database (a char is more meaningful than a numeric value):

public enum EmailStatus {
    EMAIL_NEW('N'), EMAIL_SENT('S'), EMAIL_FAILED('F'), EMAIL_SKIPPED('K'), UNDEFINED('-');

    private char dbChar = '-';

    EmailStatus(char statusChar) {
        this.dbChar = statusChar;
    }

    public char statusChar() {
        return dbChar;
    }

    public static EmailStatus getFromStatusChar(char statusChar) {
        switch (statusChar) {
        case 'N':
            return EMAIL_NEW;
        case 'S':
            return EMAIL_SENT;
        case 'F':
            return EMAIL_FAILED;
        case 'K':
            return EMAIL_SKIPPED;
        default:
            return UNDEFINED;
        }
    }
}

And when you have a lot of values, you can have a Map inside your enum to keep that getFromXYZ method small.

Adil Hussain
  • 30,049
  • 21
  • 112
  • 147
JeeBee
  • 17,476
  • 5
  • 50
  • 60
  • If you don't want to maintain a switch statement and can ensure that dbChar is unique you could use something like: public static EmailStatus getFromStatusChar(char statusChar) { return Arrays.stream(EmailStatus.values()) .filter(e -> e.statusChar() == statusChar) .findFirst() .orElse(UNDEFINED); } – Kuchi Nov 24 '16 at 09:51
  • An additional and important benefit of adding a property like `dbChar` to the enum is that it allows the enum values to be renamed in the future without consequences. If the enum's `name()` and `valueOf()` methods are used to save and read the enum values to/from the database, then renaming the enum values in the future is not possible without rework. – Adil Hussain Aug 05 '22 at 17:52
4

For a large database, I am reluctant to lose the size and speed advantages of the numeric representation. I often end up with a database table representing the Enum.

You can enforce database consistency by declaring a foreign key -- although in some cases it might be better to not declare that as a foreign key constraint, which imposes a cost on every transaction. You can ensure consistency by periodically doing a check, at times of your choosing, with:

SELECT reftable.* FROM reftable
  LEFT JOIN enumtable ON reftable.enum_ref_id = enumtable.enum_id
WHERE enumtable.enum_id IS NULL;

The other half of this solution is to write some test code that checks that the Java enum and the database enum table have the same contents. That's left as an exercise for the reader.

naXa stands with Ukraine
  • 35,493
  • 19
  • 190
  • 259
Roger Durham
  • 103
  • 6
  • 3
    Say the average enumeration name length is 7 characters. Your `enumID` is four bytes, so you have an extra three bytes per row by using names. 3 bytes x 1 million rows is 3MB. – Ian Boyd Dec 01 '11 at 15:06
  • @IanBoyd: But an `enumId` surely fits in two bytes (longer enums are not possible in Java) and most of them fit in a single byte (which some DB support). The saved space is negligible, but the faster comparison and the fixed length should help. – maaartinus Jan 24 '14 at 13:23
2

If saving enums as strings in the database, you can create utility methods to (de)serialize any enum:

   public static String getSerializedForm(Enum<?> enumVal) {
        String name = enumVal.name();
        // possibly quote value?
        return name;
    }

    public static <E extends Enum<E>> E deserialize(Class<E> enumType, String dbVal) {
        // possibly handle unknown values, below throws IllegalArgEx
        return Enum.valueOf(enumType, dbVal.trim());
    }

    // Sample use:
    String dbVal = getSerializedForm(Suit.SPADE);
    // save dbVal to db in larger insert/update ...
    Suit suit = deserialize(Suit.class, dbVal);
Dov Wasserman
  • 2,632
  • 17
  • 14
  • Nice to use this with a default enum value to fall back on in deserialize. For example, catch the IllegalArgEx and return Suit.None. – Jason Oct 13 '09 at 18:21
2

All my experience tells me that safest way of persisting enums anywhere is to use an additional code value or id (some kind of evolution of JeeBee's answer). This could be a nice example of an idea:

enum Race {
    HUMAN ("human"),
    ELF ("elf"),
    DWARF ("dwarf");

    private final String code;

    private Race(String code) {
        this.code = code;
    }

    public String getCode() {
        return code;
    }
}

Now you can go with any persistence referencing your enum constants by its code. Even if you decide to change some of the constant names, you always can save the code value (e.g. DWARF("dwarf") to GNOME("dwarf")).

Ok, dive some more deeper with this conception. Here is some utility method, that helps you find any enum value, but first lets extend our approach.

interface CodeValue {
    String getCode();
}

And let our enum implement it:

enum Race implement CodeValue {...}

This is the time for magic search method:

static <T extends Enum & CodeValue> T resolveByCode(Class<T> enumClass, String code) {
    T[] enumConstants = enumClass.getEnumConstants();
    for (T entry : enumConstants) {
        if (entry.getCode().equals(code)) return entry;
    }
    // In case we failed to find it, return null.
    // I'd recommend you make some log record here to get notified about wrong logic, perhaps.
    return null;
}

And use it like a charm: Race race = resolveByCode(Race.class, "elf")

Adil Hussain
  • 30,049
  • 21
  • 112
  • 147
Metaphore
  • 749
  • 1
  • 7
  • 15
1

Multiple values with OR relation for one, enum field. The concept for .NET with storing enum types in database like a byte or an int and using FlagsAttribute in your code.

http://blogs.msdn.com/b/efdesign/archive/2011/06/29/enumeration-support-in-entity-framework.aspx

Kryszal
  • 1,663
  • 14
  • 20
0

You can use an extra value in the enum constant that can survive both name changes and resorting of the enums:

public enum MyEnum {
    MyFirstValue(10),
    MyFirstAndAHalfValue(15),
    MySecondValue(20);

    public int getId() {
        return id;
    }
    public static MyEnum of(int id) {
        for (MyEnum e : values()) {
            if (id == e.id) {
                return e;
            }
        }
        return null;
    }
    MyEnum(int id) {
        this.id = id;
    }
    private final int id;
}

To get the id from the enum:

int id = MyFirstValue.getId();

To get the enum from an id:

MyEnum e = MyEnum.of(id);

I suggest using values with no meaning to avoid confusion if the enum names have to be changed.

In the above example, I've used some variant of "Basic row numbering" leaving spaces so the numbers will likely stay in the same order as the enums.

This version is faster than using a secondary table, but it makes the system more dependent on code and source code knowledge.

To remedy that, you can set up a table with the enum ids in the database as well. Or go the other way and pick ids for the enums from a table as you add rows to it.

Sidenote: Always verify that you are not designing something that should be stored in a database table and maintained as a regular object though. If you can imagine that you have to add new constants to the enum at this point, when you are setting it up, that's an indication you may be better off creating a regular object and a table instead.

Erk
  • 1,159
  • 15
  • 9