1

I discovered something strange today when I was trying to select sum from an ENUM field. It was adding 1 to the values! A sample table I used is shown below:

x     y
_______
3  |  2
0  |  1

Both x and y are ENUM ('0','1','2','3')

My query is as follows:

select sum(x), sum(y), sum(x+y) from myfield

And the result is:

5    5   10

This is very strange. Does it mean that is this behaviour consistent? Can I use something like:

select sum(x - 1), sum(y - 1), sum((x-1)+(y-1)) from myfield

which will produce the correct results or is this behavior only peculiar to my db server in particular?

Thanks.

EDIT: For those wondering why I'm using ENUM, it's because the actual field I'm using contains 'AR' which can't fit in into tinyint.

rtuner
  • 2,362
  • 3
  • 25
  • 37
  • 2
    Why are you trying to sum ENUMs ? It doesn't make sense from a semantic point of view. Also, I don't see the point in defining an ENUM which values are string representations of integer numbers... just to sum them up as numbers! That's twisted IMHO. :) – Marcello Romani Jun 27 '13 at 21:30
  • 1
    Answers here http://stackoverflow.com/questions/259547/how-does-mysql-store-enums and here http://dev.mysql.com/doc/refman/5.0/en/enum.html – Marcello Romani Jun 27 '13 at 21:32

2 Answers2

3

As an ENUM field is really just an INT UNSIGNED, it will not work as you expect, if you use integer values for the ENUMs. For example, your ENUM of '0', is stored internally as a numeric 1, and your '1', is stored as a numeric 2. Surprisingly, the empty string '' is stored internally as a 0. Here's an example of this not working as expected:

mysql> CREATE TABLE enumtest (
    -> id int unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,
    -> a ENUM ('0','1','2','3') NOT NULL DEFAULT '0',
    -> i int unsigned NOT NULL DEFAULT 0
    -> );
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO enumtest SET a = 0, i=0;
Query OK, 1 row affected, 1 warning (0.01 sec)

mysql> SHOW WARNINGS;
+---------+------+----------------------------------------+
| Level   | Code | Message                                |
+---------+------+----------------------------------------+
| Warning | 1265 | Data truncated for column 'a' at row 1 |
+---------+------+----------------------------------------+
1 row in set (0.00 sec)

mysql> INSERT INTO enumtest SET a = '0', i=0;
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO enumtest SET a = 1, i=1;
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO enumtest SET a = '1', i=1;
Query OK, 1 row affected (0.00 sec)

mysql> SELECT a,0+a,i FROM enumtest;                   
+---+-----+---+
| a | 0+a | i |
+---+-----+---+
|   |   0 | 0 |
| 0 |   1 | 0 |
| 0 |   1 | 1 |
| 1 |   2 | 1 |
+---+-----+---+
4 rows in set (0.00 sec)

mysql> SELECT SUM(a),SUM(0+a), SUM(i) FROM enumtest;    
+--------+----------+--------+
| SUM(a) | SUM(0+a) | SUM(i) |
+--------+----------+--------+
|      4 |        4 |      2 |
+--------+----------+--------+
1 row in set (0.00 sec)

The clause 0+a coerces the ENUM to its underlying UNSIGNED value. This is equivalent to CAST(a AS UNSIGNED).

Ross Smith II
  • 11,799
  • 1
  • 38
  • 43
2

This is because doing SUM() on an enum column doesn't do what you think it does. The data stored for an enum column type is the index into the enum, that index starts at 1, and it's this index that mysql will SUM()

Your table looks like this when displayed:

x     y
_______
3  |  2
0  |  1

This is showing your enum values - you just happened to define your enum values to be numbers too. You could have defined them to be e.g. ENUM ('blue','red','green','yellow') And it would have looked like:

x         y
_______________
yellow  |  green
blue    |  red

However, this is for display only. What is actually stored in the rows for this table is the index in the enum.

The elements listed in the column specification are assigned index numbers, beginning with 1.

So those indexes starts at 1. it's these underlying data that SUM() and other aggregate functions works at for an ENUM column. There's no implicit conversion to the enum values, which you defined as numbers too.

i.e. the data stored is these indexes:

x     y
_______
4  |  3
1  |  2

And while it doesn't really make sense to SUM enums, it is those indexes that mysql would aggregate when using SUM()

The docs is a must read.

nos
  • 223,662
  • 58
  • 417
  • 506
  • Well, I am using PhpMyAdmin which suggested I should optimize my DB in that manner. So from your reply, I guess it's safe to subtract by 1, right? – rtuner Jun 27 '13 at 21:48
  • @rtuner In your particular case, you could. But using SUM() here will cause confusion for everyone, probably including yourself, that will look at this code in the future. If you want to store numbers, use an int/tinyint or similar column. Don't use an enum. – nos Jun 28 '13 at 13:49
  • I've been seriously thinking about it, it's very difficult to decide what to do since the fields that generate the information can contain non-numeric values as well. Best choice will probably be to use -3,-2, and -1 for the non-numeric values. – rtuner Jul 01 '13 at 14:47
  • "that index starts at 1" is not correct. This index starts at 0 for empty string. – Ross Smith II Jan 17 '14 at 17:33
  • @RossSmithII The index of the enums you specify starts at 1. You're right that there's an index 0 too, the empty string. – nos Jan 17 '14 at 17:41
  • Leaving this misstatement in your answer may confuse people. Care to update it? – Ross Smith II Mar 24 '14 at 16:24