10

Not a long ago I started to optimize queries for mysql db. I created some indexes and decided to see their params, using show index from syntax. And saw that cardinality doesn't equal distinct count for some columns.

So, I have the table

CREATE TABLE `item_owners` (
  `uid` varchar(255) NOT NULL,
  `version` bigint(20) NOT NULL,
  `type_id` varchar(255) NOT NULL,
  `owner_name` varchar(255) NOT NULL,
  `item_id` varchar(255) NOT NULL,
  `status_id` varchar(255) NOT NULL,
  PRIMARY KEY (`uid`),
  KEY `FK181EADBC7346EE24` (`status_id`),
  KEY `FK181EADBC90094D43` (`type_id`),
  KEY `FK181EADBC499E38CA` (`item_id`),
  CONSTRAINT `FK181EADBC499E38CA` FOREIGN KEY (`item_id`) REFERENCES `items` (`uid`),
  CONSTRAINT `FK181EADBC7346EE24` FOREIGN KEY (`status_id`) REFERENCES `statuses` (`uid`),
  CONSTRAINT `FK181EADBC90094D43` FOREIGN KEY (`type_id`) REFERENCES `types` (`uid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

And in that table I have 2M rows(and indeed 2M uniq PK), 500k uniq item_id, 5 uniq status_id and 20 uniq type_id

But when I execute

show index from item_owners

I've got

+-------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table       | Non_unique | Key_name           | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| item_owners |          0 | PRIMARY            |            1 | uid         | A         |     1893427 |     NULL | NULL   |      | BTREE      |         |               |
| item_owners |          1 | FK181EADBC7346EE24 |            1 | status_id   | A         |          17 |     NULL | NULL   |      | BTREE      |         |               |
| item_owners |          1 | FK181EADBC90094D43 |            1 | type_id     | A         |          17 |     NULL | NULL   |      | BTREE      |         |               |
| item_owners |          1 | FK181EADBC499E38CA |            1 | item_id     | A         |      148000 |     NULL | NULL   |      | BTREE      |         |               |
+-------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Then I decided to execute

analyze table item_owners

And after this output was:

+-------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table       | Non_unique | Key_name           | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| item_owners |          0 | PRIMARY            |            1 | uid         | A         |     2005419 |     NULL | NULL   |      | BTREE      |         |               |
| item_owners |          1 | FK181EADBC7346EE24 |            1 | status_id   | A         |          17 |     NULL | NULL   |      | BTREE      |         |               |
| item_owners |          1 | FK181EADBC90094D43 |            1 | type_id     | A         |          17 |     NULL | NULL   |      | BTREE      |         |               |
| item_owners |          1 | FK181EADBC499E38CA |            1 | item_id     | A         |     2005419 |     NULL | NULL   |      | BTREE      |         |               |
+-------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Anyone know how it possible? , because I really misunderstand why cardinality doesn't equal number of uniq rows even for PK index.

Andrej Soroj
  • 1,103
  • 1
  • 9
  • 10

1 Answers1

4

Quoting the manual:

Cardinality

An estimate of the number of unique values in the index. This is updated by running ANALYZE TABLE or myisamchk -a. Cardinality is counted based on statistics stored as integers, so the value is not necessarily exact even for small tables.

RandomSeed
  • 29,301
  • 6
  • 52
  • 87
  • Thanks for your answer. But it is not very convenient to run ANALYZE every time when index has wrong cardinality. – Andrej Soroj May 24 '13 at 15:32
  • 1
    What is the purpose of getting an exact cardinality? With regards to query optimisation, it allows the optimizer to correctly decide whether or not it is worth using the index (in comparison with a table scan). A slight error should not have big consequences because in case of doubt, the two approaches are likely to take the same amount of time. If you need the exact number (eg. in your application), just issue a query like `SELECT COUNT(*)`. – RandomSeed May 24 '13 at 15:42
  • 4
    "Cardinality is counted based on statistics stored as integers" not sure how that logically follows to the value not being necessarily exact even for small tables. – K2xL Feb 23 '17 at 16:01