1

I'm working with SQL 5.7 and am trying to partition a table to improve performance over time. I'm new to partitioning and have come across this .

My table stores millions of rows and is used for reporting, so it quite query intensive, in particular, on my minute_rounded_timestamp column. In addition, almost every column has an index.

This is how I'm trying to do a partition:

CREATE TABLE `tlp_payout_partition_report_minute` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `minute_rounded_timestamp` int(11) NOT NULL,
    `application_id` bigint(20) NOT NULL,
    `commission` float NOT NULL DEFAULT '0',
    `seller_code` int(11) DEFAULT NULL,
    `tlp_aff_id` varchar(255) DEFAULT NULL,
    `sub_aff_id` varchar(255) DEFAULT NULL,
    `application_processing_duration_secs` int(11) DEFAULT NULL,
    `bank_hash` varchar(20) DEFAULT NULL,
    `application_result` varchar(20) DEFAULT NULL,
    `user_id` int(11) NOT NULL,
    `created` datetime DEFAULT NULL,
  `modified` datetime DEFAULT NULL,
    PRIMARY KEY (`id`) USING BTREE,
    KEY `minute_rounded_timestamp` (`minute_rounded_timestamp`) USING BTREE,
    KEY `application_id` (`application_id`) USING BTREE,
    KEY `user_id` (`user_id`) USING BTREE,
    KEY `seller_code` (`seller_code`) USING BTREE,
    KEY `tlp_aff_id` (`tlp_aff_id`) USING BTREE,
    KEY `sub_aff_id` (`sub_aff_id`) USING BTREE,
    KEY `application_result` (`application_result`) USING BTREE,
    KEY `created` (`created`),
  KEY `modified` (`modified`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
PARTITION BY RANGE( TO_DAYS(id, created) ) (
  PARTITION p20110401 VALUES LESS THAN ( TO_DAYS('2011-04-02') ),
  PARTITION p20110402 VALUES LESS THAN ( TO_DAYS('2011-04-03') ),
  PARTITION p20110403 VALUES LESS THAN ( TO_DAYS('2011-04-04') ),
  PARTITION p20110404 VALUES LESS THAN ( TO_DAYS('2011-04-05') ),
  PARTITION future    VALUES LESS THAN MAXVALUE
);

Which gives me the following error:

A PRIMARY KEY must include all columns in the table's partitioning function

So then, if I try to remove the PRIMARY KEY ( id ) USING BTREE row, I get:

Incorrect table definition; there can be only one auto column and it must be defined as a key

With (id, created) I get:

Incorrect parameter count in the call to native function 'TO_DAYS'

What am I missing?

Ryan H
  • 2,620
  • 4
  • 37
  • 109
  • Why are you still running MySQL 5.7 in 2022? – Dai Oct 07 '22 at 11:00
  • 1
    Removing the `PRIMARY KEY ( ID ) USING BTREE` line will **remove the primary-key entirely** - which you **do not** want to do.... _yikes_... – Dai Oct 07 '22 at 11:04
  • 1
    It wants you to change the `PRIMARY KEY` from `( id )` to `( id, created )` or `( created, id )`. – Dai Oct 07 '22 at 11:04
  • I noticed that your `CREATE TABLE` statement doesn't show the `created` column... – Dai Oct 07 '22 at 11:05
  • @Dai, I've added the `created` & `modified` now, along with your suggestion, this gives me the following error: `Incorrect parameter count in the call to native function 'TO_DAYS'`, I'll update the Stack overflow – Ryan H Oct 07 '22 at 11:10
  • 1
    You did not follow my instructions: I didn't say add `id` to `TO_DAYS`: I said to add it to your `PRIMARY KEY`. – Dai Oct 07 '22 at 11:12
  • 1
    As @Dai said: ```PRIMARY KEY (`id`,`created`) USING BTREE``` and ``` PARTITION BY RANGE( TO_DAYS(created) ) ```. You use ```created``` in your partitioning function, so it needs to be one of the columns that form the primary key. You don't need to use all PK columns in the partitioning functions, but all columns used must also be found in the PK – Hartmut Holzgraefe Oct 07 '22 at 11:18
  • 1
    @Dai, MySQL 5.7 is still valid in 2022. It isn't end-of-life until October 2023. – Bill Karwin Oct 07 '22 at 11:24
  • @BillKarwin That doesn't mean 5.7 users can keep their heads buried in the sand until Q42023. – Dai Oct 07 '22 at 11:25
  • Agreed, but nor should you mock people for using a supported product. – Bill Karwin Oct 07 '22 at 11:26
  • @Dai - The auto_inc column (id) must be the _first_ column in some index. That is, `PK(created, id)` won't suffice. – Rick James Oct 07 '22 at 18:14
  • @RyanH - Will you be using `DROP PARTITION` to purge "old" data? If not, what benefit are you hoping to get from Partitioning? See [_Partition_](http://mysql.rjweb.org/doc.php/partitionmaint) – Rick James Oct 07 '22 at 18:16

1 Answers1

2

TO_DAYS() takes only one argument, not two. So the following is an error:

PARTITION BY RANGE( TO_DAYS(id, created) )

Partitioning in MySQL is tricky, because of the rule described in this manual page: https://dev.mysql.com/doc/refman/5.7/en/partitioning-limitations-partitioning-keys-unique-keys.html

All columns used in the partitioning expression for a partitioned table must be part of every unique key that the table may have.

In other words, every unique key on the table must use every column in the table's partitioning expression. (This also includes the table's primary key, since it is by definition a unique key. This particular case is discussed later in this section.)

Read the page I linked to for more explanation and examples.

Honestly, this limitation prevents many sites from using partitioning on datetime columns. Either they need to make the datetime column their primary key, or else make a table with no primary key. Both of these choices are bad.

For what it's worth, partitioning isn't a magic "go fast" feature. It only helps improve performance for certain queries, if and only if the query has expressions that take advantage of partition pruning.

Using indexes is usually a much more flexible and effective way to optimize tables.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • This makes sense now. What's the purpose of the `PARTITION p20110401 VALUES LESS THAN ( TO_DAYS('2011-04-02') )` rows, and does this need to always be in the past on some random date time? And how many do I need? I don't quite get these. – Ryan H Oct 07 '22 at 13:04
  • The post you linked to (https://stackoverflow.com/questions/6093585/how-to-partition-a-table-by-datetime-column) was written in 2011. The folks who were writing the examples back then used dates that were current at the time they wrote them, not in the past. – Bill Karwin Oct 07 '22 at 15:25
  • There's no universal rule about how many partitions you need. It depends on how big you want each partition to be. In the example you linked to, they wanted one day's worth of data per partition. – Bill Karwin Oct 07 '22 at 15:27