
My main issue is that after we expanded CHAR(50) to CHAR(64) we started receiving timeouts on internal backup queries. The record size is a few KB and the database is very large, so this column, which is a primary key, must be the reason for our trouble.

I searched the internet, but I only found advice on selecting the type of the keys, or comparisons of CHAR vs VARCHAR, and nothing about the optimal size. For example, is there some special optimization in MySQL so that for indexes smaller than, let's say, 60 bytes it uses some form of caching, while for larger ones it starts swapping stuff?

Any help would be appreciated, even answers suggesting there is no difference and that the percentage of time spent on the join simply increased in proportion to the index size.

EDIT

THIS IS NOT THE ANSWER TO THE QUESTION AS ASKED; however, I have found out why our change caused a HUGE performance hit.

We expanded the column using

ALTER TABLE table MODIFY job_id CHAR(64);

This caused the CHARACTER SET to fall back to the default (utf8mb4), dropping the previous latin1.

That would conclude my research, but I will leave this question open for anyone who can explain the impact of resizing a key column.

This question is still open to suggestions on the type change.

Thank you all for your time and input!

Grzegorz
  • check this answer https://stackoverflow.com/questions/4642695/how-to-choose-between-uuids-autoincrement-sequence-keys-and-sequence-tables-for – Grumpy Feb 03 '21 at 17:11
  • I like INT for a PK – Strawberry Feb 03 '21 at 17:13
  • The data type should not matter. If you still use CHAR(50) but the number of rows grows by 28%, then the total size of the table will be similar, right? The real problem is that your backups are timing out. Are you using phpmyadmin to backup a "very very big" database? You might want to use a more professional tool intended to be used at larger scale. – Bill Karwin Feb 03 '21 at 17:17
  • @Grumpy That's not the case. We do have auto key, too, but this specific column key is used for various jobs and it has to be CHAR – Grzegorz Feb 03 '21 at 17:46
  • @Strawberry me too. In this case this id is not numerical. – Grzegorz Feb 03 '21 at 17:46
  • @BillKarwin It is an internal backup. – Grzegorz Feb 03 '21 at 17:47
    I don't know what you mean by an internal backup. Can you describe what you did with more detail? What client are you using? Is this some SQL statement, or some backup software tool? Did it give an error message? If so, what was the message? In my experience, backup software doesn't time out. – Bill Karwin Feb 03 '21 at 18:22
    Also can you be more specific instead of just saying "very very big?" Use some numbers and units. Typically I would see a database size measured in units of gigabytes. – Bill Karwin Feb 03 '21 at 18:24
  • Relational ids exist purely for the purposes of relating data. They need have no meaning beyond the internal scope of the database, so it's simply false to imply that a given PK cannot be numerical. – Strawberry Feb 03 '21 at 19:40
  • @Strawberry I did not claim that. I have said that OUR database does not have THIS column numeric. We suspect that expanding this column from 50 to 64 impacted performance, and we are looking for confirmation and possibly a fix (decreasing the size by 4 bytes, for example). Otherwise we will have to go through a vast set of places making sure the timeout is increased. Not good. – Grzegorz Feb 03 '21 at 22:54
  • @BillKarwin There is a deletion on the join of two tables averaging 4 and 2 million rows in a rolling window of two weeks. The join is a left join, so nothing to improve here. – Grzegorz Feb 03 '21 at 23:02
  • Huge `DELETEs` are problematic for a lot of reasons. See if this helps any: http://mysql.rjweb.org/doc.php/deletebig – Rick James Feb 03 '21 at 23:23

1 Answer


Short answer: There is no caching/swapping/optimal/etc. size.

Long answer:

Don't use CHAR unless the data for that column really is fixed length -- such as country_code, postal_code, UUID, SSN, etc. Furthermore, use the minimal charset needed, such as ascii for those. CHAR wastes space by padding with spaces.
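As a sketch (the table and column names here are hypothetical, not from the question), fixed-length columns with a minimal charset might look like:

```sql
-- Hypothetical example: fixed-length codes stored in a 1-byte-per-character charset.
-- ascii keeps CHAR(n) at exactly n bytes; utf8mb4 would reserve 4*n bytes.
CREATE TABLE countries (
    country_code CHAR(2)  CHARACTER SET ascii NOT NULL,  -- e.g. 'PL'
    uuid         CHAR(36) CHARACTER SET ascii NOT NULL,  -- dashed UUID in text form
    PRIMARY KEY (country_code)
);
```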

There is no such cutoff. There is no inherent problem in having a long PRIMARY KEY except that ...

  • Every secondary key has a copy of the PK. (This says the break-even is at 2 indexes; for more than 2, the extra bulk in secondary keys adds up for long PKs.)
  • Columns in other tables that need to JOIN to this PK (with or without a FOREIGN KEY declared) will be bulkier than an INT.

Many users (or 3rd party software that generates SQL) blindly use 8-byte BIGINT for ids. Even the 4-byte INT is usually overkill; see the smaller INT types.

Index lengths are limited, and several things factor in:

  • 767 / 1000 / 3072 bytes depending on engine and version
  • Character set of char/varchar: CHAR/VARCHAR(50) may take 50 / 100 / 150 / 200 bytes, depending on charset.
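For example, the worst-case key bytes for the question's column can be worked out from the charset's bytes-per-character (a sketch; the table name `table` is taken from the question):

```sql
-- CHAR(50) in latin1  : 50 chars * 1 byte = 50 bytes per key entry
-- CHAR(64) in utf8mb4 : 64 chars * 4 bytes = 256 bytes reserved per key entry
-- That approaches the older 767-byte index prefix limit much sooner, and it
-- bloats every secondary index, since each one carries a copy of the PK.
SELECT CHARACTER_MAXIMUM_LENGTH, CHARACTER_OCTET_LENGTH
FROM information_schema.COLUMNS
WHERE TABLE_NAME = 'table' AND COLUMN_NAME = 'job_id';
```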

InnoDB's buffer pool (capped by innodb_buffer_pool_size, which should typically be set to about 70% of RAM) does provide caching. This implies that the bigger a table or index is, the more I/O is likely to be done.

Bottom line: Your timeouts are coming from other things. Consider increasing the timeout.

Also

When doing ALTER TABLE ... MODIFY COLUMN ..., you must specify all the characteristics of the column, specifically including the ones you are not changing. These include CHARACTER SET, COLLATION, [NOT] NULL, DEFAULT, etc.

I like to do SHOW CREATE TABLE to get the current definition for the columns, then copy the one I want to change into my fresh ALTER, modifying the one thing I am changing.
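Applied to the question's column, that workflow might look like this (the latin1 / NOT NULL / DEFAULT '' attributes are taken from the comments below; adjust to whatever SHOW CREATE TABLE actually reports):

```sql
-- 1. Get the current, complete column definition:
SHOW CREATE TABLE `table`;
-- ... `job_id` char(50) CHARACTER SET latin1 NOT NULL DEFAULT '' ...

-- 2. Repeat every attribute, changing only the length:
ALTER TABLE `table`
    MODIFY job_id CHAR(64) CHARACTER SET latin1 NOT NULL DEFAULT '';
```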

Rick James
  • There is something "sinister" going on. I looked through the logs and we went from almost 0 seconds to 45 minutes! I've seen that earlier we did ALTER TABLE `table` CHANGE COLUMN `job_id` `job_id` CHAR(50) CHARACTER SET 'latin1' NOT NULL DEFAULT ''. The next one was ALTER TABLE `table` MODIFY job_id CHAR(64); It seems that that ALTER TABLE removed the CHARACTER SET! Now CHARACTER_SET_NAME is utf8mb4 and COLLATION_NAME is utf8mb4_general_ci, and the COLUMN_DEFAULT changed as well, while the column is set not to be nullable! – Grzegorz Feb 04 '21 at 00:34
  • Thanks Rick. The problem is as described above and in my EDIT; however, your response is exactly what I was looking for! Thank you again. – Grzegorz Feb 04 '21 at 16:03
  • @Grzegorz - I added to my Answer to address the `ALTER` issue. – Rick James Feb 04 '21 at 19:29
  • This SHOW CREATE TABLE is an excellent idea! Thank you! – Grzegorz Feb 05 '21 at 20:58