2

I have an InnoDB table that may potentially get up to 5 million rows inserted per day. This table is required to have data from the last 12 months, so two or four times a year data older than 12 months is deleted.

Having this in mind, I can say that the table should never have more than 2 billion rows. That is why I set the primary key to be INT.

I knew back then that even when old data is being deleted, the primary key will still grow and grow reaching its limits after 3 or 4 years of data. However, I decided to use INT anyway, as using BIGINT would mean 4 bytes of extra data on each row, which is significant on a table with 2 billion rows (besides, the primary index is used as foreign key in other tables so these other tables would need to be changed to BIGINT too and increase their storage needs also). I also thought (most probably my biggest mistake) I will think of something later.

Well, that time has come, and I am thinking of doing a reordering of the primary key as here:

Reorder / reset auto increment primary key

or here

How to reorder a primary key?

But this is considered a 'bad practice' (plus it would take forever). The good practice is to change from INT to BIGINT (this would take forever too... but just once).

My question is, if I do this instead of reordering, will the INSERT or UPDATE or SELECT queries take longer as if having INT as primary key?

If it affects queries duration then it is really not an option to me, and the bad practice is the option... unless someone has a better solution?

Thanks!

Community
  • 1
  • 1
Roberto
  • 21
  • 3
  • possible duplicate of [BIGINT mysql performance compared to INT](http://stackoverflow.com/questions/9376610/bigint-mysql-performance-compared-to-int) – Simon at The Access Group Jul 07 '14 at 11:59
  • Have you used up all your negative numbers? Could you restart your autoincrement at -1 with a step of -1? – Michael Green Jul 07 '14 at 12:00
  • @MichaelGreen surely that's just putting the problem off for a bit longer + may result in breakages elsewhere if an ID >0 is expected – Simon at The Access Group Jul 07 '14 at 12:01
  • Yes, it is. But "a bit longer" is 3 to 4 years for this situation. Plus you're looking at getting the system up and running in, what, half an hour rather than, potentially, several days of down time for a 2 billion row table (and indexes) to rebuild. Is the application any more likely to barf on negative results than on 8 byte results? I don't know the answer to these or any number of other questions. But this has gotten some of my systems back on their feet quick smart. I throw it out there as an option for the OP. – Michael Green Jul 07 '14 at 12:09
  • Really the performance would degrade if you are having too many updates, delete in your system if they are being searched by Non clustered indexes. Just ensure your drop all the Indexes and update the table and then rebuild the Indexes. – lloydom Jul 07 '14 at 12:14
  • I don't see that you have any choice but to do it the right way and eat whatever performance penalty may occur. I think you're overestimating that. In any case surely it's better to pay a small online performance cost rather than have the database down for days restructuring it? However you could buy yourself more time by just making the int unsigned, to allow 4 billion values. – user207421 Jul 07 '14 at 12:31
  • They were defined as unsigned from the beginning. – Roberto Jul 07 '14 at 13:04
  • Actually I want to do all this while the system is running... Hopefully without messing up something. – Roberto Jul 07 '14 at 13:06

1 Answers1

1

The extra four bytes for the bigint may allow fewer rows per page which will slow down your DML. Worst case you go from two to one row per page; roughly double the query time. Best case there is enough unused space in each page already which the BIGINT expands into; zero effect.

Michael Green
  • 1,397
  • 1
  • 17
  • 25
  • How to know in which case I am? I am editing now this message to put more details about my database... – Roberto Jul 07 '14 at 13:34
  • How to know in which case I am? In case it is useful, my table has the primary key column (INT), 4 FLOAT columns, a TINYINT column, and two MEDIUMINT columns. Additionally, the MEDIUMINT columns are actually the primary keys of other two tables (foreign keys) and I have defined an index using both foreign keys together. Can you point me out some tutorial to understand how 'pages' are formed? I must say I am no computer expert. – Roberto Jul 07 '14 at 13:46