8

Possible Duplicate:
Fragmentation of id's (auto_increment column) in mysql

I have this column in my database. Let's say its name is 'threadid'. It contains unique ids given to each thread for distinction.

threadid 9 8 7 6 5 4 3 2 1

Let's say I have deleted threads with id 5 and 6.

threadid 9 8 7 4 3 2 1

But when there is a submission after the deletion, the unique id given to that thread is 10. not 5. I think this is not neat.

How do I get the least possible value in the column? (in this case, 5.)

I think I can get the minimum value of the column using MIN() and keep +1 until I get the unused, unique value, but I think that's too complicated.

Is there any simple way to do this?

Thanks.

Community
  • 1
  • 1
Visualizer7
  • 325
  • 1
  • 5
  • 13
  • 2
    I know it doesn't really matter, but curious why you're wanting this? Just for keeping the numbers clean, or for something else? – jprofitt Mar 16 '12 at 19:08
  • @jprofitt I agree; in a `bigint(20)` column, having gaps in the ID values doesn't really matter (: –  Mar 16 '12 at 19:10
  • 6
    This question is popping few times a week. Bottom line is this: you don't want to "reuse" lost numbers. There are many reasons, it doesn't fit the comment to just list them all. If you want "pretty" ordered numbers, then don't use auto_increment for that. auto_increment has ONE and only ONE purpose - to *uniquely* identify a row. That's it. There's no "it's pretty" involved in there. If there must be a nice sequential numbering involved, create another column and maintain it via triggers. – N.B. Mar 16 '12 at 19:11
  • Related: http://stackoverflow.com/q/2106237/212218 –  Mar 16 '12 at 19:11
  • @ jprofitt @Phoenix Yeah I know. If I use bigint, me and my users will have to live eternally to reach the maximum value for the column.. I want to do this because it is cleaner and neater. Especially I want to convey messages to forum vandalisers that their flooding / spamming does not effect the database. – Visualizer7 Mar 16 '12 at 19:15
  • 6
    I disagree with who ever downvote this, just because an implementation is not the norm doesn't mean it has no usage, and in some cases, it must be done. I'm going to up vote this. – Churk Mar 16 '12 at 19:22
  • @Churk there are cases. But definitely not hte case of forum threads and not for the reason of just "neatness". Go figure. – Your Common Sense Mar 16 '12 at 19:35
  • 1
    @all who disagree. Like I stated, its a requirement for the implementation, therefore you must come up with a solution for it. And if you can come up with a different one then be my guess. This is suppose to be a question answer forum, you suppose to present answer to a question, not my option or what should be done. There is programmer.stackexchange.com for discussion of best practices. – Churk Mar 16 '12 at 19:46

6 Answers6

23

The maximum number of INT in MySQL 4294967295 if you are creating 1000 threads per minute you would need 1440000 ( 1000 x 60 x 24 ) id's per day. So you would run out of ids after about 8.17 years.

The maximum number of BIGINT is 18446744073709551615 and that would be enough to create 1000000000 (1 billion) ids per minute for 35 096 years.

So you should be fine with just wasting ids as much as you like and not worry about them.

Wolfgang
  • 4,865
  • 2
  • 29
  • 29
  • 1
    This sounds right. I guess I don't have to worry. I didn't see your answer. So I've done my own math and the result is the same. I don't have to worry. Let there be gaps. – Visualizer7 Mar 16 '12 at 19:50
  • Clear explanation, thanks. – Mahesh.D Jan 29 '15 at 06:51
  • @Wolfgang, what about an application where the user 'scrolls' through images which are referenced in the DB? If there are gaps does the cursor make failed attempts over the gaps till it encounters a valid id? – Vass Mar 28 '22 at 15:38
9

I agree with the rest of everyone where it is a very bad idea to implement your own find minimal open number. But at where I work, we are given a closed set of number and when a number is free up, we must reuse.

Here is how we did it.

We do not delete the row, but set all values of every column null, So what you do is SELECT min(id) WHERE columnA IS NULL, and if this returns something, we reuse, otherwise, insert a new row.

Churk
  • 4,556
  • 5
  • 22
  • 37
  • 1
    This is similar to a pattern known as Soft Delete. –  Mar 16 '12 at 19:23
  • @Phoenix, I am unfamiliar with this pattern, I am going to google is, but if you have any reference, greatly appreciated. – Churk Mar 16 '12 at 19:24
  • Unfortunately [there is no Wikipedia article for this topic](http://en.wikipedia.org/wiki/Wikipedia:Soft_deletion), but when I Googled it, I came across some informative articles. –  Mar 16 '12 at 19:25
  • Thanks churk I'll give it a try. And I marked your answer as accepted. – Visualizer7 Mar 16 '12 at 19:26
  • 6
    I see a lot of potential for either deadlocking (multiple concurrent connections trying to do an "insert") or just plain performance overhead by first having to check if there is an available empty row first, then insert later. You have just increased the complexity of your application significantly, for what, sequential numbers? Do yourself and the people who will inherit this application a huge favor and forget about this. Furthermore, if you use foreign keys, this could really create some horribly inconsistent data. – Devin Mar 16 '12 at 19:36
  • @Devin yeah that doesn't sound good. So I've done a little math... if there are 2 billion users who use my website daily and somehow all of them has turned into vandalizers, and they submit a thread every 5 seconds for a entire day (let's say they have no life and don't sleep), it's 17,280 submissions per user, and 34,560,000,000,000 submissions in total every day. And if I use bigint (unsigned), it will take 1,426 years for the 2 billion users to completely use up the column. If that's the case, I guess I can forget about it :) – Visualizer7 Mar 16 '12 at 19:45
  • 2
    A nice way to get a deadlock as @Devin already said. – arthurprs Mar 16 '12 at 19:45
  • 2
    I don't see a deadlock issue, and if you use an explicit table lock on inserts, you can prevent two sessions from getting the same number. This method is viable with only a mild performance penalty if you simply use `SELECT id WHERE columnA IS NULL LIMIT 1`, assuming an index on columnA. – Marcus Adams Mar 16 '12 at 21:49
  • 3
    This is long after the fact, but people, Churk said, **where I work we are given a closed set of number[s]".** He has a non-programming limitation forcing him to use a not-best-practice solution. You should feel blessed that you don't have non-programmers imposing such limitations. So, please stop complaining and give people with political/business limitations a break. The OP's qeustion was valid and Churk's answer was a clever solution. – JBH Aug 11 '17 at 20:01
  • Is this approach (or pattern) similar to setting a *flag*? Could the id have been set to zero instead of null? – Vass Mar 28 '22 at 16:56
  • @Vass, yes you can use any thing from an additional column as an indicator isDeleted, to putting a pattern of string you recognized as deleted. But setting an existing column to null use the least amount of resource, I have not seen too many other variation of implementation. – Churk Mar 22 '23 at 12:20
3

Do yourself a favor and forget about this. It is going to be a massive pain to implement, so unless there is a very compelling goal to achieve that cannot be approached otherwise (unlikely) we are talking massive pain for no gain.

MySql itself does not support this scenario for exactly this reason: massive complication for no benefit.

Jon
  • 428,835
  • 81
  • 738
  • 806
3

It contains unique ids given to each thread for distinction.

You said that.
Unique id. For distinction.

How can you call an id unique?
What distinction it could be when today some topic has one id and a next day - another?
How can you find it back?
How you going to link it to other threads?

Try to think of these words meaning. It is called unique for the reason.

Try to think not as a child who wants "neat" and "cute" things just for the pleasure but as an adult, who can see the consequences.

Try to learn to listen to more experienced people and follow their knowledge and experience based advises, not only your whims.

Your Common Sense
  • 156,878
  • 40
  • 214
  • 345
1

So, for referential integrity I don't recommend re-using ids. With the maximum limits on int and bigint you should never run out of unique values. Unique values are the core of referential databases.

With that said, you could run a search sub-routine (or instead of trigger) to find the lowest available id. Then momentarily turn off identity inserts, insert the data, then turn identity inserts back on.

There are so many reasons not to do this! Among which are all of the potential collisions between inserting data when more than one value are coming in at the same time.

mcfea
  • 1,129
  • 15
  • 22
  • Also, if you really want to keep a nice list of contiguous numbers, you might consider adding a "deactivated" flag. That way you aren't deleting anything up front. You could later reindex as a job. Again, this will blow your referential integrity, but you could do it... – mcfea Mar 16 '12 at 19:32
-2

MySQL will maintain the order of IDs unless the entire table is truncated. Deletion of rows does not affect the last ID.

Ben Ashton
  • 1,385
  • 10
  • 15