1

I have a database with records of M +2, the problem is: I had to delete about 10 000, and now I got this problem, the IDs will be missing, eg

ID: 100
ID: 102
ID: 103
...

How to organize this? That is, replace all subsequent IDs for no "holes" with this? Remember, I have more than 2 million records (15,000 more per day), manually is not feasible.

I Need:

ID: 101
ID: 102
ID: 103
...
JohnFx
  • 34,542
  • 18
  • 104
  • 162
  • 6
    This question is asked often. It's usually not a good idea to try to fill in the gaps, as it can break referential integrity elsewhere. The auto_increment IDs are not usually a significant column, so it doesn't actually matter if there are gaps. You shouldn't be using the id to see how many records you have, for example. – Michael Berkowski Jan 05 '12 at 15:24
  • 2
    What's the reason why you want to remove "holes"? Keep ids like they are. It doesn't cause any problem. – Nicola Cossu Jan 05 '12 at 15:24
  • I want them sequentially, there is no problem as references. –  Jan 05 '12 at 15:25
  • 4
    What nick is (correctly) asking is what that reason is. Probably because this question comes up a lot and the reason is typically just to suit someones OCD tendencies. So I'll ask it again. What BUSINESS reason do you have for making sure they are consecutive? – JohnFx Jan 05 '12 at 15:27

2 Answers2

4

Don't fill in the gaps. Renumbering existing rows is a really bad idea, especially if it is the PK field. For example think about the chaos it would cause if one day they decided social security numbers had to be consecutive so they shifted them all down when people died to fill in the gaps. So you'd get a frequently changing ID number that used to belong to someone else.

For a good surrogate key the actual value should have no meaning whatsoever, so consequently gaps have no meaning either.

If it is a major concern for you, why not just add an isDeleted bit field and flag records instead of actually deleting them. Voila, no gaps!

If you absolutely must have an incrementing number (like a row number) then just do it in your queries so you don't have to incur the trouble of maintaining it.

Community
  • 1
  • 1
JohnFx
  • 34,542
  • 18
  • 104
  • 162
  • Perhaps calling it `removed` as a boolean field is better than `isDelete` which doesn't make much sense. Using the inverse is also possible, such as having a `visible` flag which is true by default, false if "removed". – tadman Jan 05 '12 at 15:30
  • I think the theory is the same. It really doesn't matter what he names the field or whether true or false = deleted. – JohnFx Jan 05 '12 at 15:35
0

You may find that storing all of the records in a source table, then INSERT those into a secondary table once you're sure which ones qualify. This is usually pretty easy:

INSERT INTO b_table (a, b, c) SELECT a, b, c FROM a_table

If you have an auto-increment ID field you just leave it out of both sides and it will re-create it on the b_table side, sequentially, no gaps. You can purge records from a_table when you're done with them if you like.

tadman
  • 208,517
  • 23
  • 234
  • 262