0

So somehow on entering data with an AUTO_INCREMENT field called id I have found out that the data is in fact missing some rows. That is there will be rows like:

 ID   COL1   COL2
2000  data   data
2001  data   data
2003  data   data

My question is not how this skipping occurred (I will be looking into my code later to find that bug), but rather how to find the skipped rows and fix this. So the questions are:

1) What SQL command could I use to find the skipped IDs so I at least know where they are?

2) Is there any SQL voodoo I can use to move up all rows after a skip by one ID so that the missing ID is filled in? ie. move row 2003 to 2002 and 2004 to 2003 etc etc

EDIT:

This is not a question of how the gaps happened but how they can be fixed. I assure you there have been no deletions on this table only inserts by another program. I know there must be a a bug there. The question is how do I fix the data on my currently live system.

thatidiotguy
  • 8,701
  • 13
  • 60
  • 105
  • 1
    You're not using `AUTO_INCREMENT` right if you don't expect gaps. Gaps occur when you delete a record. If you want a sequence without gaps, add a non-`AUTO_INCREMENT` column for it, or better yet, use a counter when reading the rows. – Dark Falcon Jul 12 '13 at 17:24
  • 3
    The gaps aren't important. An auto-increment column is there to ensure _uniqueness_, not sequence. A deletion will cause a break. – Michael Berkowski Jul 12 '13 at 17:24
  • 2
    First of all, your IDs shouldn't really matter as long as they are unique. But, a solution to your problem of them not being perfectly sequential would be to loop through your entire dataset and update the IDs incrementally. – user1477388 Jul 12 '13 at 17:24
  • @user1477388 Thanks for the advice but I know that I could use a script to do this, I am interested in hearing pure SQL solutions. – thatidiotguy Jul 12 '13 at 17:26
  • 1
    @MichaelBerkowski Edited. The question is not about how it happened or why, it is how to fix it. – thatidiotguy Jul 12 '13 at 17:27
  • 1
    @DarkFalcon I understand how AUTO_INCREMENT works, that is not the question here. – thatidiotguy Jul 12 '13 at 17:27
  • 3
    @thatidiotguy What we're all concerned about here is _why_ you want to fix it. We don't see your code, so we cannot speculate on any bugs. If you think one exists, post the appropriate code. Given only what you've shown, you'll only get responses not to worry about non-sequential auto-increment values. – Michael Berkowski Jul 12 '13 at 17:30
  • @MichaelBerkowski I have an email sending program that goes through the rows sequentially to get email address and name. Right now, it breaks every time there is a gap. The table is meant purely as a way of holding a list of recipients and marking that they have had their email sent to them. The program is currently running on this data, so I would like to fix the IDs to be sequential yet again. – thatidiotguy Jul 12 '13 at 17:31
  • 3
    @thatidiotguy `it breaks every time there is a gap` The reason it breaks isn't because of the gaps, it breaks because you're using it wrong and trying to get it to do something that it was NEVER intended to do. That's what **everyone** has tried to tell you. – mawburn Jul 12 '13 at 17:32
  • 1
    @Oberon What is "it"? What am I using wrong? – thatidiotguy Jul 12 '13 at 17:34
  • 1
    @thatidiotguy An AUTO_INCREMENT ID column. Like others have said, it's there to ensure uniqueness **through** counting, not for giving a sequential count. You're misunderstanding it's purpose. If you want it to be counted accurately, then you must find the current count yourself and set a separate column to an appropriate sequential number. – mawburn Jul 12 '13 at 17:36
  • 1
    @Oberon As I stated, I know there is a bug in the way it works. I did not write the code that caused this. I am simply trying to fix the production system. – thatidiotguy Jul 12 '13 at 17:47
  • 1
    No, this is not the right fix. Gaps will happen again. And there is nothing to do to prevent it. The bug exists in the application, it need to be fixed. You are trying to address a symptom, not the root cause. – RandomSeed Jul 12 '13 at 17:48
  • 1
    @RandomSeed Everyone is obsessed with the application being bugged. I am aware of this. I have filed a bug report and everything. As I have stated over and over, my questions is not how it happened, but methods for fixing it now. – thatidiotguy Jul 12 '13 at 17:59
  • @RandomSeed: sometimes a doctor does need to treat the current symptoms, as well as attending to a cure for the underlying condition. The end users don't really care why the system isn't working, they just want it working. And if resetting id values is a short term "fix" to getting the system working for the end users, sometimes that's just what the doctor ordered. – spencer7593 Jul 12 '13 at 18:02

4 Answers4

3

There's several reasons an auto_increment column can have "skipped" values... a failed insert, an INSERT IGNORE ..., an INSERT ... ON DUPLICATE KEY, a value specified in an insert that is higher than the next value, rows inserted and subsequently deleted, an ALTER TABLE ... AUTO_INCREMENT = statement, there's all sorts of reasons.

What AUTO_INCREMENT guarantees you is a unique value; it doesn't guarantee that none will be skipped.

There is usually no need to modify id values, and it can cause some significant problems for users and applications, if they are (reasonably) expecting id values to be immutable. In terms of the database, be aware of foreign keys (either enforced by InnoDB, or implied and not enforced as with MyISAM), or triggers that may fire, etc.

To set new values for the ID column, so that there are no "skipped" values (assuming no foreign key constraints will be violated)

in this example, starting with id values following id 2001, such that next higher id value (2003 in your example) will be set to 2002, the next higher id value after that, will be set to 2003, and so on...

UPDATE mytable t 
  JOIN ( SELECT s.id
              , @new_id := @new_id + 1 AS new_id
           FROM ( SELECT @new_id := 2001 ) i
           JOIN ( SELECT r.id
                   FROM mytable r
                  WHERE r.id > 2001
                  ORDER BY r.id
                ) s
          ORDER BY s.id
       ) u
    ON t.id = u.id
   SET t.id = u.new_id

SQL Fiddle Here

Again, all the warnings that "skipped" values are not a problem, and resetting id values can cause big problems (as mentioned above) apply.

To reset the AUTO_INCREMENT value for the table, it's a simple ALTER TABLE statement. If you attempt to set it lower than the maximum id value, MySQL uses the maximum id value. So, you can just

ALTER TABLE mytable AUTO_INCREMENT = 1;

This statement does not change existing rows, it just sets the auto increment value to the lowest value, such that the next auto increment retrieved will be one higher than maximum id value currently in the table.


To answer you first question, it's harder to get a list of "skipped" id values, since there isn't a row source for them.

If we just want to check if any id values are "skipped", we can make use of the query in the inline view aliased as u (from the UPDATE statement above). We change that UPDATE into a SELECT, and just filter out rows where the id value matches the (generate) new_id value:

SELECT u.*
  FROM ( SELECT s.id
              , @new_id := @new_id + 1 AS new_id
           FROM ( SELECT @new_id := 2001 ) i
           JOIN ( SELECT r.id
                   FROM mytable r
                  WHERE r.id > 2001
                  ORDER BY r.id
                ) s
          ORDER BY s.id
       ) u
WHERE u.id <> u.new_id
ORDER BY u.id

If that query returns no rows, then there are no "skipped" values.

spencer7593
  • 106,611
  • 15
  • 112
  • 140
  • Thank you this is the information I was looking for. I am aware of the bug in the program, but was in need of fixes. Thank you. – thatidiotguy Jul 12 '13 at 17:46
  • I didn't answer your first question, how to identify the "missing" id values. I do have a query that can identify the id value after a "break", i.e. there is at least one id value "missing" before it. You can run just the query from the inline view aliased as u... – spencer7593 Jul 12 '13 at 17:53
  • @thatidiotguy: i prefaced my answer with all that information (which you didn't ask for) for the benefit of future readers. Someone else looking for an answer to this question may need to be aware of the danger of changing id values which are expected to be immutable. – spencer7593 Jul 12 '13 at 18:07
  • No worries, I know I was getting a little frustrated there but I totally understand why everyone was discussing the cause. Thanks for understanding that for the minute a quick fix was needed, and I just did not have the SQL Guru-ness to do it. Thanks again. – thatidiotguy Jul 12 '13 at 18:16
2

Based on your comment to my comment, a solution would be to simply reset the auto increment column like so: How to reset AUTO_INCREMENT in MySQL?

ALTER TABLE tablename AUTO_INCREMENT = 1
Community
  • 1
  • 1
user1477388
  • 20,790
  • 32
  • 144
  • 264
  • This is not what I want to do. The data on this table will never change again and there will not be another insert into it. – thatidiotguy Jul 12 '13 at 17:30
  • 1
    @thatidiotguy You don't understand... The code I provided you will reset the entire existing dataset starting at one giving you a perfectly sequential column... It's not just for new rows that get added. – user1477388 Jul 12 '13 at 17:31
  • Ah apologies. I did not read completely. I would prefer to not reset every id as I currently have a job currently iterating over this data by ID. – thatidiotguy Jul 12 '13 at 17:33
  • 1
    If you don't want to reset the existing IDs nor add new records to fill in the gaps then what is your intention? – user1477388 Jul 12 '13 at 17:42
1

As others have tried and failed to tell you, gaps in an auto-increment column is not a bug, it is normal behaviour (it can happen with something as trivial as a rolled back insertion). If your application chokes on gaps, then the application is bugged.

You must not try to make your ID's sequential, you must focus on having your application handle these gaps correctly.

If you absolutely need these values be sequential, then you should not use an auto-increment column at all (or add a hand-crafted order column, perhaps).

RandomSeed
  • 29,301
  • 6
  • 52
  • 87
0

For Question 1:

Hmm.. you could do a simple query to count rows of the id. Something like:

SELECT COUNT(id) FROM your_table WHERE id BETWEEN 2000 AND 2003;

From that query, the result should return 4.

For Question 2:

A simple alter table auto increment would do.