
I have a logging DB with a hard requirement that the primary key be defined exactly as:

[Id] int IDENTITY(1,1) NOT NULL,

There are no other requirements: no foreign keys, etc.

Records older than X will periodically be deleted.

At some point I will have to handle running out of key-space.

Is it possible to have an atomic operation that changes all the PKs to be contiguous starting at 0?

For example, given a table with these 2 records:

  • Pk = 100 (..other columns..)
  • Pk = 1234

The resulting table would have

  • Pk = 0 (..same original data, new PK..)
  • Pk = 1

Update 1: I am considering a solution such as this:

CREATE TRIGGER rollOutLogs ON Logs
AFTER INSERT
AS
BEGIN
    -- Once the identity reaches the top of the range (10 here, just for the
    -- proof of concept), wrap it back around to the bottom of the range.
    IF IDENT_CURRENT('Logs') >= 10
        DBCC CHECKIDENT ('[Logs]', RESEED, -10);

    -- On every insert, clear out the handful of old rows whose Ids the
    -- wrapped-around identity is about to reuse, so new inserts don't collide.
    DELETE FROM Logs WHERE Id > IDENT_CURRENT('Logs') AND Id < IDENT_CURRENT('Logs') + 5;
END
GO
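
A minimal sketch for exercising the trigger, assuming a cut-down stand-in table with a single [Message] column in place of the real sink schema:

-- Cut-down stand-in for the real log table; only [Id] matters here.
CREATE TABLE dbo.Logs
(
    [Id] int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [Message] nvarchar(100) NOT NULL
);
GO

-- Create the rollOutLogs trigger from Update 1 above, then insert past the
-- reseed threshold of 10 and watch the Ids wrap into the negative range
-- instead of growing without bound.
DECLARE @i int = 1;
WHILE @i <= 25
BEGIN
    INSERT INTO dbo.Logs ([Message]) VALUES (CONCAT('event ', @i));
    SET @i += 1;
END;

SELECT [Id], [Message] FROM dbo.Logs ORDER BY [Id];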
  • This could be done, but it would be a pointless operation. The whole point of surrogate keys is that they are supposed to be meaningless - it should not matter whether the key is 1, 2 or 456415654; as long as it uniquely identifies a row it serves its purpose. If you think that, over the lifetime of your table, you will end up with over 2 billion entries (including deleted items) and INT is too small, then use `BIGINT`. – GarethD Feb 16 '18 at 14:42
  • I have just read the question again, and seen that this is a logging table with no tables that reference it, which begs the question, why bother with an ID column at all? [Bad habits to kick : putting an IDENTITY column on every table](https://sqlblog.org/2010/02/08/bad-habits-to-kick-putting-an-identity-column-on-every-table) – GarethD Feb 16 '18 at 14:46
  • I envision a day where there may be more than 2.14 billion logs. Not likely, but I need to plan for it. The question specifically states the requirement/constraint of using int. I get that there are definitely better ways to address this, but I want to know if it's even possible with this constraint. – ttugates Feb 16 '18 at 14:47
  • [This](https://github.com/serilog/serilog-sinks-mssqlserver#table-definition) is the driver for the requirement to have an int for a PK. And [this](https://github.com/serilog/serilog-sinks-mssqlserver/issues/69#issuecomment-366232620) is where I have already addressed the issue with the library, which is outside the scope of this SO question. – ttugates Feb 16 '18 at 14:49
  • When you get to 2 billion records, you could reseed the identity column to start again at -2 billion - then just hope that the next 4 billion events don't happen before you next archive the data. I can't see anything in that link that would suggest changing the ID to bigint would be a problem. – GarethD Feb 16 '18 at 14:52
  • Unless someone posts a functional, reasonable solution, I only know how to do this by switching between 2 tables, at which point it's easier for me to just dump that library. That's my fallback unless someone says, yes, run x cmd, followed by y. – ttugates Feb 16 '18 at 14:57
  • 2
  • Switch to `BIGINT`? – Shnugo Feb 16 '18 at 14:58
  • How much logging downtime can you tolerate to: turn [`Identity_Insert`](https://learn.microsoft.com/en-us/sql/t-sql/statements/set-identity-insert-transact-sql) on, update `Id` values using `Row_Number`, reseed the identity column and turn `Identity_Insert` off? – HABO Feb 16 '18 at 15:51
  • I could tolerate locks of around 30 sec, but couldn't be completely down. I can improve my delete statement in Update 1, and then never worry about downtime. I just don't know how it would affect fragmentation, or something unknown. I would improve the delete logic; it's just a proof of concept. – ttugates Feb 16 '18 at 15:56
  • @HABO Are you sure identity_insert on would allow you to change values? – paparazzo Feb 16 '18 at 16:00
  • @Paparazzi Judging from [this](https://stackoverflow.com/questions/19155775/how-to-update-identity-column-in-sql-server) exchange, no, an identity column _cannot_ be updated. Copies could be inserted with forced identity values, then the original rows deleted. That complicates things considerably and adds the fun of doing it twice: once to shuffle the existing data out of the way, e.g. to negative values, to ensure there is a contiguous range of values starting at 1 (or 0) to accommodate the survivors, and again to get the _nice_ values assigned. – HABO Feb 16 '18 at 16:07 (a rough sketch of this approach follows the comments)
  • @HABO It was a rhetorical question – paparazzo Feb 16 '18 at 16:10
  • Have you tested your proposed solution? – paparazzo Feb 16 '18 at 16:13
  • It works at that scale perfectly, but it's 'clever', which can be good and bad. – ttugates Feb 16 '18 at 16:13
  • How many log entries are you expecting per second? My thinking is that at a rate of around 13 per second continuously it would take 5 years to exceed the integer limit. I would consider an ad hoc maintenance operation once every 5 years an acceptable trade-off compared to firing that trigger 4 billion times. – GarethD Feb 17 '18 at 09:46
  • I ended up moving to Azure Table Storage. That said, this is my first production app, so I just don't know how much data it will generate. I wanted the option to turn on lower logging levels in Production at times of need without worrying about key-space. – ttugates Feb 17 '18 at 12:17
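
To make the copy-and-reseed idea from the comments concrete, here is a rough, untested sketch rather than a production script. It assumes the same cut-down dbo.Logs stand-in as above (just [Id] and a [Message] payload column), that all current Ids are non-negative, and that a short maintenance window is acceptable, since it holds an exclusive table lock for the duration of the transaction:

SET XACT_ABORT ON;
BEGIN TRANSACTION;

    SET IDENTITY_INSERT dbo.Logs ON;

    -- Pass 1: re-insert every surviving row with a forced, contiguous Id in
    -- the negative range (-1, -2, ...) so the new keys cannot collide with
    -- the existing positive ones, then drop the originals. TABLOCKX keeps
    -- concurrent inserts out for the rest of the transaction.
    INSERT INTO dbo.Logs WITH (TABLOCKX) ([Id], [Message])
    SELECT -ROW_NUMBER() OVER (ORDER BY [Id]), [Message]
    FROM dbo.Logs
    WHERE [Id] >= 0;

    DELETE FROM dbo.Logs WHERE [Id] >= 0;

    -- Pass 2: flip the copies back into 0, 1, 2, ... and drop the negatives
    -- (identity columns cannot be UPDATEd, hence the second copy).
    INSERT INTO dbo.Logs ([Id], [Message])
    SELECT -[Id] - 1, [Message]      -- -1 -> 0, -2 -> 1, ...
    FROM dbo.Logs
    WHERE [Id] < 0;

    DELETE FROM dbo.Logs WHERE [Id] < 0;

    SET IDENTITY_INSERT dbo.Logs OFF;

    -- Point the identity at the end of the new contiguous range so the next
    -- insert continues from there.
    DECLARE @maxId int = (SELECT ISNULL(MAX([Id]), -1) FROM dbo.Logs);
    DBCC CHECKIDENT ('dbo.Logs', RESEED, @maxId);

COMMIT TRANSACTION;

Concurrent inserts simply wait on the table lock, so whether this fits the "around 30 sec" tolerance mentioned above depends on how long the two copy passes take for the real row count and column list.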

0 Answers