0

What I need

Given a set of numbers from 1 to a limit n, get the smallest one which is not occupied by another record

Details

I was tasked to upload some customer information into a Database; it needs to support legacy features so I don't have control over it's structure.

There is a column "Code" which is INT, UNIQUE and already has a some data inserted into it, but its not continuous, so for example it can have the following records (1,2,3,5,7,125,200,500,127000)

I need to insert every new record with a number that is not already present but I don't want to leaving "holes" in the data.

I cannot simply SELECT the greatest one and add one, because the greatest one might be very close to the limit of the column while there is plenty available numbers in between. And I need to insert thousands of records.

So for example for the records in my example above, this function would return me "4" as "1", "2" and "3" are already used. Then If I insert a record using "4" it would then return "6" cause "5" was already there (so note that I cannot use the last inserted and add one cause the next one might be used as well)... and so on

What I've tried

I actually accomplished it using the following login (I write pseudocode to simplify as the query contains many columns which make the query really long)

-- for each record
DECLARE i = 0
WHILE i < limit
    IF (COUNT(*) FROM MyTable WHERE Code = i) = 0
        INSERT INTO MyTable (Code, ...) VALUES (i, ...)
        BREAK
    SET i = i+1;

This works, but make the query really long and its not very efficient as if I have 1000 insertions and records from 1 to 500 ill be looping 500000 times. Is there a better way to do this?

Something like INSERT INTO MyTable (Code, ...) VALUES (SMALLEST_AVALIABLE(Code), ...)

Thank you so much for your time!

Dale K
  • 25,246
  • 15
  • 42
  • 71
Daniel Cruz
  • 1,437
  • 3
  • 5
  • 19
  • 3
    https://stackoverflow.com/questions/174532/how-to-find-holes-in-a-table – Dale K Dec 07 '21 at 01:49
  • 1
    While @DaleK is the answer that you need, I'm thinking why do you want to fill holes in the database? It supposes you don't do that, and is a common behavior on databases. – Leandro Bardelli Dec 07 '21 at 02:07
  • The column I work with has a very limited set of numbers avaliable (somethimes arount 100) so if I want to insert 80 without using the wholes I wont be able to do it. Thanks @DaleK Ill take a look at the link you sent. – Daniel Cruz Dec 07 '21 at 02:19
  • In order to follow this request you will need to eliminate the basic idea of concurrency transactions (queries). You can for example lock the table from other transaction which might get the same value if they will execute at the same time, but you will have locks and waits in your database as a result of implementing this request. Moreover, since you are using UNIQUE on the column then if you will not implement synchrony use then you might get errors when you will have concurrency inserts. – Ronen Ariely Dec 07 '21 at 02:59
  • "`it needs to support legacy features so I don't have control over it's structure.`": This is the most common excuse and one which is the worse one in most cases - if you are in charge on the system, then take the control and if you are not in charge on the system, then pass the task to one that he is. Many times re-design the system is simply the right solution. – Ronen Ariely Dec 07 '21 at 03:08

1 Answers1

3

How to do it with a query:

with range as (
  select min(code) as min, max(code) as max from MyTable
), nums as (
    select min as x from range
    union all
    select x + 1 from nums 
    where x < (select max from range)
)
insert into MyTable (code)
select x from nums
left join MyTable on code = x
where code is null
option (maxrecursion 0)

See live demo.

Note: This is not that efficient, except for the coder.

Bohemian
  • 412,405
  • 93
  • 575
  • 722
  • It makes no sense in my opinion to create the number tables using loop (recursion) especially when you use it on-the-fly in each query. You should have a numbers table in the system and even if you want to create one on-the-fly then there are better options like select from sys.all_objects using ROW_NUMBR and other options – Ronen Ariely Dec 07 '21 at 02:55
  • @RonenAriely what makes no sense is just how far behind SQL Server is compared with open source databases like postgres, which provides for a trivial solution to this problem via [`generate_series()`](https://www.postgresql.org/docs/9.1/functions-srf.html) – Bohemian Dec 07 '21 at 04:12
  • Thanks, this will definaly do the job. I am not concerned about effitiency as the data is loaded once and then mostly only read. Thanks a lot man! – Daniel Cruz Dec 09 '21 at 03:03