-3

I have a table that contains a varchar column which is indexed. The values in this column consist of a prefix and an incrementing number value. It is not necessary the values will be in order.

ABC00010
ABC00011
ABC00015
ABC00012
ABC00017
ABC00016
and so on...

There may be missing values in the sequence. How can I find the smallest number available for insert?

I wrote this and it works. But it takes a few seconds when the numbers are in the thousands.

Declare @C int = 1;
While Exists(Select 1 From MyTable Where Col='ABC'+Format(@C,'00000')) Set @C=@C+1;
Select 'Next Number: ABC'+Format(@C,'00000');

Is there a faster way?

navigator
  • 1,678
  • 16
  • 29
  • While asking a question, you need to provide a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example): (1) DDL and sample data population, i.e. CREATE table(s) plus INSERT T-SQL statements. (2) What you need to do, i.e. logic and your code attempt implementation of it in T-SQL. (3) Desired output, based on the sample data in the #1 above. (4) Your SQL Server version (SELECT @@version;). – Yitzhak Khabinsky Nov 22 '22 at 18:08
  • Where do I create this? – navigator Nov 22 '22 at 18:17
  • One option: https://dbfiddle.uk/-SXv3GHO – Jonas Metzler Nov 22 '22 at 18:29
  • 1
    *Is there a faster way?* Yes absolutely - Fix your design - if your table was in at least 1NF you would not need to mess around with splitting strings or loops; a column should *always* contain a single atomic value. – Stu Nov 22 '22 at 18:34
  • In SQL Server Management Studio (SSMS) – Yitzhak Khabinsky Nov 22 '22 at 18:36
  • https://dbfiddle.uk/pHjqfOfO – navigator Nov 22 '22 at 18:39
  • @Stu, I fully agree. However, the actual use-case is a fairly complex and involves user-generated prefixes and a numbering system that may change on the fly. – navigator Nov 22 '22 at 18:49
  • @navigator then I would consider using some indexed computed columns which would allow the optimizer to properly aggregate and seek to the appropriate row and a set-based query will run in milliseconds. – Stu Nov 22 '22 at 18:52
  • [Dupe on DBA.StackExchange.com](https://dba.stackexchange.com/q/320081/150011). Please don't cross-post. – J.D. Nov 23 '22 at 01:07
  • 1
    Removed from DBA – navigator Nov 23 '22 at 04:13

5 Answers5

2

Always think set-based operations. A While loop is not set-based. Here I create a CTE with sequential numbers from 0 through 99999. Then I match the sequence CTE to the "MyTable", order the list, and select a TOP 1 from a RIGHT JOIN.

DECLARE @myTable TABLE (
    Col nvarchar(8)
);

INSERT INTO @myTable
VALUES ('ABC00001'), ('ABC00002'), ('ABC00005'), ('ABC00003')
;

WITH x AS (SELECT n FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) v(n))
, y as (
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) as number
    FROM x ones, x tens, x hundreds, x thousands, x tThousands
)
, yMod as (
    SELECT y.number
        , 'ABC' + RIGHT('00000' + CAST(y.number as nvarchar), 5) as Col
    FROM y 
)
SELECT TOP 1
    ym.Col
FROM @myTable as mt
    RIGHT OUTER JOIN yMod as ym
        ON ym.Col = mt.Col
WHERE mt.Col IS NULL
ORDER BY ym.number
Tim Jarosz
  • 1,133
  • 1
  • 8
  • 15
  • This is very interesting. I was trying to think of a set based approach but couldn't. My actual use-case is a bit more complex. But let me see if I can use this approach. Thanks! – navigator Nov 22 '22 at 18:43
2

You could adopt a Gaps-and-Islands approach.

The following are two approaches. The first will show just the next available value which the second will show all ranges

Made a quick modification to allow a variable Prefix ... @Prefix

Example

Declare @YourTable Table ([Col] varchar(50))  Insert Into @YourTable Values 
 ('ABC00010')
,('ABC00011')
,('ABC00015')
,('ABC00012')
,('ABC00017')
,('ABC00016')

Declare @Prefix varchar(25)='ABC'

-- Show Just the Next Value
Select distinct NextValue = min(@Prefix+right(concat('00000',try_convert(int,substring(max(Col),len(@Prefix)+1,25))+1),5)) over()
 From  (
        Select *
              ,Grp = try_convert(int,substring(Col,len(@Prefix)+1,25)) -row_number() over(order by Col)
         From @YourTable
         Where Col like @Prefix+'%'
       ) A     
 Group By Grp



-- Show All Next Values 
Select R1=min(Col)
      ,R2=max(Col)
      ,NextValue = left(max(Col),3)+right(concat('00000',try_convert(int,substring(max(Col),len(@Prefix)+1,25))+1),5)
 From  (
        Select *
              ,Grp = try_convert(int,substring(Col,len(@Prefix)+1,25)) -row_number() over(order by Col)
         From @YourTable
         Where Col like @Prefix+'%'
       ) A     
 Group By Grp

Results

enter image description here

John Cappelletti
  • 79,615
  • 7
  • 44
  • 66
1

Something like this?

Declare @testData table (Col varchar(20));
 Insert Into @testData (Col)
 Values ('ABC00001'), ('ABC00002'), ('ABC00012'), ('ABC00013')
      , ('XYZ00002'), ('XYZ00003'), ('XYZ00010'), ('XYX00012');

Declare @prefix char(3) = 'XYZ';

   With gaps
     As (
 Select *
      , grp = v.inc - row_number() Over(Order By v.inc)
   From @testData As td
  Cross Apply (Values (replace(td.Col, @prefix, ''))) As v(inc)
  Where td.Col Like @prefix + '%'
        )
 Select Top 1
        next_value = concat(@prefix, right(concat('00000', max(g.inc) + 1), 5))
   From gaps As g
  Group By
        g.grp
  Order By
        next_value;

Explanation:

  1. td.Col Like @prefix + '%': get only those that match
  2. Use CROSS APPLY to return just the numeric portion of the string.
  3. Use that value minus the row_number value to define each group.
  4. Get the max inc (number) + 1 as the next available number for each group
  5. Concat the prefix and zeroes to reform the next column value
  6. Get the TOP 1 (first) available next value
Jeff
  • 512
  • 2
  • 8
  • Can you explain what you are doing? – navigator Nov 23 '22 at 04:15
  • @navigator - this is a typical gaps & islands problem. In the CTE we define each 'group' of integers, which we get by parsing out the string portion of the column. Once we have each group - we can then get the max value + 1 for that integer and recombine them to form the 'next' available for that group. Then a simple TOP 1 to get the first value. – Jeff Nov 23 '22 at 17:20
1

Based on your comment from the dupe post that was on DBA.StackExchange:

It's for a specific prefix among other conditions on the table as well. Prefixes are always the same length.

That makes this a much simpler problem to solve, with some CTE magic. First filter the table with a starts-with WHERE clause for the prefix you care about, then use the SUBSTRING() function to trim out the prefix and order by the remaining numbers of the value. Finally, you can use the ROW_NUMBER() window function to find where the numerical value of the row doesn't match its ordinal, when sorted by that numerical value.

Example below:

WITH _ColNumbers AS
(
    SELECT
        CONVERT(INT, SUBSTRING(Col, 4, LEN(Col) - 3)) AS ColNumber, -- Strip out the prefix and convert the remaining numerical part to an integer 
    FROM MyTable
    WHERE Col LIKE 'ABC%'
),
_ColNumbersSorted AS
(
    SELECT 
        ColNumber,
        ROW_NUMBER() OVER (ORDER BY ColNumber) AS SortId -- Generate a sequential set of integers in the same order as ColNumber
    FROM _ColNumbers
),
_ColNumberLowestUnmatchedSortId AS
(
    SELECT TOP 1 -- Return the single lowest mismatch
        ColNumber,
        SortId
    FROM _ColNumbersSorted
    WHERE ColNumber <> SortId -- Filter down to only the rows that don't match their ColNumber to their SortId
    ORDER BY SortId -- Sort by the lowest SortId (aka lowest mismatch)
)

-- Join back to the list of all ColNumbers to get the row just before the lowest mismatch
-- The lowest available number is 1 more than the previous row's ColNumber
SELECT 
    CNS.ColNumber + 1 AS LowestAvailableNumber 
FROM _ColNumbersSorted AS CNS
INNER JOIN _ColNumberLowestUnmatchedSortId AS CNLUSI
    ON CNS.SortId = CNLUSI.SortId - 1;

At a quick glance, this may appear like a lot, but each CTE just represents each logical step for better readability. The performance should be pretty good, though one thing that could help performance is adding a computed column for CONVERT(INT, SUBSTRING(Col, 4, LEN(Col) - 3)) and creating an index on it, such that the calculation is already persisted (because it's not a SARGable expression). You can achieve that with the following code:

ALTER MyTable
ADD CONVERT(INT, SUBSTRING(Col, 4, LEN(Col) - 3)) AS ColNumber;

CREATE NONCLUSTERED INDEX IX_MyTable_Col_ColNumber ON MyTable (Col, ColNumber);

Not only would that help performance, doing so simplifies the code a little bit like so:

WITH _ColNumbersSorted AS
(
    SELECT 
        ColNumber,
        ROW_NUMBER() OVER (ORDER BY ColNumber) AS SortId -- Generate a sequential set of integers in the same order as ColNumber
    FROM MyTable
    WHERE Col LIKE 'ABC%'
),
_ColNumberLowestUnmatchedSortId AS
(
    SELECT TOP 1 -- Return the single lowest mismatch
        ColNumber,
        SortId
    FROM _ColNumbersSorted
    WHERE ColNumber <> SortId -- Filter down to only the rows that don't match their ColNumber to their SortId
    ORDER BY SortId -- Sort by the lowest SortId (aka lowest mismatch)
)

-- Join back to the list of all ColNumbers to get the row just before the lowest mismatch
-- The lowest available number is 1 more than the previous row's ColNumber
SELECT 
    CNS.ColNumber + 1 AS LowestAvailableNumber 
FROM _ColNumbersSorted AS CNS
INNER JOIN _ColNumberLowestUnmatchedSortId AS CNLUSI
    ON CNS.SortId = CNLUSI.SortId - 1;
J.D.
  • 954
  • 6
  • 22
  • 1
    Thanks. We did manage to solve the issue with a similar approach. – navigator Nov 28 '22 at 12:09
  • @navigator Cool, make sure you verify the performance is sufficient. I can see a couple of places in the solution that would be needed for this kind of problem, where the performance can go south if the query isn't tuned properly. – J.D. Nov 28 '22 at 14:31
0

A better option would be to not delete records but mark them as deleted. Use a BIT column as "deleted" flag for this. Then you can easily select the smallest deleted record and re-use it.

If no one was found, select the largest one and add 1 to the index. Or, even better, use a sequence to generate indexes.

-- delete
UPDATE mytable SET deleted = 1 WHERE id = @id;
-- find record to be reused.
SELECT MIN(Col) FROM mytable WHERE deleted = 1
Olivier Jacot-Descombes
  • 104,806
  • 13
  • 138
  • 188
  • Thanks, but the rows are not actually deleted. They are just with a different 'Status' column. Furthermore, there are several other elements in the where clause to determine the next sequence number. – navigator Nov 22 '22 at 19:01