2

We use uniqueidentifier for every record within a very large database. For business reasons we need to ensure that the uniqueidentifier is never used more than once, but for performance reasons we have a bigint as the primary key.

What is the fastest way to test the existence of a uniqueidentifer in a Sql server table?

Le-roy Staines
  • 2,037
  • 2
  • 22
  • 40
  • Eh? Why would you think you need to do this? Just use the *usual*, normal ways of *generating* uniqueidentifiers (`newid()` or `newsequentialid()`, if generating them within the database) and you're ["guaranteed" that they'll be unique](http://stackoverflow.com/questions/2977593/is-it-safe-to-assume-a-guid-will-always-be-unique) – Damien_The_Unbeliever May 13 '16 at 06:43
  • They are not guaranteed to be unique! We have data coming in from many third party systems that we don't trust how their generation of UUIDs is being done. Using UUID's as primary key and clustered index the performance degrades far more rapidly than this aproach, from what I have been able to achieve. With only a UUID and no other defining columns, the best I have achieved from 100 million rows is a lookup of 14,000ms. By adding the extra computed columns and including them in the primary key I can create a B-tree on a single column, and get it down to under a millisecond. – Le-roy Staines May 15 '16 at 22:07
  • We also need to guarantee uniqueness across multiple tables (including client-side databases that can continue to perform when there is no internet connection). Do you know of a faster way to achieve SELECT and INSERTS of UUID's, whilst maintaining integrity of uniqueness? – Le-roy Staines May 15 '16 at 22:12

2 Answers2

2

**

< 0.05ms to validate a uniqueidentifier from 100,000,000 rows on a single Standard S0 Sql Azure instance.

**

DISCLAIMER: The following aproach may require tweaking to your business requirements, and currently has only been run in a Sql Azure staging environment on an S0 instance (10 DTU's). The goal being for proof of concept.

(See below for boilerplate CREATE To scripts)

Method: Create a table specifically for storing any uniqueidentifiers that exist within your database. This table will be heavily optimised for the single purpose of augmenting our uniqueidentifer validator.

The table will have four columns. One for the uniqueidentifier, and the other three as single digit binary computed columns for storing each of the three digits of the uniqueidentifier.

We'll then create a clustered index on the first three digit columns, and the uniqueidentifer as the last column of the clustered index.

We'll lastly create a stored procedure that'll take a uniqueidentifer and break out the first three digits as binary datatype and perform the lookup, taking advantage of the B-tree structure of the data on the disk as per our clustered index.

Table CREATE To (with clustered index):

USE [MyDatabase]
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

SET ARITHABORT ON
GO

SET ANSI_PADDING ON
GO

CREATE TABLE [dbo].[UniqueIds](
    [Guid] [uniqueidentifier] NOT NULL,
    [Char01]  AS (CONVERT([char](1),substring(CONVERT([char](36),[Guid]),(1),(1)))) PERSISTED NOT NULL,
    [Char02]  AS (CONVERT([char](1),substring(CONVERT([char](36),[Guid]),(2),(1)))) PERSISTED NOT NULL,
    [Char03]  AS (CONVERT([char](1),substring(CONVERT([char](36),[Guid]),(3),(1)))) PERSISTED NOT NULL,
 CONSTRAINT [PK_UniqueIds] PRIMARY KEY CLUSTERED 
(
    [Char01] ASC,
    [Char02] ASC,
    [Char03] ASC,
    [Guid] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)

GO

SET ANSI_PADDING OFF
GO

Stored procedure for validation:

CREATE PROCEDURE [dbo].[UniqueIds.CountIds]
    @Guid uniqueidentifier,
    @IdCount bigint OUTPUT
AS
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- Break out uniqueidentifier down into the first three characters for indexed lookup.
    DECLARE @SubChar char(3) = CONVERT([char](36),@Guid);
    DECLARE @Char01 char(1) = @SubChar;
    DECLARE @Char02 char(1) = SUBSTRING(@SubChar,2,1);
    DECLARE @Char03 char(1) = RIGHT(@SubChar,1);

    -- Check if GUID already exists
    (SELECT TOP 1 @IdCount=1 FROM UniqueIds WHERE Char01=@Char01 AND Char02=@Char02 AND Char03=@Char03 AND [Guid]=@Guid);

Stored procedure for INSERTS:

CREATE PROCEDURE [dbo].[UniqueIds.Insert]
    @Guid uniqueidentifier
AS
BEGIN TRY
    BEGIN TRAN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- Check if GUID already exists
    DECLARE @IdCount bigint;
    EXEC    [UniqueIds.CountIds]
            @Guid=@Guid,
            @IdCount = @IdCount OUTPUT

    IF @IdCount IS NULL
        INSERT INTO UniqueIds
        (
            [Guid]
        )
        VALUES
        (
            @Guid
        )
    ELSE
        THROW 60000, '[Guid] must be unique. Another unique identifier with the same signature exists.', 1;

    COMMIT

END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK;
    THROW;
END CATCH

Inserting a new uniqueidentifier (example):

DECLARE @id uniqueidentifier
SET @id=NEWID()
EXEC    [UniqueIds.Insert]
        @Guid=@id

Final word on implementation:

Every time you insert a new uniqueidentifier anywhere in your database simply wrap it in a rollback transaction and call the stored procedure for inserting a uniqueidentifer. That will call our validation stored procedure and if it fails it'll throw an error. Your rollback will make sure nothing else is persisted.

Example:

CREATE PROCEDURE [dbo].[Bases.Insert]
    @Guid uniqueidentifier,
    @AccountId bigint=0,
    @ModifierId bigint=0,
    @ScopeIdentity bigint OUTPUT
AS
**BEGIN TRY
    BEGIN TRAN**
        -- SET NOCOUNT ON added to prevent extra result sets from
        -- interfering with SELECT statements.
        SET NOCOUNT ON

        **EXEC  [dbo].[UniqueIds.Insert]
        @Guid = @Guid;**

        -- Insert the base row
        INSERT INTO Bases
        (
            [Guid],
            [State],
            Utc,
            AccountId,
            ModifierId
        )
        VALUES
        (
            @Guid,
            0,
            GETUTCDATE(),
            @AccountId,
            @ModifierId
        );

    **COMMIT;**

    SELECT @ScopeIdentity = CAST(SCOPE_IDENTITY() As bigint);

**END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK;
    THROW;
END CATCH**

**

Result: validating the uniqueness of a uniqueidentifier from 100 MILLION rows consistently takes < 0.05ms

**

Le-roy Staines
  • 2,037
  • 2
  • 22
  • 40
0

For me I changed the model property from my core project to ICollection<> and then I add migration and update database, and that's fixed my issue