i have table that processed concurrently by N threads.
CREATE TABLE [dbo].[Jobs]
(
[Id] BIGINT NOT NULL CONSTRAINT [PK_Jobs] PRIMARY KEY IDENTITY,
[Data] VARBINARY(MAX) NOT NULL,
[CreationTimestamp] DATETIME2(7) NOT NULL,
[Type] INT NOT NULL,
[ModificationTimestamp] DATETIME2(7) NOT NULL,
[State] INT NOT NULL,
[RowVersion] ROWVERSION NOT NULL,
[Activity] INT NULL,
[Parent_Id] BIGINT NULL
)
GO
CREATE NONCLUSTERED INDEX [IX_Jobs_Type_State_RowVersion] ON [dbo].[Jobs]([Type], [State], [RowVersion] ASC) WHERE ([State] <> 100)
GO
CREATE NONCLUSTERED INDEX [IX_Jobs_Parent_Id_State] ON [dbo].[Jobs]([Parent_Id], [State] ASC)
GO
Job is adding to table with State=0 (New)
— it can be consumed by any worker in this state. When worker gets this queue item, State
changed to 50 (Processing)
and job becomes unavailable for other consumers (workers call [dbo].[Jobs_GetFirstByType]
with arguments: Type=any, @CurrentState=0, @NewState=50
).
CREATE PROCEDURE [dbo].[Jobs_GetFirstByType]
@Type INT,
@CurrentState INT,
@NewState INT
AS
BEGIN
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
DECLARE @JobId BIGINT;
BEGIN TRAN
SELECT TOP(1)
@JobId = Id
FROM [dbo].[Jobs] WITH (UPDLOCK, READPAST)
WHERE [Type] = @Type AND [State] = @CurrentState
ORDER BY [RowVersion];
UPDATE [dbo].[Jobs]
SET [State] = @NewState,
[ModificationTimestamp] = SYSUTCDATETIME()
OUTPUT INSERTED.[Id]
,INSERTED.[RowVersion]
,INSERTED.[Data]
,INSERTED.[Type]
,INSERTED.[State]
,INSERTED.[Activity]
WHERE [Id] = @JobId;
COMMIT TRAN
END
After processing, job State
can be changed to 0 (New)
again or it can be once set to 100 (Completed)
.
CREATE PROCEDURE [dbo].[Jobs_UpdateStatus]
@Id BIGINT,
@State INT,
@Activity INT
AS
BEGIN
UPDATE j
SET j.[State] = @State,
j.[Activity] = @Activity,
j.[ModificationTimestamp] = SYSUTCDATETIME()
OUTPUT INSERTED.[Id], INSERTED.[RowVersion]
FROM [dbo].[Jobs] j
WHERE j.[Id] = @Id;
END
Jobs has hierarchical structure, parent job gets State=100 (Completed)
only when all childs are completed.
Some workers call stored procedures ([dbo].[Jobs_GetCountWithExcludedState]
with @ExcludedState=100
) that returns number of incompleted jobs, when it returns 0, parent job State
can be set to 100 (Completed)
.
CREATE PROCEDURE [dbo].[Jobs_GetCountWithExcludedState]
@ParentId INT,
@ExcludedState INT
AS
BEGIN
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT COUNT(1)
FROM [dbo].[Jobs]
WHERE [Parent_Id] = @ParentId
AND [State] <> @ExcludedState
END
The main problem is strange behaviour of this stored procedure. Sometimes it returns 0 for parent job, but it exactly has incompleted jobs. I tryied turn on change data tracking and some debug information (including profiling) — child jobs 100% doesn't have State=100
when SP return 0.
It seems that the SP skips records, that are not in 100 (Completed)
state, but why it happen and how we can prevent this?
UPD:
Calling [dbo].[Jobs_GetCountWithExcludedState]
starts when parent job has childs. There сan be no situation when worker starts checking child jobs without their existence, because creating childs and setting to parent job checking activity wrapped in transaction:
using (var ts = new TransactionScope())
{
_jobManager.AddChilds(parentJob);
parentJob.State = 0;
parentJob.Activity = 30; // in this activity worker starts checking child jobs
ts.Complete();
}