I would like to implement an Ordered Job Queue in Sql Server 2012. Here is the context:
- Many agents concurrently taking N jobs from that queue, where N may be different for every agent (depends on the load of the particular agent).
- The jobs should be taken in order. Suppose it is the primary key order (in reality it is slightly different). So, agents should prefer older jobs.
- Some jobs have restricted concurrency. For example, if jobs A and B belong to the same concurrency group, then it is OK to queue both of them, but it is forbidden to run both of them at the same time. So, one of them runs first (according to the previously defined order) and only then can the second job be run.
Please, have a look at the following SQL Fiddle - http://ift.tt/1QuJ5jp
Ideally, I would use a single UPDATE TOP (N) statement with the READPAST hint and the OUTPUT clause. However, UPDATE does not promise any order, hence I utilize a different approach:
- An ordered
SELECT TOP (N) into a temp table with the UPDLOCK and READPAST hints. In effect, I reserve these records.
- A regular
UPDATE with the OUTPUT clause.
But then, such a bulk update may fail, because a job in the bulk is concurrency restricted. In this case I fall back to iterating over the reserved records and try to mark them as running one by one, silently skipping those that fail the concurrency restriction.
Here is the complete query from the SQL Fiddle:
BEGIN TRAN
exec sp_executesql N'
DECLARE @Running TABLE (WorkItemId BIGINT)
SELECT TOP (@Count) WorkItemId INTO #Candidates
FROM BackgroundJobWork WITH (UPDLOCK, READPAST)
WHERE Status = 0
ORDER BY WorkItemId
BEGIN TRY
PRINT '' *** BATCH *** ''
UPDATE BackgroundJobWork SET Status = 3
OUTPUT inserted.WorkItemId INTO @Running
FROM BackgroundJobWork
WHERE WorkItemId IN (SELECT WorkItemId FROM #Candidates)
END TRY
BEGIN CATCH
PRINT '' *** ONE BY ONE *** ''
DECLARE @WorkItemId BIGINT
DECLARE c CURSOR FAST_FORWARD FOR
SELECT WorkItemId FROM #Candidates ORDER BY WorkItemId
OPEN c
FETCH NEXT FROM c INTO @WorkItemId
WHILE @@FETCH_STATUS = 0
BEGIN
BEGIN TRY
UPDATE BackgroundJobWork SET Status = 3
OUTPUT inserted.WorkItemId INTO @Running
FROM BackgroundJobWork
WHERE WorkItemId = @WorkItemId
END TRY
BEGIN CATCH
END CATCH
FETCH NEXT FROM c INTO @WorkItemId
END
CLOSE c
DEALLOCATE c
END CATCH
SELECT * FROM @Running
',N'@Count int',@Count=6
ROLLBACK
(Rolling back for the testing purposes, Status 0 means Received, Status 3 means Running)
So, there are two cases:
- No concurrency restriction - bulk updates do not fail
- There is a concurrency restriction - falling back to updating one by one with a cursor
My goal is to test that two agents running this query at the same time will not interfere with each other, i.e. none will be locked out waiting for the other to finish.
I simulate the presence of concurrency restrictions by running the following query first:
UPDATE BackgroundJobWork
SET ConcurrencyGroupName = CONVERT(NVARCHAR(2), CASE
WHEN WorkItemId % 2 = 0 THEN NULL
ELSE WorkItemId % 4
END) WHERE Status < 100
This yields the following result:
SELECT WorkItemId,Status,ConcurrencyGroupName FROM BackgroundJobWork
WHERE Status < 100 ORDER BY WorkItemId
WorkItemId Status ConcurrencyGroupName
1 0 1
2 0 NULL
3 0 3
4 0 NULL
5 0 1
6 0 NULL
7 0 3
8 0 NULL
9 0 1
10 0 NULL
11 0 3
12 0 NULL
As you can see:
- Jobs 1,5,9 belong to the concurrency restriction 1
- Jobs 3,7,11 belong to the concurrency restriction 3
- Jobs 2,4,6,8,10,12 are not concurrency restricted
UPDATE BackgroundJobWork SET ConcurrencyGroupName = NULL WHERE Status < 100 removes all the concurrency restrictions.
Unfortunately, I do not know how to demo two agents in the SQL Fiddle. Here is how I do it in my SSMS:
- Two SQL script windows, each one containing the query.
- In the first window comment out the
ROLLBACK statement.
- Run the SQL from the first window. Note, that the transaction is still open, i.e. all the locks are still in place.
- Now run the SQL from the second window.
- At the end rollback the transaction from the first window by executing the
ROLLBACK statement.
Bulk updates work great - the second window is not locked out by the open transaction started in the first window. I can see the jobs 1,2,3,4,5,6 in the first window and 7,8,9,10,11,12 - in the second.
However, when I simulate the concurrency restrictions (using the aforementioned query) the second window is locked and awaits for the first one to release the locks.
I am greatly puzzled by it. After all, each window updates only the records it has previously reserved with the respective SELECT statements! These sets are disjoint - UPDLOCK and READPAST guarantee it.
Appending - the locks held by the query
I am checking what locks are held (in yet another SSMS window) using the following query:
DECLARE @locks TABLE (spid INT,
dbid INT,
ObjId BIGINT,
IndId INT,
Type NVARCHAR(10),
Resource NVARCHAR(128),
Mode NVARCHAR(10),
Status NVARCHAR(32))
INSERT INTO @locks EXECUTE sp_lock
SELECT spid, OBJECT_NAME(ObjId) ObjectName, i.name, l.Type, Mode, COUNT(1) Count
FROM @locks l
LEFT JOIN sys.indexes i ON i.index_id = l.IndId AND i.object_id = l.ObjId
WHERE Mode NOT IN ('S','IS') AND dbid = DB_ID('747_DFControl2')
GROUP BY spid,OBJECT_NAME(ObjId),l.Type,i.name,Mode
ORDER BY spid,OBJECT_NAME(ObjId),l.Type,i.name,Mode
(747_DFControl2 is the name of my database)
If I run it when there are no concurrency restrictions (i.e. bulk updates are successful) I get the following output:
spid ObjectName name Type Mode Count
60 BackgroundJobWork IX_Status KEY X 12
60 BackgroundJobWork PK_BackgroundJobWork KEY X 6
60 BackgroundJobWork IX_Status PAG IX 1
60 BackgroundJobWork PK_BackgroundJobWork PAG IX 1
60 BackgroundJobWork NULL TAB IX 1
Where spid 60 corresponds to the first window (the one with the open transaction). We do not see the second window - it was successfully rolled back.
And here is the result when the concurrency restrictions are enabled and the second window (spid 63) is waiting for the release of the locks:
spid ObjectName name Type Mode Count
60 BackgroundJobWork IX_ConcurrencyRestriction KEY X 2
60 BackgroundJobWork IX_Status KEY X 12
60 BackgroundJobWork PK_BackgroundJobWork KEY X 6
60 BackgroundJobWork IX_ConcurrencyRestriction PAG IX 1
60 BackgroundJobWork IX_Status PAG IX 1
60 BackgroundJobWork PK_BackgroundJobWork PAG IX 1
60 BackgroundJobWork NULL TAB IX 1
63 BackgroundJobWork IX_ConcurrencyRestriction KEY X 1
63 BackgroundJobWork IX_Status KEY X 12
63 BackgroundJobWork PK_BackgroundJobWork KEY X 6
63 BackgroundJobWork IX_ConcurrencyRestriction PAG IX 1
63 BackgroundJobWork IX_Status PAG IX 1
63 BackgroundJobWork PK_BackgroundJobWork PAG IX 1
63 BackgroundJobWork NULL TAB IX 1
That does not tell me much.
Can someone explain to me why the second instance of the query is locked out?
EDIT
From the question it is unclear why would the bulk update fail when I turn on the concurrency restriction. It is clear, though, from the SQL Fiddle - there is a conditional unique index on the BackgroundJobWork table:
CREATE UNIQUE NONCLUSTERED INDEX IX_ConcurrencyRestriction ON BackgroundJobWork (ConcurrencyGroupName)
WHERE (Status=3 AND ConcurrencyGroupName IS NOT NULL)