samedi 25 avril 2015

Trying to implement an Ordered Job Queue in Sql Server 2012

I would like to implement an Ordered Job Queue in Sql Server 2012. Here is the context:

  1. Many agents concurrently taking N jobs from that queue, where N may be different for every agent (depends on the load of the particular agent).
  2. The jobs should be taken in order. Suppose it is the primary key order (in reality it is slightly different). So, agents should prefer older jobs.
  3. Some jobs have restricted concurrency. For example, if jobs A and B belong to the same concurrency group, then it is OK to queue both of them, but it is forbidden to run both of them at the same time. So, one of them runs first (according to the previously defined order) and only then can the second job be run.

Please, have a look at the following SQL Fiddle - http://ift.tt/1QuJ5jp

Ideally, I would use a single UPDATE TOP (N) statement with the READPAST hint and the OUTPUT clause. However, UPDATE does not promise any order, hence I utilize a different approach:

  1. An ordered SELECT TOP (N) into a temp table with the UPDLOCK and READPAST hints. In effect, I reserve these records.
  2. A regular UPDATE with the OUTPUT clause.

But then, such a bulk update may fail, because a job in the bulk is concurrency restricted. In this case I fall back to iterating over the reserved records and try to mark them as running one by one, silently skipping those that fail the concurrency restriction.

Here is the complete query from the SQL Fiddle:

BEGIN TRAN

exec sp_executesql N'
DECLARE @Running TABLE (WorkItemId BIGINT)

SELECT TOP (@Count) WorkItemId INTO #Candidates 
FROM BackgroundJobWork WITH (UPDLOCK, READPAST) 
WHERE Status = 0 
ORDER BY WorkItemId

BEGIN TRY
    PRINT '' *** BATCH *** ''
    UPDATE BackgroundJobWork SET Status = 3
    OUTPUT inserted.WorkItemId INTO @Running
    FROM BackgroundJobWork 
    WHERE WorkItemId IN (SELECT WorkItemId FROM #Candidates)
END TRY
BEGIN CATCH
    PRINT '' *** ONE BY ONE *** ''
    DECLARE @WorkItemId BIGINT
    DECLARE c CURSOR FAST_FORWARD FOR 
    SELECT WorkItemId FROM #Candidates ORDER BY WorkItemId

    OPEN c
    FETCH NEXT FROM c INTO @WorkItemId

    WHILE @@FETCH_STATUS = 0
    BEGIN
        BEGIN TRY
            UPDATE BackgroundJobWork SET Status = 3
            OUTPUT inserted.WorkItemId INTO @Running
            FROM BackgroundJobWork 
            WHERE WorkItemId = @WorkItemId
        END TRY
        BEGIN CATCH
        END CATCH

        FETCH NEXT FROM c INTO @WorkItemId
    END

    CLOSE c
    DEALLOCATE c
END CATCH

SELECT * FROM @Running
',N'@Count int',@Count=6

ROLLBACK

(Rolling back for the testing purposes, Status 0 means Received, Status 3 means Running)

So, there are two cases:

  1. No concurrency restriction - bulk updates do not fail
  2. There is a concurrency restriction - falling back to updating one by one with a cursor

My goal is to test that two agents running this query at the same time will not interfere with each other, i.e. none will be locked out waiting for the other to finish.

I simulate the presence of concurrency restrictions by running the following query first:

UPDATE BackgroundJobWork 
SET ConcurrencyGroupName = CONVERT(NVARCHAR(2), CASE 
  WHEN WorkItemId % 2 = 0 THEN NULL 
  ELSE WorkItemId % 4 
END) WHERE Status < 100

This yields the following result:

SELECT WorkItemId,Status,ConcurrencyGroupName FROM BackgroundJobWork 
WHERE Status < 100 ORDER BY WorkItemId

WorkItemId  Status  ConcurrencyGroupName
1           0       1
2           0       NULL
3           0       3
4           0       NULL
5           0       1
6           0       NULL
7           0       3
8           0       NULL
9           0       1
10          0       NULL
11          0       3
12          0       NULL

As you can see:

  • Jobs 1,5,9 belong to the concurrency restriction 1
  • Jobs 3,7,11 belong to the concurrency restriction 3
  • Jobs 2,4,6,8,10,12 are not concurrency restricted

UPDATE BackgroundJobWork SET ConcurrencyGroupName = NULL WHERE Status < 100 removes all the concurrency restrictions.

Unfortunately, I do not know how to demo two agents in the SQL Fiddle. Here is how I do it in my SSMS:

  1. Two SQL script windows, each one containing the query.
  2. In the first window comment out the ROLLBACK statement.
  3. Run the SQL from the first window. Note, that the transaction is still open, i.e. all the locks are still in place.
  4. Now run the SQL from the second window.
  5. At the end rollback the transaction from the first window by executing the ROLLBACK statement.

Bulk updates work great - the second window is not locked out by the open transaction started in the first window. I can see the jobs 1,2,3,4,5,6 in the first window and 7,8,9,10,11,12 - in the second.

However, when I simulate the concurrency restrictions (using the aforementioned query) the second window is locked and awaits for the first one to release the locks.

I am greatly puzzled by it. After all, each window updates only the records it has previously reserved with the respective SELECT statements! These sets are disjoint - UPDLOCK and READPAST guarantee it.

Appending - the locks held by the query

I am checking what locks are held (in yet another SSMS window) using the following query:

DECLARE @locks TABLE (spid INT,
                      dbid INT,
                      ObjId BIGINT,
                      IndId INT,
                      Type NVARCHAR(10),
                      Resource NVARCHAR(128),
                      Mode NVARCHAR(10),
                      Status NVARCHAR(32))
INSERT INTO @locks EXECUTE sp_lock

SELECT spid, OBJECT_NAME(ObjId) ObjectName, i.name, l.Type, Mode, COUNT(1) Count
FROM @locks l
LEFT JOIN sys.indexes i ON i.index_id = l.IndId AND i.object_id = l.ObjId
WHERE Mode NOT IN ('S','IS') AND dbid = DB_ID('747_DFControl2')
GROUP BY spid,OBJECT_NAME(ObjId),l.Type,i.name,Mode
ORDER BY spid,OBJECT_NAME(ObjId),l.Type,i.name,Mode

(747_DFControl2 is the name of my database)

If I run it when there are no concurrency restrictions (i.e. bulk updates are successful) I get the following output:

spid    ObjectName          name                    Type    Mode    Count
60      BackgroundJobWork   IX_Status               KEY     X       12
60      BackgroundJobWork   PK_BackgroundJobWork    KEY     X       6
60      BackgroundJobWork   IX_Status               PAG     IX      1
60      BackgroundJobWork   PK_BackgroundJobWork    PAG     IX      1
60      BackgroundJobWork   NULL                    TAB     IX      1

Where spid 60 corresponds to the first window (the one with the open transaction). We do not see the second window - it was successfully rolled back.

And here is the result when the concurrency restrictions are enabled and the second window (spid 63) is waiting for the release of the locks:

spid    ObjectName          name                        Type    Mode    Count
60      BackgroundJobWork   IX_ConcurrencyRestriction   KEY     X       2
60      BackgroundJobWork   IX_Status                   KEY     X       12
60      BackgroundJobWork   PK_BackgroundJobWork        KEY     X       6
60      BackgroundJobWork   IX_ConcurrencyRestriction   PAG     IX      1
60      BackgroundJobWork   IX_Status                   PAG     IX      1
60      BackgroundJobWork   PK_BackgroundJobWork        PAG     IX      1
60      BackgroundJobWork   NULL                        TAB     IX      1
63      BackgroundJobWork   IX_ConcurrencyRestriction   KEY     X       1
63      BackgroundJobWork   IX_Status                   KEY     X       12
63      BackgroundJobWork   PK_BackgroundJobWork        KEY     X       6
63      BackgroundJobWork   IX_ConcurrencyRestriction   PAG     IX      1
63      BackgroundJobWork   IX_Status                   PAG     IX      1
63      BackgroundJobWork   PK_BackgroundJobWork        PAG     IX      1
63      BackgroundJobWork   NULL                        TAB     IX      1

That does not tell me much.

Can someone explain to me why the second instance of the query is locked out?

EDIT

From the question it is unclear why would the bulk update fail when I turn on the concurrency restriction. It is clear, though, from the SQL Fiddle - there is a conditional unique index on the BackgroundJobWork table:

CREATE UNIQUE NONCLUSTERED INDEX IX_ConcurrencyRestriction ON BackgroundJobWork (ConcurrencyGroupName) 
WHERE (Status=3 AND ConcurrencyGroupName IS NOT NULL)

Aucun commentaire:

Enregistrer un commentaire