Atomic Claim Pattern

The atomic claim pattern is Monque’s core mechanism for ensuring a pending job is claimed by only one scheduler instance at a time.

This prevents concurrent duplicates, but it does not guarantee that a job can never run more than once. Jobs may be processed again after a crash, a retry, or stale recovery, so workers should be idempotent.

In distributed systems, multiple workers might try to pick up the same job simultaneously. Without proper coordination, this leads to:

  • Duplicate processing: Same job runs multiple times
  • Race conditions: Workers step on each other’s work
  • Lost updates: Results get overwritten
  • Wasted resources: Redundant computation

Monque uses MongoDB’s findOneAndUpdate with atomic guarantees to ensure only one scheduler claims each pending job.

// Simplified internal implementation
const now = new Date();
const job = await collection.findOneAndUpdate(
  {
    // Only match jobs that are:
    name: workerName,            // For this worker type
    status: 'pending',           // Not already claimed
    nextRunAt: { $lte: now },    // Ready to run
    $or: [                       // Not owned by another instance
      { claimedBy: null },
      { claimedBy: { $exists: false } }
    ]
  },
  {
    $set: {
      status: 'processing',
      claimedBy: schedulerInstanceId,  // Mark ownership
      lockedAt: new Date(),
      lastHeartbeat: new Date(),
      heartbeatInterval: 30000,
      updatedAt: new Date()
    }
  },
  {
    returnDocument: 'after'      // Return the claimed job
  }
);

This works because:

  1. Atomic operation: MongoDB applies the filter match and the update to a single document as one unit
  2. First writer wins: Once one instance has updated the document, its status is no longer 'pending', so a competing claim attempt cannot match it
  3. Immediate visibility: Later claim attempts see the job as processing as soon as the write is applied
  4. No external locks: No need for Redis, ZooKeeper, or a distributed lock manager
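
To make the first-writer-wins behaviour concrete, here is a minimal in-memory sketch. It is not Monque's code: `claimNext` is a hypothetical helper that stands in for findOneAndUpdate by matching and modifying a document in one synchronous step, and the plain array stands in for the collection.

```javascript
// Hypothetical stand-in for findOneAndUpdate: the match and the update happen
// in one synchronous step, so no other caller can run between them.
function claimNext(jobs, workerName, instanceId, now) {
  const job = jobs.find(
    (j) =>
      j.name === workerName &&
      j.status === 'pending' &&
      j.nextRunAt <= now &&
      j.claimedBy == null // true for both null and a missing field
  );
  if (!job) return null; // nothing claimable, e.g. another instance won

  Object.assign(job, {
    status: 'processing',
    claimedBy: instanceId,
    lockedAt: now,
    lastHeartbeat: now,
    updatedAt: now,
  });
  return job;
}

const now = new Date();
const jobs = [
  { name: 'send-email', status: 'pending', nextRunAt: new Date(now.getTime() - 1000) },
];

// Two instances race for the same job; only the first attempt matches.
const first = claimNext(jobs, 'send-email', 'server-a', now);
const second = claimNext(jobs, 'send-email', 'server-b', now);

console.log(first.claimedBy); // server-a
console.log(second);          // null
```

The second attempt returns null because the job no longer satisfies the `status: 'pending'` condition, which is the same reason a losing findOneAndUpdate call finds nothing to update.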

Each Monque instance has a unique identifier:

const monque = new Monque(db, {
  schedulerInstanceId: 'worker-1'  // Optional: provide your own
});

// Or let Monque generate a UUID
const monque = new Monque(db); // Uses randomUUID()

The instance ID is recorded on the job document once the job is claimed:

const job = await monque.enqueue('task', { foo: 'bar' });
// After being claimed, the stored document has:
// claimedBy === 'worker-1' (or the UUID)

Run multiple scheduler instances for high availability:

// Instance 1 (server-a)
const monque1 = new Monque(db, {
  schedulerInstanceId: 'server-a'
});
monque1.register('send-email', emailHandler);
monque1.start();

// Instance 2 (server-b)  
const monque2 = new Monque(db, {
  schedulerInstanceId: 'server-b'
});
monque2.register('send-email', emailHandler);
monque2.start();

// Both compete fairly for jobs
// Each pending job is claimed by exactly one instance at a time

Jobs are distributed naturally based on claim timing:

Job Queue: [A, B, C, D, E, F, G, H]

Instance 1 claims: A, C, E, G
Instance 2 claims: B, D, F, H
(Actual distribution varies based on timing)
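
The split above can be reproduced with a small simulation. This is only a sketch: `simulateClaims` is a hypothetical helper, and it assumes the instances take strict turns, which real polling does not guarantee.

```javascript
// Illustrative simulation: instances take turns attempting a claim.
// In production the interleaving depends on poll timing and latency.
function simulateClaims(jobNames, instanceIds) {
  const queue = jobNames.map((name) => ({ name, claimedBy: null }));
  const claimed = Object.fromEntries(instanceIds.map((id) => [id, []]));

  let turn = 0;
  while (queue.some((job) => job.claimedBy === null)) {
    const instanceId = instanceIds[turn % instanceIds.length];
    // Find-and-mark in one step, standing in for findOneAndUpdate.
    const job = queue.find((j) => j.claimedBy === null);
    job.claimedBy = instanceId;
    claimed[instanceId].push(job.name);
    turn++;
  }
  return claimed;
}

const result = simulateClaims(
  ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
  ['instance-1', 'instance-2']
);

console.log(result['instance-1'].join(', ')); // A, C, E, G
console.log(result['instance-2'].join(', ')); // B, D, F, H
```

Whatever the interleaving, every job ends up with exactly one owner, because a job leaves the claimable set as soon as it is marked.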

Monque creates the required MongoDB indexes during initialize() to keep claim and polling queries fast.

For the full list of indexes, see Jobs.

// Compound index for claim queries
{ status: 1, nextRunAt: 1, claimedBy: 1 }

// Index for finding jobs by owner
{ claimedBy: 1, status: 1 }

These indexes ensure claim operations remain fast even with large queues.
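
As a rough illustration of which queries these key patterns serve, the sketch below builds two filters. Both helper names are hypothetical, and the filters are assumptions based on the claim query shown earlier, not a listing of Monque's internal queries.

```javascript
// Hypothetical helper: jobs currently held by one instance.
// Both conditions are equality matches on { claimedBy: 1, status: 1 }.
function ownedJobsFilter(instanceId) {
  return { claimedBy: instanceId, status: 'processing' };
}

// Hypothetical helper: the status and nextRunAt part of a claim query.
// Equality on status plus a range on nextRunAt follows the leading
// fields of { status: 1, nextRunAt: 1, claimedBy: 1 }.
function readyJobsFilter(now) {
  return { status: 'pending', nextRunAt: { $lte: now } };
}

console.log(ownedJobsFilter('server-a'));
console.log(readyJobsFilter(new Date()));

// With the MongoDB Node.js driver (needs a live collection, not run here):
//   const held = await collection.find(ownedJobsFilter('server-a')).toArray();
```

Running `explain()` on such a query against your own deployment is the reliable way to confirm which index is actually used.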

sequenceDiagram
    participant W1 as Worker 1
    participant DB as MongoDB
    participant W2 as Worker 2
    
    Note over DB: Job in pending state
    
    W1->>DB: findOneAndUpdate (claim attempt)
    W2->>DB: findOneAndUpdate (claim attempt)
    
    DB-->>W1: Job document (claimed!)
    DB-->>W2: null (already claimed)
    
    Note over W1: Processes job
    
    W1->>DB: Update status to completed
    Note over DB: Job complete, claimedBy cleared

If a worker crashes while processing:

  1. Job remains in processing status with claimedBy set
  2. lastHeartbeat stops updating
  3. After lockTimeout, the job can be recovered on startup (see Heartbeat)
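
The recovery step can be sketched as a staleness check followed by a reset. The rule used here (comparing lockedAt against a timeout) is an assumption for illustration, as are the helper names and the 30-minute value; see Heartbeat for how Monque actually decides.

```javascript
// Assumption: a job is stale when it is still 'processing' and its
// lockedAt timestamp is older than lockTimeoutMs.
function isStale(job, now, lockTimeoutMs) {
  return (
    job.status === 'processing' &&
    job.lockedAt instanceof Date &&
    now.getTime() - job.lockedAt.getTime() > lockTimeoutMs
  );
}

// Returns a copy of the job reset so that a claim query can match it again.
function releaseJob(job, now) {
  return { ...job, status: 'pending', claimedBy: null, lockedAt: null, updatedAt: now };
}

const now = new Date('2024-01-01T00:40:00Z');
const lockTimeoutMs = 30 * 60 * 1000; // illustrative value
const crashed = {
  name: 'send-email',
  status: 'processing',
  claimedBy: 'server-a',
  lockedAt: new Date('2024-01-01T00:00:00Z'), // locked 40 minutes ago
};

const recovered = isStale(crashed, now, lockTimeoutMs) ? releaseJob(crashed, now) : crashed;
console.log(recovered.status, recovered.claimedBy); // pending null
```

Because a recovered job runs again from the start, this is one of the paths on which a handler can see the same job twice, which is why handlers should be idempotent.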

When stop() is called:

await monque.stop();
// 1. Stops accepting new jobs
// 2. Waits for in-progress jobs to complete
// 3. If shutdown times out, in-progress jobs may remain in `processing`
//    and will be recovered later by stale recovery on startup

Use a meaningful instance ID so claims are easy to trace:

import { hostname } from 'node:os';

// ✅ Good - Identifiable in logs and debugging
const monque = new Monque(db, {
  schedulerInstanceId: `${hostname()}-${process.pid}`
});

// ❌ Less useful - Random UUID (default)
const monque = new Monque(db);
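
Track how many jobs this instance is currently processing:

let claimCount = 0;

monque.on('job:start', () => {
  claimCount++;
  metrics.gauge('monque.active_claims', claimCount);
});

monque.on('job:complete', () => {
  claimCount--;
  metrics.gauge('monque.active_claims', claimCount);
});
// If job:complete fires only on success, decrement in your failure
// handler as well, otherwise the gauge drifts upward when jobs fail.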

Treat claim contention as a normal signal rather than a failure:

monque.on('job:error', ({ error }) => {
  if (error.message.includes('claim')) {
    // Claim contention - normal in multi-instance
    metrics.increment('monque.claim_contention');
  }
});

Comparison with other coordination patterns:

Pattern               | Pros                                         | Cons
Atomic Claim (Monque) | No external dependencies, strong consistency | Requires MongoDB
Redis Locks           | Fast, widely used                            | Additional infrastructure, lock expiry issues
Pessimistic Locking   | Simple concept                               | Blocks other workers, deadlock risk
Optimistic Locking    | No blocking                                  | Retry storms under contention