ActorSystem owns one dispatcher. When an actor receives a message, it is scheduled onto a shared ready queue; a worker pulls it off, drains up to a configurable throughput budget, and yields back so other actors get a turn.
This is the Akka / Pekko / Erlang / Orleans pattern adapted for Go: the worker count is bounded by GOMAXPROCS, independent of the actor count. Actors become units of work on a scheduler rather than units that each own a goroutine, making runtime.NumGoroutine() stable at workerCount + O(1) no matter how many actors are active.
The public API is untouched: Tell, Ask, Spawn, Actor.Receive, PreStart, PostStop, and the mailboxes are all unchanged. The dispatcher lives below PID.doReceive and is an implementation detail of the actor package.
## Components
| File | Responsibility |
|---|---|
| actor/dispatcher.go | Worker pool, lifecycle (start / stop / signalStop), schedule entry point. |
| actor/worker.go | Worker goroutine loop, local-queue reschedule helper. |
| actor/ready_queue.go | Per-worker ring buffers, global overflow ring, work stealing, condvar parking. |
| actor/dispatch_state.go | Three-state atomic machine (Idle / Scheduled / Processing) per actor. |
| actor/pid.go | schedulable.runTurn implementation, mailbox split, doReceive enqueue path. |
## The schedulable contract
The worker is agnostic to the actor state machine; ready_queue.go defines the schedulable contract, a single runTurn method that a worker invokes for one budgeted turn.
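A hedged sketch of that contract with a toy implementation; the exact signature in ready_queue.go is an assumption:

```go
package main

import "fmt"

// Hypothetical sketch of the schedulable contract described in this
// section; the real signature in ready_queue.go may differ.
type schedulable interface {
	// runTurn drains up to budget messages and reports whether the
	// actor still has pending work and should be rescheduled.
	runTurn(budget int) (reschedule bool)
}

// toyActor stands in for a PID: a counter pretending to be an actor
// with pending messages.
type toyActor struct{ pending int }

func (a *toyActor) runTurn(budget int) bool {
	n := a.pending
	if n > budget {
		n = budget
	}
	a.pending -= n
	return a.pending > 0
}

func main() {
	var s schedulable = &toyActor{pending: 50}
	fmt.Println(s.runTurn(32)) // true: 18 messages remain
	fmt.Println(s.runTurn(32)) // false: drained
}
```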
PID and grainPID both implement runTurn. This keeps the dispatcher a reusable scheduling primitive; any future schedulable (e.g. timers, stream stages) can plug in without touching the worker loop.
## Actor state machine
Each PID carries a lock-free schedState with three values: Idle, Scheduled, and Processing.
Invariant: at most one worker holds an actor in Processing. Enforced by the Scheduled → Processing CAS at the top of runTurn; this preserves the single-threaded-per-actor execution guarantee that user code relies on.
TrySchedule reads the state before attempting the CAS. Under N-way parallel Tell to the same actor, the hot cache line is read in shared mode by N-1 losers; an unconditional CAS would force request-for-ownership traffic on every call.
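The two transitions above can be sketched as follows; constant names follow this document, but the code layout is an assumption:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Sketch of the three-state machine from dispatch_state.go.
const (
	stateIdle int32 = iota
	stateScheduled
	stateProcessing
)

type dispatchState struct{ v atomic.Int32 }

// TrySchedule wins only for the first producer on an idle actor.
// The plain load first keeps the N-1 losing producers in shared-mode
// cache reads instead of forcing a CAS (and RFO traffic) per call.
func (s *dispatchState) TrySchedule() bool {
	if s.v.Load() != stateIdle {
		return false
	}
	return s.v.CompareAndSwap(stateIdle, stateScheduled)
}

// TakeForProcessing is the Scheduled -> Processing CAS at the top of
// runTurn; it enforces single-threaded execution per actor.
func (s *dispatchState) TakeForProcessing() bool {
	return s.v.CompareAndSwap(stateScheduled, stateProcessing)
}

func main() {
	var s dispatchState
	fmt.Println(s.TrySchedule())       // true: first producer wins
	fmt.Println(s.TrySchedule())       // false: second producer loses
	fmt.Println(s.TakeForProcessing()) // true: worker claims the actor
}
```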
## Enqueue path
PID.doReceive is the unified enqueue entry point:
isControlMessage routes PoisonPill, Panicking, Pause/ResumePassivation, PanicSignal, Terminated, and SendDeadletter to the system mailbox. All other messages go to the user mailbox. AsyncRequest / AsyncResponse are not control messages: they participate in the reentrancy stash and must keep FIFO order with user traffic.
The hot path is one atomic read, one CAS, one mailbox op, and, for the first producer that wins the Idle → Scheduled transition, one global-queue push with condvar signal.
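A minimal sketch of that enqueue path; the type and field names are assumptions, and only the shape (route by message kind, push, schedule-on-win) follows this document:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type poisonPill struct{}

// isControlMessage: in the real code this covers PoisonPill,
// Panicking, Terminated, and the other control messages listed above.
func isControlMessage(msg any) bool {
	switch msg.(type) {
	case poisonPill:
		return true
	}
	return false
}

type pid struct {
	mu            sync.Mutex
	systemMailbox []any
	mailbox       []any
	scheduled     atomic.Bool // stand-in for the Idle/Scheduled CAS
	schedules     int         // stand-in for global-queue pushes
}

func (p *pid) doReceive(msg any) {
	p.mu.Lock()
	if isControlMessage(msg) {
		p.systemMailbox = append(p.systemMailbox, msg)
	} else {
		p.mailbox = append(p.mailbox, msg)
	}
	p.mu.Unlock()
	// Only the first producer to win Idle -> Scheduled pays for the
	// global-queue push and condvar signal (counted here).
	if p.scheduled.CompareAndSwap(false, true) {
		p.schedules++
	}
}

func main() {
	p := &pid{}
	p.doReceive("hello")
	p.doReceive("world")
	p.doReceive(poisonPill{})
	fmt.Println(len(p.mailbox), len(p.systemMailbox), p.schedules) // 2 1 1
}
```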
## Worker turn
The worker drains system messages first, then user messages, up to the throughput budget.

`dispatchOne` returns a `retained` flag. The reentrancy-stash path holds on to the `ReceiveContext` beyond the turn, so the caller must not return it to the pool in that case.

`finishOrReclaim` closes the enqueue/finish race. After a drained dequeue, it stores `Idle`, re-reads both mailboxes, and, if a concurrent `doReceive` slipped a message in between the last dequeue and the state store, attempts `TrySchedule` followed by `TakeForProcessing` to reclaim ownership within the current turn.
If the budget expires with messages still queued, the actor is set back to Scheduled and is re-pushed onto the owning worker's local ring via worker.reschedule. Yielded actors land at the tail of the local ring, behind any other scheduled actor, guaranteeing forward progress for peers.
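The finishOrReclaim ordering can be sketched as below; the names and layout are assumptions based on this section's description:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

const (
	idle int32 = iota
	scheduled
	processing
)

type actorSlot struct {
	state   atomic.Int32
	pending atomic.Int32 // messages across both mailboxes
}

// finishOrReclaim: store Idle, re-read the mailboxes, and if a
// concurrent doReceive raced a message in after the last dequeue,
// reclaim ownership within the current turn.
func (a *actorSlot) finishOrReclaim() bool {
	a.state.Store(idle)
	if a.pending.Load() == 0 {
		return false // truly drained; a later producer reschedules
	}
	return a.state.CompareAndSwap(idle, scheduled) &&
		a.state.CompareAndSwap(scheduled, processing)
}

func main() {
	a := &actorSlot{}
	a.state.Store(processing)
	fmt.Println(a.finishOrReclaim()) // false: nothing raced in

	a.state.Store(processing)
	a.pending.Store(1) // a producer slipped a message in
	fmt.Println(a.finishOrReclaim()) // true: ownership reclaimed
}
```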
## Ready queue
readyQueue combines per-worker local rings with a shared global ring and condvar parking. The take priority is: own local → global → steal from siblings → park.
| Layer | Purpose |
|---|---|
| Local rings | Each worker owns a mutex-guarded 256-slot ring. Push/pop FIFO from the owner's side. |
| Global ring | Shared amortised-FIFO ring (initial cap 64, doubles on overflow). Overflow + producer pushes. |
| Work stealing | When local and global are empty, rotate through siblings and steal half of the first non-empty victim. |
| Parking | When no work is found anywhere, park on readyQueue.cond. Producers signal only when parked > 0. |
globalCount atomic mirrors global.size so workers can skip the mutex on the fast path when the global ring is known empty. Sibling-steal probes use the victim's own atomic size counter to avoid acquiring every sibling's mutex during a scan.
stealHalf moves half the victim's contents into the thief's ring under pointer-ordered locking (lockOrder) to avoid deadlock when two workers steal from each other simultaneously.
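The lockOrder idea can be sketched as follows: acquire the two ring mutexes in a global (pointer) order so two workers stealing from each other cannot deadlock. The type names here are assumptions:

```go
package main

import (
	"fmt"
	"sync"
	"unsafe"
)

type ring struct {
	mu    sync.Mutex
	items []int
}

// lockOrder returns the two rings in a deterministic global order,
// using their addresses as the total order.
func lockOrder(a, b *ring) (first, second *ring) {
	if uintptr(unsafe.Pointer(a)) < uintptr(unsafe.Pointer(b)) {
		return a, b
	}
	return b, a
}

// stealHalf moves half of victim's items into thief's ring and
// reports how many were stolen.
func stealHalf(thief, victim *ring) int {
	first, second := lockOrder(thief, victim)
	first.mu.Lock()
	defer first.mu.Unlock()
	second.mu.Lock()
	defer second.mu.Unlock()

	n := len(victim.items) / 2
	thief.items = append(thief.items, victim.items[:n]...)
	victim.items = victim.items[n:]
	return n
}

func main() {
	thief := &ring{}
	victim := &ring{items: []int{1, 2, 3, 4, 5, 6}}
	fmt.Println(stealHalf(thief, victim))           // 3
	fmt.Println(len(thief.items), len(victim.items)) // 3 3
}
```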
## System-message priority
Control-plane messages cannot queue behind a user-message backlog. Every actor has two mailboxes:

- `systemMailbox`: unbounded, holds control messages. Always consulted first inside the budget loop.
- `mailbox`: the user mailbox, unchanged from the public `Mailbox` contract.
PoisonPill delivered to an actor with 10,000 user messages queued shuts the actor down after at most one additional user message, not 10,000.
User-provided custom mailboxes continue to work: they sit in the mailbox slot and are not aware of the system mailbox.
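The budget loop implied above can be sketched as a simple priority drain; the slice-based queue layout is an assumption for illustration:

```go
package main

import "fmt"

// runTurn drains up to budget messages, consulting the system
// mailbox before the user mailbox at every step. This is why a
// PoisonPill behind a deep user backlog is still seen almost
// immediately rather than after the whole backlog.
func runTurn(system, user []string, budget int) (processed []string) {
	for i := 0; i < budget; i++ {
		switch {
		case len(system) > 0:
			processed = append(processed, system[0])
			system = system[1:]
		case len(user) > 0:
			processed = append(processed, user[0])
			user = user[1:]
		default:
			return processed // both mailboxes drained early
		}
	}
	return processed
}

func main() {
	system := []string{"PoisonPill"}
	user := []string{"u1", "u2", "u3"}
	fmt.Println(runTurn(system, user, 32)) // [PoisonPill u1 u2 u3]
}
```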
## Lifecycle
| Phase | Behaviour |
|---|---|
| Start | dispatcher.start() spawns worker goroutines. Idempotent. Must run before the first schedule. |
| Shutdown | dispatcher.stop() closes the ready queue and blocks on the worker WaitGroup. Not safe from within a worker turn. |
| Signal stop | signalStop() closes the queue and wakes all parked workers without blocking. Safe from inside a receive handler. |
| Restart | PID restart spins on schedState != Processing before reinitialising, since the MPSC mailbox is single-consumer. |
## Observable guarantees
The dispatcher pool preserves every semantic that user code relies on:

- Single-threaded actor execution. The Scheduled → Processing CAS guarantees at most one worker holds an actor at a time.
- FIFO within a producer. Messages m1, m2, m3 sent by the same producer to the same actor are dequeued in order.
- PreStart/PostStop contract. Run exactly once per lifecycle, inside a turn on whichever worker happens to own the actor.
- Reentrancy stash. Unchanged; reentrance happens within a single worker's call stack for one actor.
- Panic recovery. defer recover() inside dispatchOne, same as before.
- Supervision. Supervisor directives run on the parent's turn after the child's Panicking message reaches the parent's system mailbox.

The one observable change is runtime.NumGoroutine(): it now returns workerCount + O(1) instead of scaling with the active-actor count.
## Tuning constants
| Constant | Value | Where |
|---|---|---|
| Worker pool size | max(GOMAXPROCS, 2) | dispatcherWorkerCount in actor/dispatcher.go |
| Throughput budget | 32 (default) | dispatcherThroughput in actor/dispatcher.go |
| Local-ring capacity | 256 | localQueueCap in actor/ready_queue.go |
| Global-ring initial cap | 64 (doubles on grow) | globalQueueInitialCap in actor/ready_queue.go |
## Tuning the throughput budget
The per-turn message budget is the one knob exposed to users, via WithThroughputBudget on NewActorSystem:
| Value | Profile |
|---|---|
| 8–16 | Latency-critical systems with many small actors (control planes, request routing). |
| 32 (default) | Balanced setting suitable for most mixed workloads. |
| 64 | Safe upgrade for throughput-oriented systems. Typically ~5–15% gain on message-heavy workloads. |
| 128 | Ingest and aggregation (log pipelines, event firehoses, batch processors). |
| 256+ | Synthetic benchmarks or pathological single-hot-actor workloads only. Throughput flattens and tail latency grows. |
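A hedged sketch of how such an option typically plugs into a functional-options constructor; the real NewActorSystem signature and option internals may differ, only the WithThroughputBudget name and the default of 32 come from this document:

```go
package main

import "fmt"

// systemConfig and Option are illustrative stand-ins for the actor
// package's internals.
type systemConfig struct{ throughput int }

type Option func(*systemConfig)

// WithThroughputBudget overrides the per-turn message budget.
func WithThroughputBudget(n int) Option {
	return func(c *systemConfig) { c.throughput = n }
}

func newConfig(opts ...Option) systemConfig {
	c := systemConfig{throughput: 32} // documented default
	for _, o := range opts {
		o(&c)
	}
	return c
}

func main() {
	fmt.Println(newConfig().throughput)                         // 32
	fmt.Println(newConfig(WithThroughputBudget(64)).throughput) // 64
}
```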
## References
- Akka Dispatchers
- Pekko Dispatchers
- Erlang/OTP scheduler
- Microsoft Orleans schedulers
- Tokio work-stealing scheduler
## See also
- Architecture Overview: bird's-eye view of the system
- Code Map: package layout and file-level responsibilities
- Design Decisions: rationale for architectural choices
- Full design document: architecture/DISPATCHER_POOL_DESIGN.md