Note: Relocation requires a graceful shutdown. If a node crashes (kill -9, OOM, etc.), `preShutdown` and `persistPeerStateToPeers` never run, so peer state is not replicated and relocation does not occur. Actors and grains on a crashed node are lost.
When relocation happens
Relocation is triggered when both of the following hold:
- A node leaves the cluster: The cluster membership layer (HashiCorp Memberlist) detects the departure and emits a `NodeLeft` event.
- The departed node had relocatable actors or grains: Only actors spawned with relocation enabled (the default) and grains without `WithGrainDisableRelocation` are eligible.
Relocation flow
Key steps
- `preShutdown`: When a node stops, it builds a `PeerState` snapshot of all relocatable actors and grains. Actors with `WithRelocationDisabled` and grains with `WithGrainDisableRelocation` are excluded.
- `persistPeerStateToPeers`: Before leaving membership, the node replicates its `PeerState` to up to 3 of the oldest cluster peers via RPC (not to all remaining peers). The implementation uses `selectOldestPeers(3)` to pick the three oldest nodes by `CreatedAt`; if fewer than 3 peers exist, it replicates to all of them. Replication returns successfully as soon as 2 of 3 peers acknowledge (quorum); the remaining RPCs are cancelled. Each peer stores the state in its local cluster store (e.g. BoltDB). The oldest peers are chosen because leadership is determined by node age: the oldest nodes are the most likely to remain or become leader when the departing node leaves, so the new leader will have the state when it handles `NodeLeft`.
- `cleanupCluster`: The departing node removes its actors and grains from the cluster map (Olric) and, if it is the leader, removes singleton kinds. This runs after the persist step and before leaving membership.
- `NodeLeft`: The cluster emits a `NodeLeft` event. The leader node handles it: it fetches the departed node's `PeerState` from its local cluster store (the leader must have been one of the up-to-3 peers that received the state) and enqueues a `Rebalance` message for the relocator.
- Relocator: A system actor receives `Rebalance` and:
  - Allocates actors: singletons go to the leader; non-singletons are distributed across leader + peers.
  - Allocates grains: same distribution.
  - For actors on the leader: `recreateLocally` removes the actor from the cluster map, then spawns it locally (or via `SpawnSingleton` for singletons).
  - For actors on peers: `spawnRemoteActor` removes the actor from the cluster map, then calls `RemoteSpawn` on the target peer.
  - For grains: `recreateGrain` or `activateRemoteGrain` on the target node.
- `RebalanceComplete`: The relocator tells the system guardian, which marks relocation done and deletes the departed node's state from the cluster store.
Actor allocation
| Actor type | Destination |
|---|---|
| Singleton | Leader node only |
| Non-singleton | Distributed across leader + peers (chunked) |
Actors with `WithRelocationDisabled` are skipped. System actors (e.g. dead letter, scheduler) are never relocated.
Grain allocation
Grains are distributed similarly: remainder to the leader, then chunks to peers. Grains with `WithGrainDisableRelocation` are skipped.
Configuration
Disable relocation for an actor
Use `WithRelocationDisabled` when spawning an actor that should stay on its node. Typical cases:
- Node-local state or resources (e.g. local files, device handles)
- Actors that cannot be safely recreated without external state
Disable relocation for a grain
Grains configured with `WithGrainDisableRelocation` are excluded from relocation.
Disable relocation system-wide
Use `WithoutRelocation()` when creating the actor system. In that mode:
- `preShutdown` skips building and persisting peer state
- No actors or grains are relocated when nodes leave
- This is useful for development, testing, or when you manage placement yourself
Child actors
Child actors are not relocatable by default. When a parent spawns a child via `SpawnChild`, the child gets `withRelocationDisabled()` implicitly. Children are tied to their parent's lifecycle; they are not independently relocated.
If a parent is relocated, its children are not relocated with it. They are recreated only as part of the parent's `PreStart` (or equivalent) on the new node, if the parent explicitly spawns them again.
Relocatability requirements
For an actor to be relocated successfully:
- Actor type must be registered (reflection) so the relocator can instantiate it.
- Dependencies must implement `Dependency` (serializable). Pass via `WithDependencies(dep)`.
- Supervisor, passivation, reentrancy, stashing, role: these are encoded in the actor's serialized form and restored on the target node.
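To make the "serializable dependency" requirement concrete, here is a hedged sketch of a dependency that survives a binary round trip. The local `Dependency` interface below is an assumption about the contract (an ID plus binary marshal/unmarshal), declared here only so the example compiles on its own; check the Extensions and Dependencies page for the real definition. `DBConfig` and `roundTrip` are hypothetical names.

```go
package main

import (
	"encoding"
	"encoding/json"
	"fmt"
)

// Dependency is a local stand-in for the framework's interface.
// Assumption: it requires an ID and binary (un)marshalling.
type Dependency interface {
	encoding.BinaryMarshaler
	encoding.BinaryUnmarshaler
	ID() string
}

// DBConfig is a hypothetical dependency holding only plain data, so it can
// be rebuilt on another node from its serialized bytes.
type DBConfig struct {
	Name string `json:"name"`
	DSN  string `json:"dsn"`
}

// Compile-time check that *DBConfig satisfies the assumed contract.
var _ Dependency = (*DBConfig)(nil)

func (c *DBConfig) ID() string { return c.Name }

// MarshalBinary encodes the config as JSON bytes.
func (c *DBConfig) MarshalBinary() ([]byte, error) { return json.Marshal(c) }

// UnmarshalBinary restores the config from bytes produced by MarshalBinary.
func (c *DBConfig) UnmarshalBinary(data []byte) error { return json.Unmarshal(data, c) }

// roundTrip mimics what happens across a relocation: serialize on the
// departing node, then rebuild a fresh value on the target node.
func roundTrip(src *DBConfig) (*DBConfig, error) {
	raw, err := src.MarshalBinary()
	if err != nil {
		return nil, err
	}
	dst := new(DBConfig)
	if err := dst.UnmarshalBinary(raw); err != nil {
		return nil, err
	}
	return dst, nil
}

func main() {
	restored, err := roundTrip(&DBConfig{Name: "orders-db", DSN: "postgres://orders"})
	fmt.Println(restored.ID(), err)
}
```

Live handles such as open connections or file descriptors do not survive this round trip, which is why such actors are candidates for `WithRelocationDisabled`.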
During relocation
- The actor may temporarily not exist: it is removed from the cluster map before being recreated on the target node.
- Callers should retry or tolerate `ActorNotFound` / lookup failures during this window.
- The actor's logical identity (name, grain ID) is preserved; only the physical location (host:port) changes.
- Location transparency means you address by PID or grain ID; the framework routes to the new node once relocation completes.
Check relocatability
| API | Purpose |
|---|---|
| `pid.IsRelocatable()` | Returns whether the actor may be relocated. |
| `pid.Path().HostPort()` | Current host:port; changes after relocation. |
See also
- Clustered Mode: Cluster setup and discovery
- Singletons: Cluster singletons and placement
- Grains: Virtual actors and `WithGrainDisableRelocation`
- Extensions and Dependencies: Serializable dependencies for relocation
- Coordinated Shutdown: Shutdown sequence and `preShutdown`