Hot Code Update

Replace running components with new versions — state preserved, zero downtime.

The Problem

You deploy a new version of a service. The old version has state: an in-memory index, a connection pool, a query counter. Traditional approaches:

  • Restart the process: lose all state, brief downtime.
  • Blue-green deployment: spin up a new process, drain traffic, shut down the old one. Works, but requires infrastructure orchestration.
  • In-place module reload: importlib.reload() replaces the class, but existing instances still reference the old class. State is orphaned.
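
The reload problem in the last bullet can be reproduced with the standard library alone. The sketch below is not SignalPy code. It uses a throwaway in-memory module (`search_mod`, a hypothetical name) and re-executes its body in the same namespace, which is essentially what importlib.reload() does once the source has changed.

```python
import types

# Hypothetical stand-in for a module on disk. exec() into its namespace
# mimics importlib.reload(), which re-runs the module body in place.
mod = types.ModuleType("search_mod")
exec("class Search:\n    engine = 'v1'\n", mod.__dict__)

instance = mod.Search()
instance.index = {"doc1": "Python programming language"}
old_cls = mod.Search

# "Reload": the module body runs again and rebinds the name Search.
exec("class Search:\n    engine = 'v2'\n", mod.__dict__)

print(mod.Search is old_cls)             # False: the module now holds a new class
print(type(instance) is old_cls)         # True: the live instance keeps the old class
print(instance.engine)                   # v1
print(isinstance(instance, mod.Search))  # False
```

The instance still carries its index, but nothing connects it to the new class, which is the gap the snapshot and restore steps are meant to close.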

SignalPy’s hot_update() is an in-process blue-green deployment: it snapshots state, swaps the class, and restores the state into the new instance. The kernel handles the lifecycle, so no external orchestration is needed.

The only pattern that required kernel changes

All other patterns worked with the stock kernel. Hot update required adding:

  • kernel.hot_update(new_cls) method
  • @lifecycle.snapshot and @lifecycle.restore hooks
  • LifecycleManager.replace_factory() and remove_instance()

Architecture

The hot update sequence:

V1 running (index=3 docs, queries=1)
    │
    ├── 1. Snapshot: call @lifecycle.snapshot → {"index": {...}, "query_count": 1}
    ├── 2. Tear down: unregister bus, unprovide, deactivate
    ├── 3. Replace factory: SearchV1 class → SearchV2 class
    ├── 4. Create + activate: new instance, same name, same properties
    └── 5. Restore: call @lifecycle.restore(snapshot_data)

V2 running (index=3 docs, queries=1, new search algorithm)

How It Works

The snapshot/restore hooks

Components declare what state to preserve:

@component("search", version="1.0", depends=["config"])
@requires(config="IConfig")
@provides("ISearch")
class SearchV1:

    @lifecycle.activate
    def activate(self):
        self._index = {}
        self._query_count = 0

    @lifecycle.snapshot
    def snapshot(self):
        # Called before teardown. Returns a dict of state to preserve.
        return {
            "index": dict(self._index),
            "query_count": self._query_count,
        }

    @lifecycle.restore
    def restore(self, state):
        # Called after activation of the new version. Receives the preserved dict.
        self._index = state.get("index", {})
        self._query_count = state.get("query_count", 0)

Both hooks are optional. Without them, hot_update falls back to copying instance.__dict__ (excluding rt and kernel internals).
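
As a rough illustration of what a __dict__ fallback could look like, here is a minimal sketch in plain Python. The names `INTERNAL_ATTRS`, `fallback_snapshot`, and `fallback_restore` are hypothetical, and the exclusion list is an assumption; the kernel's actual list of internal attributes may differ.

```python
# Assumption: attribute names owned by the kernel. The real list may differ.
INTERNAL_ATTRS = {"rt", "kernel"}

def fallback_snapshot(instance):
    """Shallow-copy instance attributes, skipping kernel-owned ones."""
    return {k: v for k, v in vars(instance).items() if k not in INTERNAL_ATTRS}

def fallback_restore(instance, state):
    """Merge the preserved attributes into the freshly activated instance."""
    vars(instance).update(state)

class OldSearch:
    def __init__(self):
        self.rt = object()              # stands in for a kernel-injected handle
        self._index = {"doc1": "Python"}
        self._query_count = 1

class NewSearch:
    def __init__(self):
        self.rt = object()
        self._index = {}
        self._query_count = 0

old, new = OldSearch(), NewSearch()
new_rt = new.rt
fallback_restore(new, fallback_snapshot(old))

print(new._index, new._query_count)  # {'doc1': 'Python'} 1
print(new.rt is new_rt)              # True: the new instance keeps its own handle
```

A copy like this is shallow, so mutable objects are shared between the old and new instance. Explicit hooks, such as the dict(self._index) copy in the snapshot above, give the component control over that.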

The kernel method

new_instances = await kernel.hot_update(SearchV2)

This does five things atomically (per instance):

  1. Snapshot: @lifecycle.snapshot or __dict__ fallback
  2. Tear down: unregister bus handlers, unprovide services, deactivate, remove
  3. Replace factory: swap the class in the factory registry
  4. Create + activate: new instance, same name, same properties, full lifecycle
  5. Restore: @lifecycle.restore or __dict__ merge

All instances of the factory are updated. If you have L3 targeted instances (search-prod, search-dev), all of them get the new class.
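
The five steps, applied to every instance of a factory, can be sketched with a toy registry. This is not the SignalPy kernel: `MiniKernel` and its methods are hypothetical, the bus and service teardown are reduced to removing the instance, and the sketch is synchronous and takes the factory name explicitly for brevity.

```python
class MiniKernel:
    """Toy stand-in for the kernel; only the swap sequence is modelled."""

    def __init__(self):
        self.factories = {}   # factory name -> class
        self.instances = {}   # instance name -> (factory name, live object)

    def instantiate(self, factory, name, cls=None):
        if cls is not None:
            self.factories[factory] = cls
        obj = self.factories[factory]()
        obj.activate()
        self.instances[name] = (factory, obj)
        return obj

    def hot_update(self, factory, new_cls):
        names = [n for n, (f, _) in self.instances.items() if f == factory]
        updated = []
        for name in names:
            _, old = self.instances[name]
            state = old.snapshot()                 # 1. snapshot
            del self.instances[name]               # 2. tear down (simplified)
            self.factories[factory] = new_cls      # 3. replace factory
            new = self.instantiate(factory, name)  # 4. create + activate
            new.restore(state)                     # 5. restore
            updated.append(new)
        return updated


class SearchBase:
    def activate(self):
        self.index, self.queries = {}, 0

    def snapshot(self):
        return {"index": dict(self.index), "queries": self.queries}

    def restore(self, state):
        self.index = state.get("index", {})
        self.queries = state.get("queries", 0)

class V1(SearchBase):
    engine = "v1-keyword"

class V2(SearchBase):
    engine = "v2-scored"

kernel = MiniKernel()
prod = kernel.instantiate("search", "search-prod", cls=V1)
dev = kernel.instantiate("search", "search-dev")
prod.index["doc1"] = "Python programming language"
prod.queries = 1

updated = kernel.hot_update("search", V2)
print(len(updated))  # 2: both targeted instances were swapped
```

Each instance keeps its own name and its own state across the swap; only the class behind it changes.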

The new version

V2 has the same component name, same runnables, same snapshot/restore — but a different search algorithm:

@component("search", version="2.0", depends=["config"])
@requires(config="IConfig")
@provides("ISearch")
class SearchV2:
    """Adds relevance scoring and prefix matching."""

    @runnable("search", params=SearchParams, description="Search (v2)")
    async def search(self, params):
        self._query_count += 1
        query = params.query.lower()
        results = []
        for doc_id, text in self._index.items():
            text_lower = text.lower()
            if query in text_lower:
                score = 1.0
                if text_lower.startswith(query):
                    score += 0.5                # prefix bonus
                if query == text_lower:
                    score += 1.0                # exact match bonus
                results.append({"id": doc_id, "text": text, "score": score})
        results.sort(key=lambda r: r["score"], reverse=True)
        return {"engine": "v2-scored", "results": results, ...}
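
To make the scoring rules concrete, here is the same logic lifted into a standalone function and run against a small index. The function name `score_documents` and the sample documents are illustrative assumptions, not part of the example module.

```python
def score_documents(index, query):
    """Substring match with the same bonuses as the V2 search body."""
    q = query.lower()
    results = []
    for doc_id, text in index.items():
        t = text.lower()
        if q in t:
            score = 1.0
            if t.startswith(q):
                score += 0.5   # prefix bonus
            if q == t:
                score += 1.0   # exact match bonus
            results.append({"id": doc_id, "text": text, "score": score})
    results.sort(key=lambda r: r["score"], reverse=True)
    return results

# Hypothetical documents chosen to hit each scoring branch.
index = {
    "d1": "Learning Python quickly",  # substring only      -> 1.0
    "d2": "Python snake species",     # substring + prefix  -> 1.5
    "d3": "python",                   # prefix + exact      -> 2.5
    "d4": "Rust systems language",    # no match
}

for hit in score_documents(index, "python"):
    print(hit["id"], hit["score"])
# d3 2.5
# d2 1.5
# d1 1.0
```

An exact match always satisfies the prefix test too, so its total is 2.5 rather than 2.0.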

Running

PYTHONPATH=src python -m signalpy.examples.hot_update

Expected output:

V1 search 'python': 2 results, engine=v1-keyword
Queries so far: 1

Hot update: V1 → V2
Updated 1 instance(s)
Status after update: version=2.0, docs=3, queries=1

V2 search 'python': 2 results, engine=v2-scored
  1: Python programming language (score=1.5)
  2: Python snake species (score=1.5)
Queries after update: 2

The index (3 docs) and query count (1) survived the upgrade.

Tests: TestHotUpdate in src/signalpy/tests/test_examples.py.

The Real Flow: PluginLoader + File on Disk

The simple kernel.hot_update(SearchV2) example assumes you already have the class in memory. In production, the new code arrives as a .py file — deployed to a plugins directory, pulled from a package registry, or pushed via CI.

The PluginLoader provider (src/signalpy/providers/plugin_loader.py) handles the full flow: scan directory → importlib import → find @component classes → hot_add (new) or hot_update (existing factory name).

The self-contained demo at src/signalpy/examples/hot_update_demo/ shows it end-to-end:

hot_update_demo/
├── __main__.py       # boots kernel, copies files, triggers scans
├── search_v1.py      # V1 component (deployed first)
├── search_v2.py      # V2 component (overwrites V1 to trigger hot_update)
└── plugins/          # temp directory watched by PluginLoader

The sequence:

# 1. Boot kernel with PluginLoader
kernel.instantiate("plugin-loader", properties={
    "plugin_dir": str(plugin_dir),
    "kernel": kernel,
})
await kernel.boot()

# 2. Deploy V1: copy search_v1.py into plugins/
shutil.copy2(SEARCH_V1, plugin_dir / "search.py")
await kernel.bus.invoke("plugin-loader.scan", {})
# → scan finds search.py, imports it, hot_adds SearchV1

# 3. Index data, run queries (state accumulates in V1)

# 4. Deploy V2: overwrite search.py with the new version
shutil.copy2(SEARCH_V2, plugin_dir / "search.py")
await kernel.bus.invoke("plugin-loader.scan", {})
# → scan re-imports search.py, sees factory "search" already loaded
# → calls kernel.hot_update() → snapshot → teardown → replace → activate → restore

Run it:

PYTHONPATH=src python -m signalpy.examples.hot_update_demo
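
The scan step (import each file, find decorated classes, then decide between add and update) can be sketched with importlib alone. Everything below is an assumption made for illustration: the `component` marker, the `_component_name` attribute, and the `scan` function are hypothetical and do not reflect how plugin_loader.py is actually written.

```python
import importlib.util
import inspect
import tempfile
from pathlib import Path

def component(name):
    """Hypothetical marker decorator standing in for @component."""
    def mark(cls):
        cls._component_name = name
        return cls
    return mark

def scan(plugin_dir, loaded):
    """Import every .py file and classify marked classes as add or update."""
    actions = []
    for path in sorted(Path(plugin_dir).glob("*.py")):
        spec = importlib.util.spec_from_file_location(f"plugin_{path.stem}", str(path))
        module = importlib.util.module_from_spec(spec)
        module.component = component   # inject the marker so plugin files stay tiny
        spec.loader.exec_module(module)
        for _, cls in inspect.getmembers(module, inspect.isclass):
            name = getattr(cls, "_component_name", None)
            if name is None:
                continue
            actions.append(("hot_update" if name in loaded else "hot_add", name))
            loaded[name] = cls
    return actions

with tempfile.TemporaryDirectory() as tmp:
    plugin = Path(tmp) / "search.py"
    loaded = {}

    plugin.write_text("@component('search')\nclass SearchV1:\n    engine = 'v1-keyword'\n")
    first = scan(tmp, loaded)

    # Overwrite the same file, as the demo does with search_v2.py.
    plugin.write_text("@component('search')\nclass SearchV2:\n    engine = 'v2-scored'\n")
    second = scan(tmp, loaded)

print(first)   # [('hot_add', 'search')]
print(second)  # [('hot_update', 'search')]
```

One detail a real loader may need to handle: Python's default bytecode cache is validated by source modification time and size, so an overwrite that changes neither can be served from a stale cache. The two sources here differ in length, which avoids that.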

The PluginLoader itself is a component — you can hot_remove it when you don’t need dynamic loading, or replace it with a version that polls automatically or watches via watchdog.

Production Considerations

Schema migration. V2 might expect different state keys than V1. The @lifecycle.restore method should handle missing keys gracefully with defaults.
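
A minimal sketch of a tolerant restore path, assuming a hypothetical V2 that renames query_count to a nested stats dict. The `migrate_state` helper, the "stats" key, and the "schema" marker are all invented for this example.

```python
def migrate_state(state):
    """Map a V1-shaped snapshot onto the keys a hypothetical V2 expects."""
    migrated = dict(state)  # never mutate the snapshot handed in by the caller
    # Hypothetical rename: V1 stored "query_count", V2 wants "stats".
    if "stats" not in migrated:
        migrated["stats"] = {"queries": migrated.pop("query_count", 0)}
    migrated.setdefault("index", {})   # missing keys fall back to defaults
    migrated.setdefault("schema", 2)
    return migrated

v1_snapshot = {"index": {"doc1": "Python"}, "query_count": 1}
print(migrate_state(v1_snapshot))
# {'index': {'doc1': 'Python'}, 'stats': {'queries': 1}, 'schema': 2}
print(migrate_state({}))
# {'stats': {'queries': 0}, 'index': {}, 'schema': 2}
```

A restore hook could call a helper like this first and then assign the resulting values, so an empty or older snapshot never raises a KeyError.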

Incompatible contracts. If V2 changes its @provides or @requires, consumers may break. Version your contracts or use adapter patterns.

Multi-instance updates. hot_update replaces ALL instances of a factory. For gradual rollouts, combine it with L3 targeted instances to update one tenant at a time.

Async snapshot. Both @lifecycle.snapshot and @lifecycle.restore can be async — useful for flushing pending writes before snapshot.
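
One way a caller can support both sync and async hooks is to await the result only when it is awaitable. The sketch below uses plain asyncio; `call_hook` and `BufferedIndex` are hypothetical names, and this is not necessarily how the kernel dispatches its hooks.

```python
import asyncio
import inspect

async def call_hook(hook, *args):
    """Call a hook and await its result only if it is awaitable."""
    result = hook(*args)
    if inspect.isawaitable(result):
        result = await result
    return result

class BufferedIndex:
    def __init__(self):
        self.index = {"doc1": "Python"}
        self.pending = [("doc2", "Rust")]   # writes not yet applied

    async def flush(self):
        await asyncio.sleep(0)              # stands in for real I/O
        self.index.update(self.pending)
        self.pending.clear()

    async def snapshot(self):
        await self.flush()                  # flush pending writes first
        return {"index": dict(self.index)}

async_state = asyncio.run(call_hook(BufferedIndex().snapshot))
sync_state = asyncio.run(call_hook(lambda: {"index": {}}))

print(async_state)  # {'index': {'doc1': 'Python', 'doc2': 'Rust'}}
print(sync_state)   # {'index': {}}
```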

State size. The snapshot lives in memory during the swap. For large state (e.g., a 100MB index), consider snapshotting to disk via storage.
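
A sketch of the disk-backed variant, assuming the state is JSON-serializable: the snapshot hook writes the payload to a file and returns only a small handle, and the restore hook reads it back. The file name and the helper functions are hypothetical, and a real component might go through a storage provider instead of writing files directly.

```python
import json
import tempfile
from pathlib import Path

def snapshot_to_disk(state, directory):
    """Write the state to a file and return a small handle for the swap."""
    path = Path(directory) / "search.snapshot.json"   # hypothetical file name
    path.write_text(json.dumps(state))
    return {"snapshot_path": str(path)}

def restore_from_disk(handle):
    """Load the state back from the path stored in the handle."""
    return json.loads(Path(handle["snapshot_path"]).read_text())

with tempfile.TemporaryDirectory() as tmp:
    big_state = {"index": {f"doc{i}": "text" for i in range(1000)}, "query_count": 1}
    handle = snapshot_to_disk(big_state, tmp)   # only this dict stays in memory
    restored = restore_from_disk(handle)

print(restored == big_state)  # True
```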

Consumer effects during the swap. While hot_update() is mid-flight, an async @effect on a consumer that reads the swapped service may be in the middle of a body that still holds a reference to the old instance. The kernel’s automatic batching during the lifecycle transition ensures consumer effects re-run once against the new instance after the swap completes — but if the in-flight body owns a connection or lease that the new instance also needs, mark the consumer effect with cancel_on_supersede=True. See Reactive Intent → cancel_on_supersede.
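
The difference between letting an in-flight body finish and cancelling it can be illustrated with plain asyncio. This sketch does not use SignalPy's @effect or cancel_on_supersede. It only shows the general idea of cancelling a superseded task so that it releases what it holds before the rerun starts. `effect_body` and `swap_with_cancel` are hypothetical names.

```python
import asyncio

async def effect_body(service, delay, log):
    """Stand-in for an effect body that holds a lease while it runs."""
    log.append(f"start:{service}")
    try:
        await asyncio.sleep(delay)
        log.append(f"done:{service}")
    except asyncio.CancelledError:
        log.append(f"cancelled:{service}")   # release the lease here
        raise

async def swap_with_cancel():
    log = []
    in_flight = asyncio.create_task(effect_body("v1", 10, log))
    await asyncio.sleep(0)                   # let the V1 body start
    in_flight.cancel()                       # the swap supersedes the old run
    try:
        await in_flight
    except asyncio.CancelledError:
        pass
    await effect_body("v2", 0, log)          # rerun against the new instance
    return log

log = asyncio.run(swap_with_cancel())
print(log)  # ['start:v1', 'cancelled:v1', 'start:v2', 'done:v2']
```

Without the cancel, the V1 body would keep its lease for the full delay while the V2 run competes for the same resource.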

Key Takeaway

hot_update() is snapshot → teardown → replace → activate → restore. The kernel manages the lifecycle transitions. Components declare what state matters via @lifecycle.snapshot/@lifecycle.restore. Without those hooks, __dict__ copying works as a reasonable fallback. This is the only pattern that required kernel changes — and the change was 80 lines.