Zero-downtime chain migrations

Draft — these are my notes; edit freely before publishing.

Migrating a live blockchain network’s infrastructure off third-party providers is open-heart surgery: the patient is awake and trading the whole time. We did it with no service interruption. The playbook:

Run both, trust one

Stand up the in-house stack and run it in parallel behind the existing providers. It serves no production traffic yet — it just has to agree.

Shadow until boring

Mirror real RPC traffic to the new nodes and diff the responses. You are not looking for correctness once; you are looking for correctness boringly, for days. Disagreements are bugs or config drift — find them before users do.

Cut over at the edge

The actual switch is a load-balancer weight, shifted gradually, with an instant rollback. By the time it’s 100% in-house, nothing observable has happened — which is the entire goal.

The hard part was never the nodes. It was the indexers, bundlers, and the long tail of integrations that assumed a specific provider’s quirks.