SN2O upgrade procedure: 8 atomic steps with automatic rollback

Procedure for the SN2O integrator partner. The data disk is kept separate from the system disk so the new appliance can be deployed and the data disk re-attached. Failure at any step triggers an automatic rollback to the previous appliance.

Prerequisites

  • Hypervisor admin access (VMware vCenter, Proxmox, Hyper-V).
  • New signed OVA (Cosign-verified) downloaded from the SN2O partner portal.
  • License valid for at least 30 days (otherwise renew first).
  • Free space ≥ 2× the size of the data disk on the datastore.
  • A 90-minute maintenance window agreed with the customer admin.
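The measurable prerequisites above can be checked before the maintenance window opens. The following is a minimal sketch, not part of the SN2O tooling: the `Preflight` fields and the `preflight_problems` function are hypothetical names, and the values are assumed to be collected by hand or from your own hypervisor scripts.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

MIN_LICENSE_DAYS = 30      # "License valid for at least 30 days"
FREE_SPACE_FACTOR = 2      # "Free space >= 2x the size of the data disk"
WINDOW_MINUTES = 90        # "A 90-minute maintenance window"


@dataclass
class Preflight:
    """Hypothetical container for values gathered before the upgrade."""
    license_expiry: date
    data_disk_bytes: int
    datastore_free_bytes: int
    ova_signature_verified: bool   # result of the Cosign verification
    window_minutes: int


def preflight_problems(p: Preflight, today: date) -> List[str]:
    """Return the unmet prerequisites; an empty list means all checks pass."""
    problems = []
    if not p.ova_signature_verified:
        problems.append("OVA signature not verified")
    if (p.license_expiry - today).days < MIN_LICENSE_DAYS:
        problems.append("license expires in under 30 days, renew first")
    if p.datastore_free_bytes < FREE_SPACE_FACTOR * p.data_disk_bytes:
        problems.append("datastore free space below 2x the data disk size")
    if p.window_minutes < WINDOW_MINUTES:
        problems.append("maintenance window shorter than 90 minutes")
    return problems
```

Hypervisor admin access is left out because it cannot be expressed as a simple comparison; confirm it separately.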

8 atomic steps

  1. POST /upgrade/prepare

    The appliance writes a Postgres snapshot, pre-vX.sql.gz, to the data disk, along with the .upgrade-state manifest. Both files survive the appliance swap.

  2. Graceful shutdown

    From the appliance admin console, shut down the appliance. Postgres flushes the WAL and systemd stops the services in the correct order.

  3. Detach the data disk

    Via the hypervisor API: detach the data disk from the current VM (do not delete it). Note its UUID and datastore path.

  4. Boot the new appliance

    Deploy the new OVA on the same hypervisor. The appliance boots in "wait for data disk" mode and refuses to start the services until the data disk is attached.

  5. Re-attach the data disk

    Attach the data disk noted in step 3 to the new VM. systemd detects the mount and starts natalia.service.

  6. Forward-only migrations

    The new appliance runs forward-only schema migrations. On failure, the pre-vX.sql.gz snapshot is automatically restored, the upgrade-failed flag is set and the service exits with code 1.

  7. Post-upgrade healthchecks

    The post-upgrade.sh script validates 6 healthchecks (Postgres ready, REST API ready, auth ready, MCP ready, dashboard ready, license valid) and POSTs /upgrade/success or /upgrade/failed to the SN2O dashboard.

  8. Smoke test

    Manually trigger a CDR collection via /admin/collect-now. Verify that at least one CDR is ingested, parsed, and visible in the dashboard.
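The decision made in step 7 can be sketched as follows. This illustrates the logic only and is not the contents of post-upgrade.sh: the check identifiers in `HEALTHCHECKS` and the `upgrade_outcome` function are hypothetical names, and treating a missing result as a failure is an assumption.

```python
from typing import Dict, List, Tuple

# The 6 healthchecks listed in step 7 (identifiers are illustrative).
HEALTHCHECKS = ("postgres", "rest_api", "auth", "mcp", "dashboard", "license")


def upgrade_outcome(results: Dict[str, bool]) -> Tuple[str, List[str]]:
    """Map healthcheck results to the endpoint to POST to.

    Assumption: a check that is missing from `results` counts as failed,
    so the upgrade is only reported as successful when all 6 checks pass.
    """
    failed = [name for name in HEALTHCHECKS if not results.get(name, False)]
    endpoint = "/upgrade/failed" if failed else "/upgrade/success"
    return endpoint, failed
```

For example, a run where only the MCP check fails yields `("/upgrade/failed", ["mcp"])`, which gives the integrator a starting point when calling the hotline.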

Automatic rollback

If step 7 or 8 fails, the SN2O dashboard receives a /upgrade/failed POST and triggers the rollback automatically:

  1. Detach the data disk from the new VM.
  2. Re-import the previous OVA (kept in the appliance store).
  3. Attach the data disk to the previous appliance.
  4. The preflight of the previous appliance detects the .upgrade-state manifest and runs pg_restore pre-vX.sql.gz.
  5. The previous appliance restarts in its known-good state.
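The preflight check in the rollback sequence above can be pictured as a lookup on the data disk. This is a minimal sketch under stated assumptions, not the appliance's actual preflight: `snapshot_to_restore` is a hypothetical function, both files are assumed to sit at the root of the mounted data disk, and the X in pre-vX.sql.gz is assumed to be a version placeholder matched by `pre-v*.sql.gz`.

```python
from pathlib import Path
from typing import Optional


def snapshot_to_restore(data_disk: Path) -> Optional[Path]:
    """Return the snapshot to restore, or None when no upgrade was in progress.

    Assumptions: .upgrade-state and the snapshot live at the root of the
    data disk, and the snapshot name matches pre-v*.sql.gz.
    """
    if not (data_disk / ".upgrade-state").is_file():
        return None  # no upgrade marker: nothing to roll back
    snapshots = sorted(data_disk.glob("pre-v*.sql.gz"))
    if not snapshots:
        raise FileNotFoundError("upgrade marker present but no snapshot found")
    if len(snapshots) > 1:
        # Refuse to guess which snapshot belongs to this upgrade.
        raise ValueError("more than one snapshot found: %s" % snapshots)
    return snapshots[0]
```

Raising on a missing or ambiguous snapshot, rather than picking one silently, keeps a stuck rollback visible so it can be escalated to the hotline.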

SN2O hotline

Stuck on an upgrade? Contact the Natalia integrator hotline before pushing further.

[email protected]