VM backup procedure
VMware ESXi · Proxmox · Hyper-V

On-prem appliance backup is the customer's responsibility. The procedures below describe the recommended snapshot strategy for each supported hypervisor, with a 4h RTO and a 24h RPO target.

What to back up

The appliance stores everything that matters on the separate data disk (the system disk is replaceable in one OVA redeploy). A consistent snapshot of the data disk alone is sufficient to restore service.

  • Data disk: Postgres datastore, object storage, secrets, license state, audit log. Required.
  • System disk: OS + Natalia binaries. Recoverable by redeploying the same signed OVA — backup optional.
  • Hypervisor metadata: VM configuration (vCPU, RAM, network, VLAN). Export the VM config file alongside the snapshot.

VMware ESXi / vSphere

  1. Use a backup product compatible with VMware (Veeam, NAKIVO, Vembu, etc.) targeting the VM as a whole.
  2. Enable guest quiescing if available (VMware Tools calls fsfreeze on the appliance, so snapshots are file-system-consistent rather than merely crash-consistent).
  3. Schedule a daily full backup of the data disk plus an incremental every 6h.
  4. Retention: keep 7 daily + 4 weekly + 3 monthly backups (GFS, grandfather-father-son).
  5. Store backups off-host (NAS, separate datastore, S3-compatible object storage).
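
The 7 daily + 4 weekly + 3 monthly retention rule can be sketched as follows. This is an illustration only: backup products implement GFS with their own rules, and the function name `gfs_keep` and the "newest backup of each period" choice are assumptions, not the behaviour of any specific product.

```python
from datetime import date, timedelta


def gfs_keep(backup_days, daily=7, weekly=4, monthly=3):
    """Return the set of backup dates to retain under a GFS policy.

    Assumed rule: keep the newest backup from each of the most recent
    `daily` days, `weekly` ISO weeks and `monthly` calendar months
    that contain at least one backup.
    """
    kept = set()
    days_seen, weeks_seen, months_seen = set(), set(), set()
    for day in sorted(set(backup_days), reverse=True):  # newest first
        week = tuple(day.isocalendar()[:2])  # (ISO year, ISO week)
        month = (day.year, day.month)
        for bucket, key, limit in (
            (days_seen, day, daily),
            (weeks_seen, week, weekly),
            (months_seen, month, monthly),
        ):
            # The first (newest) backup seen in a period represents it.
            if key not in bucket and len(bucket) < limit:
                bucket.add(key)
                kept.add(day)
    return kept
```

Anything not in the returned set is a candidate for pruning. A backup can satisfy several tiers at once, so the number of retained backups is usually below 7 + 4 + 3.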

Proxmox VE

  1. Configure a daily backup job via Datacenter → Backup targeting the VM.
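  2. Pick the snapshot mode (a live backup taken while the VM keeps running; crash-consistent, or file-system-consistent when the QEMU guest agent is available in the guest).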
  3. Enable Zstd compression to reduce backup size, by roughly half depending on how compressible the data is.
  4. Destination: Proxmox Backup Server (PBS) on a separate host, or PBS S3 bucket. Keep 7 daily + 4 weekly + 3 monthly.
  5. For a 5-minute RPO, use ZFS storage replication between two Proxmox hosts.
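
For a one-off or scripted run outside the Datacenter → Backup scheduler, the same settings map onto the `vzdump` command line. The sketch below only builds the argument list; the VM ID and storage ID are placeholders, and the options should be checked against `man vzdump` for the installed Proxmox VE version (for example, a PBS target applies its own compression and may handle pruning on the server side).

```python
def vzdump_command(vmid, storage):
    """Build the argument list for a snapshot-mode vzdump backup.

    `vmid` and `storage` are placeholders: the appliance's VM ID and
    the Proxmox storage ID of the off-host backup target.
    """
    return [
        "vzdump", str(vmid),
        "--mode", "snapshot",      # step 2: live backup, no downtime
        "--compress", "zstd",      # step 3: Zstd compression
        "--storage", storage,      # step 4: off-host destination
        "--prune-backups", "keep-daily=7,keep-weekly=4,keep-monthly=3",
    ]


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(" ".join(vzdump_command(101, "pbs-offsite")))
```

Printing the command rather than running it keeps the sketch safe to try on any machine; on a Proxmox host the list could be passed to `subprocess.run`.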

Hyper-V

  1. Use a Hyper-V-compatible backup product (Veeam, Altaro, Azure Backup Server) — Windows Server Backup is not enough for production.
  2. Enable the Backup (volume shadow copy) integration service in the VM settings (provides crash-consistent + application-consistent snapshots).
  3. Daily full + incremental every 6h.
  4. Destination: backup server on a separate VLAN, ideally with off-site replication (immutable Wasabi / S3 Glacier bucket).
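
To make the "daily full + incremental every 6h" cadence concrete, the following sketch lists the runs for one day. The 01:00 start time in the usage example is an assumption chosen for illustration; the actual schedule is configured in the backup product.

```python
from datetime import datetime, timedelta


def backup_schedule(day_start, interval_hours=6):
    """List (time, kind) pairs for one 24h cycle: a full backup at
    `day_start`, then an incremental every `interval_hours` until
    the next full is due."""
    runs = [(day_start, "full")]
    next_full = day_start + timedelta(days=1)
    t = day_start + timedelta(hours=interval_hours)
    while t < next_full:
        runs.append((t, "incremental"))
        t += timedelta(hours=interval_hours)
    return runs


if __name__ == "__main__":
    # Hypothetical start time: full backup at 01:00.
    for when, kind in backup_schedule(datetime(2024, 1, 1, 1, 0)):
        print(when.strftime("%H:%M"), kind)
```

With a 6h interval this yields one full and three incrementals per day, so a restore never needs more than the last full plus its incrementals from the same day.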

Indicative RTO/RPO

Backup strategy                    RTO       RPO      Cost
Daily snapshot only                4h        24h
Daily + incremental 6h             2h        6h       €€
ZFS / Storage replication 5 min    30 min    5 min    €€€

RTO assumes the OVA system disk is available locally (no re-download required). The RPO figures do not account for PBX-side CDR buffering: in practice, OXE / OXO PBXs retain CDRs for several hours and Natalia catches up automatically on restore.
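
The RPO figures above are worst-case data-loss windows: the time between the last usable restore point and the failure. A minimal sketch of that arithmetic follows; the function names are hypothetical, and PBX-side CDR buffering is deliberately ignored.

```python
from datetime import datetime, timedelta


def data_loss_window(restore_points, failure_time):
    """Time between the newest restore point taken at or before
    `failure_time` and the failure itself; None if there is no
    usable restore point."""
    usable = [p for p in restore_points if p <= failure_time]
    if not usable:
        return None
    return failure_time - max(usable)


def meets_rpo(restore_points, failure_time, rpo):
    """True if the data-loss window does not exceed the RPO target."""
    window = data_loss_window(restore_points, failure_time)
    return window is not None and window <= rpo
```

For example, with restore points every 6h and a failure 5h30 after the last one, the window is within a 6h RPO but not within a 5-minute one.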

Restore test (quarterly)

A backup that has never been restored is not a backup. Schedule a quarterly restore test in an isolated VLAN:

  1. Restore the last full + the last incremental on an isolated test VM.
  2. Boot in degraded mode (no PBX connection, no MCP exposure).
  3. Verify Postgres integrity (natalia-cli check-integrity).
  4. Verify the audit log hash chain (natalia-cli audit verify).
  5. Compare CDR counts to the previous business day: should match within ±0.5%.
  6. Destroy the test VM.
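
The ±0.5% tolerance in step 5 can be checked with a small helper. This sketch assumes the two counts have already been obtained (how they are queried is not covered here) and that the reference is the production count for the same business day; the function name is hypothetical.

```python
def cdr_counts_match(restored, reference, tolerance=0.005):
    """True if `restored` is within ±tolerance (0.5% by default) of
    `reference`, measured relative to the reference count."""
    if reference == 0:
        # No traffic on the reference day: only an empty restore matches.
        return restored == 0
    return abs(restored - reference) / reference <= tolerance
```

A restored count of 9,960 against a reference of 10,000 is 0.4% off and passes; 9,940 is 0.6% off and should be investigated before the restore test is signed off.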