VM backup procedure: VMware ESXi · Proxmox · Hyper-V
On-prem appliance backup is the customer's responsibility. The procedures below describe the recommended snapshot strategy for each supported hypervisor, with a 4h RTO and a 24h RPO target.
What to back up
The appliance stores everything that matters on a separate data disk; the system disk can be rebuilt with a single OVA redeploy. A consistent snapshot of the data disk alone is sufficient to restore service.
- Data disk: Postgres datastore, object storage, secrets, license state, audit log. Required.
- System disk: OS + Natalia binaries. Recoverable by redeploying the same signed OVA — backup optional.
- Hypervisor metadata: VM configuration (vCPU, RAM, network, VLAN). Export the VM config file alongside the snapshot.
VMware ESXi / vSphere
- Use a backup product compatible with VMware (Veeam, NAKIVO, Vembu, etc.) targeting the VM as a whole.
- Enable quiescing if available (calls `fsfreeze` on the appliance, so snapshots are file-system-consistent rather than only crash-consistent).
- Schedule a daily full of the data disk + incremental every 6h.
- Keep: 7 daily + 4 weekly + 3 monthly (GFS).
- Backup destination off-host (NAS, separate datastore, S3-compatible object storage).
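The 7 daily + 4 weekly + 3 monthly retention rule can be illustrated with a small sketch. This is only one plausible reading of GFS, under stated assumptions: one backup per calendar day, the newest backup in each ISO week acting as the weekly, and the newest backup in each calendar month acting as the monthly. Real backup products apply their own rules, and the names `gfs_keep` and `_newest_per_period` are hypothetical.

```python
from datetime import date, timedelta


def _newest_per_period(newest_first, period_of, limit):
    """Pick the newest backup in each of the `limit` most recent periods."""
    chosen = {}
    for day in newest_first:  # dates arrive newest first
        period = period_of(day)
        if period in chosen:
            continue  # a newer backup already represents this period
        if len(chosen) == limit:
            break  # everything that follows is older than the kept periods
        chosen[period] = day
    return set(chosen.values())


def gfs_keep(backup_days, daily=7, weekly=4, monthly=3):
    """Return the backup dates to retain under a simple GFS policy (assumed rules)."""
    newest_first = sorted(set(backup_days), reverse=True)
    keep = set(newest_first[:daily])  # the most recent daily backups
    keep |= _newest_per_period(
        newest_first, lambda d: tuple(d.isocalendar()[:2]), weekly
    )  # newest backup per ISO (year, week)
    keep |= _newest_per_period(
        newest_first, lambda d: (d.year, d.month), monthly
    )  # newest backup per calendar month
    return keep


# Example: 120 consecutive daily backups ending on 2024-04-30.
last = date(2024, 4, 30)
backups = [last - timedelta(days=n) for n in range(120)]
kept = gfs_keep(backups)
expired = sorted(set(backups) - kept)
print(f"keep {len(kept)} backups, expire {len(expired)}")
```

Because the tiers overlap (the newest daily is usually also the newest weekly and monthly), the number of retained restore points is normally lower than 7 + 4 + 3 = 14.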
Proxmox VE
- Configure a daily backup job via Datacenter → Backup targeting the VM.
- Pick the Snapshot mode (live backup while the VM keeps running; crash-consistent, or file-system-consistent if the QEMU guest agent is enabled in the VM).
- Enable Zstd compression to reduce backup size (roughly half is typical, though the ratio depends on the data).
- Destination: Proxmox Backup Server (PBS) on a separate host, or PBS S3 bucket. Keep 7 daily + 4 weekly + 3 monthly.
- For 5-minute RPO: use ZFS replication between two Proxmox hosts.
Hyper-V
- Use a Hyper-V-compatible backup product (Veeam, Altaro, Azure Backup Server) — Windows Server Backup is not enough for production.
- Enable VSS Integration Services in the VM settings (provides crash-consistent + application-consistent snapshots).
- Daily full + incremental every 6h.
- Destination: backup server on a separate VLAN, ideally with off-site replication (immutable Wasabi / S3 Glacier bucket).
Indicative RTO/RPO
| Backup strategy | RTO | RPO | Cost |
|---|---|---|---|
| Daily snapshot only | 4h | 24h | € |
| Daily + incremental 6h | 2h | 6h | €€ |
| ZFS / Storage replication 5 min | 30 min | 5 min | €€€ |
RTO assumes the OVA system disk is available locally (no re-download required). RPO does not count the PBX-side CDR buffering: in practice, OXE / OXO PBXs retain CDR for several hours and Natalia catches up automatically on restore.
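As a rough aid for reading the table, the sketch below encodes its three rows and returns the cheapest strategy that satisfies a given RTO and RPO target. The figures are the indicative ones from the table, converted to minutes; the `Strategy` class, the `cost_tier` field (the number of € signs) and the `cheapest_strategy` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Strategy:
    name: str
    rto_minutes: int
    rpo_minutes: int
    cost_tier: int  # number of € signs in the table


# Indicative values copied from the RTO/RPO table.
STRATEGIES = [
    Strategy("Daily snapshot only", 240, 1440, 1),
    Strategy("Daily + incremental 6h", 120, 360, 2),
    Strategy("ZFS / Storage replication 5 min", 30, 5, 3),
]


def cheapest_strategy(max_rto_minutes: int, max_rpo_minutes: int) -> Optional[Strategy]:
    """Return the lowest-cost strategy meeting both targets, or None."""
    candidates = [
        s for s in STRATEGIES
        if s.rto_minutes <= max_rto_minutes and s.rpo_minutes <= max_rpo_minutes
    ]
    return min(candidates, key=lambda s: s.cost_tier, default=None)


# The default target of this procedure: 4h RTO, 24h RPO.
choice = cheapest_strategy(max_rto_minutes=240, max_rpo_minutes=1440)
print(choice.name if choice else "no listed strategy meets the target")
```

A `None` result means the target is tighter than anything in the table, so it would need a design beyond the three strategies listed here.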
Restore test (quarterly)
A backup that has never been restored is not a backup. Schedule a quarterly restore test in an isolated VLAN:
- Restore the last full + the last incremental on an isolated test VM.
- Boot in degraded mode (no PBX connection, no MCP exposure).
- Verify Postgres integrity: `natalia-cli check-integrity`.
- Verify the audit log hash chain (`natalia-cli audit verify`).
- Compare CDR counts to the previous business day: should match within ±0.5%.
- Destroy the test VM.
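The verification steps of the restore test could be scripted along the following lines. This is a minimal sketch, not a supported tool. It assumes that both `natalia-cli` commands signal failure with a non-zero exit code, which the procedure does not state, and that the two CDR counts are obtained separately, since no command for that is documented here. The names `run_checks` and `cdr_counts_match` are hypothetical.

```python
import subprocess

# The two verification commands named in the restore procedure.
CHECKS = [
    ["natalia-cli", "check-integrity"],
    ["natalia-cli", "audit", "verify"],
]


def run_checks(run=subprocess.run):
    """Run each check and return the commands that failed.

    Assumption: a non-zero exit code means the check failed.
    """
    failed = []
    for cmd in CHECKS:
        result = run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(" ".join(cmd))
    return failed


def cdr_counts_match(restored_count, production_count, tolerance=0.005):
    """True if the restored CDR count is within ±0.5% of the production count."""
    if production_count == 0:
        return restored_count == 0
    return abs(restored_count - production_count) <= tolerance * production_count
```

On the restored test VM, one might call `run_checks()` first and then `cdr_counts_match(restored, production)` with the previous business day's counts, treating any failed command or a `False` result as a failed restore test.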