Disaster Recovery & Business Continuity

Proxmox Disaster Recovery: when the worst case strikes

Q: What is the difference between backup and disaster recovery?

Backup protects against data loss — you can restore individual files or VMs from a specific point in time. Disaster recovery is the planned restoration of your entire infrastructure after a complete failure, with defined RTO (recovery time) and RPO (max data loss). DR needs backup, but backup alone is not DR.

Q: What are RTO and RPO?

RTO (Recovery Time Objective) is the maximum acceptable downtime. RPO (Recovery Point Objective) is the maximum acceptable data loss, measured as the time span between the last recoverable state and the failure. Example: RTO 4h / RPO 1h means at most 4 hours of downtime and at most 1 hour of lost data. These values determine the DR strategy and therefore the cost.
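As a rough illustration of how the two objectives are evaluated after an incident, the following minimal sketch compares measured downtime and data loss against the RTO 4h / RPO 1h example. The function name and the incident timestamps are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


def meets_objectives(outage_start, service_restored, last_good_backup, rto, rpo):
    """Return (rto_met, rpo_met) for a single incident."""
    downtime = service_restored - outage_start
    # Everything written after the last recoverable state is lost.
    data_loss_window = outage_start - last_good_backup
    return downtime <= rto, data_loss_window <= rpo


# Objectives from the example in the text: RTO 4h / RPO 1h.
rto = timedelta(hours=4)
rpo = timedelta(hours=1)

# Hypothetical incident, for illustration only.
outage = datetime(2025, 1, 10, 9, 0)
restored = datetime(2025, 1, 10, 12, 30)     # 3.5 hours of downtime
last_backup = datetime(2025, 1, 10, 7, 45)   # 1.25 hours before the outage

print(meets_objectives(outage, restored, last_backup, rto, rpo))
# (True, False): the RTO was met, the RPO was missed
```

In this made-up incident the service came back in time, but the last recoverable state was too old, so the backup or replication interval would need to be shortened to meet the RPO.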

Q: How does a stretched Ceph cluster work?

A stretched cluster distributes Ceph nodes across two locations connected with low latency (typically <5 ms). Data is replicated synchronously, so if one location fails, the other takes over without data loss. Requirements: a dedicated fiber connection and three sites for quorum (two data sites plus one tiebreaker).
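The reason for the third site can be shown with a small majority calculation. This is a simplified sketch of the quorum arithmetic only, not of Ceph itself; the monitor counts per site (two per data site plus one tiebreaker) are an assumed layout, and the function names are hypothetical.

```python
def has_quorum(monitors_up: int, monitors_total: int) -> bool:
    """A strict majority of all monitors must be reachable."""
    return 2 * monitors_up > monitors_total


def quorum_after_site_loss(sites: dict, failed_site: str) -> bool:
    """Check whether the remaining sites still form a majority."""
    total = sum(sites.values())
    up = total - sites[failed_site]
    return has_quorum(up, total)


# Assumed layout: two monitors per data site plus one tiebreaker.
with_tiebreaker = {"site_a": 2, "site_b": 2, "tiebreaker": 1}
# Same data sites, but no third location.
without_tiebreaker = {"site_a": 2, "site_b": 2}

print(quorum_after_site_loss(with_tiebreaker, "site_a"))     # True: 3 of 5 remain
print(quorum_after_site_loss(without_tiebreaker, "site_a"))  # False: 2 of 4 is no majority
```

With only two sites, losing either one leaves exactly half of the monitors, which is not a majority, so the surviving site could not safely continue on its own.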

Q: How often should DR drills happen?

Best practice: a full failover test in an isolated test environment at least semi-annually, plus an annual full DR drill with failover to the DR site. Automated restore tests of individual VMs should run quarterly. Without regular tests, DR is just theory.
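A test cadence like this is easy to let slip, so it can help to track it in code. The sketch below flags overdue tests; the test names, the day counts used to approximate "quarterly", "semi-annual" and "annual", and the dates are all assumptions for illustration.

```python
from datetime import date, timedelta

# Maximum interval per test type, approximated in days from the cadence above.
MAX_INTERVAL = {
    "vm_restore_test": timedelta(days=92),   # quarterly
    "failover_test": timedelta(days=183),    # semi-annual
    "full_dr_drill": timedelta(days=366),    # annual
}


def overdue_tests(last_run: dict, today: date) -> list:
    """Return the names of tests that never ran or ran too long ago."""
    return sorted(
        name
        for name, limit in MAX_INTERVAL.items()
        if name not in last_run or today - last_run[name] > limit
    )


# Hypothetical history: no full DR drill has been recorded yet.
last_run = {
    "vm_restore_test": date(2025, 1, 1),
    "failover_test": date(2025, 1, 1),
}

print(overdue_tests(last_run, today=date(2025, 6, 1)))
# ['full_dr_drill', 'vm_restore_test']
```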

Q: Does DR also protect against ransomware?

Yes, but only with air-gapped or immutable backups. PBS provides client-side encryption, which keeps backup contents unreadable on the storage side. Object lock (on S3 backends) additionally prevents backups from being overwritten or deleted, which is what protects them from ransomware. Important: keep at least one copy offsite and immutable.

Q: What does a DR strategy cost?

Backup DR (RTO 8-24h): from €1,500 setup + monthly storage. Warm standby (RTO 1-4h): from €6,000 setup + second hosting. Hot standby (RTO <5min): from €18,000 setup + stretched cluster. We calculate TCO based on your RTO/RPO requirements and compare with the risk value (downtime costs × probability).
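The comparison with the risk value (downtime costs × probability) can be sketched as follows. All figures below are hypothetical placeholders, not pricing or customer data, and the function names are made up for this example.

```python
def risk_value(downtime_cost_per_hour: float, outage_hours: float,
               annual_probability: float) -> float:
    """Expected yearly loss: downtime costs multiplied by probability."""
    return downtime_cost_per_hour * outage_hours * annual_probability


def dr_pays_off(risk_without_dr: float, risk_with_dr: float,
                annual_dr_cost: float) -> bool:
    """DR is worthwhile if the avoided risk exceeds what it costs per year."""
    return (risk_without_dr - risk_with_dr) > annual_dr_cost


# Hypothetical inputs: 2,000 EUR per hour of downtime, 10 % chance per year.
without_dr = risk_value(2000, outage_hours=72, annual_probability=0.1)
with_dr = risk_value(2000, outage_hours=4, annual_probability=0.1)

print(round(without_dr), round(with_dr))
print(dr_pays_off(without_dr, with_dr, annual_dr_cost=6000))
```

The same calculation can be repeated per workload, since downtime costs and acceptable outage durations usually differ between systems.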

Hardware failure, data center outage, ransomware attack. With defined RTO/RPO values, documented failover runbooks and regular DR drills, we bring your workloads back online in minutes or hours — not days.

Leading companies worldwide trust WZ-IT

The following are trademarks of their respective owners: Proxmox VE (Proxmox Server Solutions GmbH). WZ-IT is an independent service provider and has no business, partnership, or contractual relationship with these companies. We offer independent migration, installation, hosting, and operations services.

Why DR strategy?

Backups alone are not enough

A good backup system means you don't lose data. A DR strategy means you come back online — in a planned, defined timeframe.

Without clear RTO/RPO values, without failover runbooks and without regular tests, your "disaster recovery" is just hope.

RTO

Recovery Time Objective: How long does recovery take?

RPO

Recovery Point Objective: How much data can be lost?

BIA

Business Impact Analysis: What does one hour of downtime cost?

DR Drill

Regular failover tests — otherwise nothing is validated.

Three DR Tiers

Which DR tier fits your business?

From cost-effective backup DR to hot standby with failover in under 5 minutes.

Backup DR

RTO: 8-24h

RPO: 24h

PBS backups stored offsite (e.g. Hetzner Storage Box). In a disaster, workloads are restored to replacement hardware.

PBS 4.2 with S3 object storage
Encrypted offsite replication
Quarterly restore tests
Replacement hardware provisioning

Warm Standby

RTO: 1-4h

RPO: 1-4h

Second Proxmox setup at a separate location. Asynchronous ZFS/storage replication, manual failover.

Cross-datacenter replication
ZFS send/receive or Ceph RBD
Documented failover runbooks
Semi-annual DR drills

Hot Standby (HA)

RTO: < 5 min

RPO: < 1 min

Stretched cluster across two locations. Synchronous replication, automatic failover via HA Rules.

Stretched Ceph cluster
Synchronous replication
Automatic failover (HA Rules)
Quarterly failover tests
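As a rough guide to choosing between the three tiers, the sketch below picks the first tier whose worst-case RTO and RPO still satisfy a given requirement. The upper bounds are taken from the tier descriptions above; using the worst case of each range is a conservative assumption of this example, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    worst_rto: timedelta
    worst_rpo: timedelta


# Upper bounds of each tier, ordered from least to most effort.
TIERS = [
    Tier("Backup DR", timedelta(hours=24), timedelta(hours=24)),
    Tier("Warm Standby", timedelta(hours=4), timedelta(hours=4)),
    Tier("Hot Standby (HA)", timedelta(minutes=5), timedelta(minutes=1)),
]


def pick_tier(required_rto: timedelta, required_rpo: timedelta) -> Optional[str]:
    """Return the first tier that meets both objectives, or None."""
    for tier in TIERS:
        if tier.worst_rto <= required_rto and tier.worst_rpo <= required_rpo:
            return tier.name
    return None


print(pick_tier(timedelta(hours=24), timedelta(hours=24)))  # Backup DR
print(pick_tier(timedelta(hours=8), timedelta(hours=4)))    # Warm Standby
print(pick_tier(timedelta(hours=4), timedelta(hours=1)))    # Hot Standby (HA)
```

Under this conservative reading, a requirement of RTO 4h / RPO 1h lands in the hot standby tier because a warm standby can lose up to 4 hours of data. In practice a warm standby with a shorter replication interval might also meet it, which is why the final choice is made per workload.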

Our Process

From risk analysis to failover drill

Risk & Impact Analysis

Which workloads are business-critical? What does one hour of downtime cost? We define RTO/RPO per workload.

DR Strategy & Architecture

Backup DR, warm or hot standby? Design of replication topology and failover mechanisms.

Implementation

Building the DR setup, configuring replication and backups, creating the runbooks.

DR Drills & Continuous Testing

Regular failover tests in test environments, automated restore validation, annual full DR drill.

Let's Talk About Your Idea

Whether you have a specific IT challenge or just an idea, we look forward to talking with you. In a brief conversation, we will work out together whether and how your project fits with WZ-IT.

E-Mail

[email protected]
