Building an HA Cluster with Proxmox: High Availability Step by Step

Timo Wevelsiep • Updated: 29.06.2026

Editorial note: Versions, commands and prices may change. Please verify critical steps independently before production use. This guide does not replace individual consulting.

An HA cluster (high availability) with Proxmox VE ensures your virtual machines survive a hardware or node failure: if a server goes down, Proxmox automatically restarts the affected VMs on another, healthy node. This guide shows how such a cluster is built, what its requirements are and how to set it up step by step.

One important note up front: Proxmox HA is restart-based, not seamless. It reduces downtime to roughly the time needed to detect the failure and boot the VM on another node, but it does not deliver "zero downtime". If you are looking for the basics, see What is Proxmox?.

Requirements

A production HA cluster needs a solid foundation:

At least three nodes. Quorum (the cluster's ability to make decisions) requires the majority of votes. With three nodes, the cluster survives the loss of one node and stays operational. Two nodes only make sense with a QDevice as a third vote (more on this below).
A reliable, fast network. Cluster communication runs over Corosync and is sensitive to latency. Aim for latencies below 5 ms between nodes and give Corosync a dedicated network (its own NIC), separate from VM and storage traffic.
Synchronized clocks and identical versions. Run NTP on all nodes, keep the same Proxmox VE version everywhere, and make sure TCP port 22 (SSH) is reachable between the nodes.
HA-capable storage. The VM disks must be available on multiple nodes - via distributed shared storage (Ceph) or via ZFS replication.
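The latency requirement above can be checked before the cluster is built. The following is a minimal sketch, assuming Linux iputils `ping` output; the helper names `avg_rtt_ms` and `latency_ok` are illustrative and not part of Proxmox, and the address in the usage comment is a placeholder.

```shell
# Extract the average round-trip time (ms) from a ping summary on stdin.
# Assumes the iputils format: "rtt min/avg/max/mdev = a/b/c/d ms".
avg_rtt_ms() {
    awk -F'/' '/^rtt / { print $5 }'
}

# Succeed if the measured RTT ($1, in ms) is below the limit ($2, in ms).
latency_ok() {
    awk -v rtt="$1" -v max="$2" 'BEGIN { exit !(rtt + 0 < max + 0) }'
}

# Usage on a node (10.10.10.2 is a placeholder for another node's Corosync address):
#   rtt=$(ping -c 10 -q 10.10.10.2 | avg_rtt_ms)
#   latency_ok "$rtt" 5 && echo "latency OK: ${rtt} ms"
```

A short ping run only gives a snapshot. Latency spikes under load matter just as much, so it is worth repeating the measurement while storage and VM traffic are active.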

Step 1: Create the cluster

On the first node, create the cluster:

pvecm create my-cluster

You can check the status at any time with:

pvecm status
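For scripts and monitoring, the quorum state can be read from that status output. This is a minimal sketch, assuming the output contains a line of the form `Quorate: Yes`; the helper name `is_quorate` is illustrative, and the exact output layout may differ between versions.

```shell
# Succeed if the status text on stdin reports the cluster as quorate.
# Assumes a line such as "Quorate:          Yes" in the output.
is_quorate() {
    grep -Eq '^Quorate:[[:space:]]+Yes'
}

# Usage on a cluster node:
#   pvecm status | is_quorate && echo "cluster has quorum"
```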

Step 2: Add further nodes

On each additional node, join the existing cluster (the IP is that of an existing node):

pvecm add 10.0.0.1

All nodes then appear in the web interface under "Datacenter" and share their configuration through the cluster file system pmxcfs.
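If Corosync gets its own network as recommended in the requirements, the join command can name the local address on that network with `--link0`. The sketch below only prints the command and executes nothing; the helper `join_cmd` and the 10.10.10.x address are assumptions for illustration.

```shell
# Print the join command for a new node; nothing is executed here.
# $1: IP of an existing cluster node
# $2: this node's own address on the dedicated Corosync network
join_cmd() {
    printf 'pvecm add %s --link0 %s\n' "$1" "$2"
}

# Example: join via 10.0.0.1, using 10.10.10.2 (placeholder) for Corosync traffic.
join_cmd 10.0.0.1 10.10.10.2
```

The cluster itself would then also be created with a matching `--link0` address on the first node, so that all nodes use the same dedicated network.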

Step 3: Understand and secure quorum

In a network partition, only the side holding the majority of votes may keep working. On the minority side, the cluster configuration (pmxcfs) becomes read-only, so VMs can no longer be started or changed there. Therefore:

Three or more nodes are the clean solution.
For a two-node cluster, add a QDevice: a lightweight service (corosync-qnetd) on a third, independent system (e.g. a small VM on other hardware or a Raspberry Pi) that provides the deciding third vote.

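apt install corosync-qnetd          # on the external QDevice host
apt install corosync-qdevice        # on every cluster node
pvecm qdevice setup <QDEVICE-IP>    # once, from one cluster node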
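The majority rule behind quorum is simple arithmetic: with N votes, more than half are needed. A small sketch, with illustrative helper names, assuming one vote per node and one for the QDevice:

```shell
# Votes needed for quorum: a strict majority of all votes ($1).
quorum_votes() {
    echo $(( $1 / 2 + 1 ))
}

# Votes that may be lost while the remaining side still has quorum.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for votes in 2 3 4 5; do
    echo "$votes votes: quorum $(quorum_votes "$votes"), tolerates $(tolerated_failures "$votes") failure(s)"
done
```

Two votes tolerate no failure, while three votes (three nodes, or two nodes plus a QDevice) tolerate one. A fourth vote raises the quorum to three without tolerating an additional failure.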

Step 4: Storage for HA - Ceph or ZFS replication

The VM disks must survive the node failure. Two established approaches:

	Ceph	ZFS replication
Principle	Distributed shared storage, synchronous	Asynchronous replication on an interval
Minimum nodes	3	2 (plus a QDevice for quorum)
Network	10 GbE+ recommended	less critical
Data freshness after failure	current (RPO ~0)	state of the last replication (RPO = interval)
Complexity	higher	low
Best for	production with high data freshness	lean setups with tolerated data loss

Rule of thumb: with three or more nodes and a fast network, Ceph is the gold standard. For small, cost-conscious setups, ZFS replication is a pragmatic near-HA solution, as long as you accept that the changes made since the last replication are missing after a failover.
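With ZFS replication, the interval is set per replication job, for example with the `pvesr` command line tool. The sketch below only prints such a command; the helper `replication_job_cmd`, VM ID 100 and node name pve2 are assumptions for illustration.

```shell
# Print a pvesr command that replicates a VM to another node on a fixed interval.
# Nothing is executed here.
# $1: VM ID, $2: target node name, $3: interval in minutes
replication_job_cmd() {
    printf 'pvesr create-local-job %s-0 %s --schedule "*/%s"\n' "$1" "$2" "$3"
}

# Example: replicate VM 100 to node pve2 every 15 minutes.
replication_job_cmd 100 pve2 15
```

The interval chosen here is the RPO from the table: with a 15-minute schedule, up to about 15 minutes of changes can be missing after a failover.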

Step 5: Enable HA

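High availability is managed by two services that run on every node: the Cluster Resource Manager (pve-ha-crm), of which only one instance is the active master for the whole cluster at any time, and the Local Resource Manager (pve-ha-lrm), which starts and stops the resources on its own node. In practice: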

In the web interface under Datacenter → HA, add the desired VMs/containers as HA resources.
Optionally define HA groups to set which nodes a VM should preferably run on (affinity, priorities). Newer Proxmox VE releases express the same thing as node affinity rules under HA rules.
Set the desired target state (e.g. started).

From now on, the HA stack monitors the resources and automatically restarts them on a node failure.
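The same step can be done on the command line with `ha-manager`. A minimal sketch, assuming VM IDs 100 and 101 as placeholders; the `run` wrapper and the `DRY_RUN` variable are illustrative helpers that print the commands by default instead of executing them.

```shell
# DRY_RUN=1 (default here) only prints commands; set DRY_RUN=0 on a real node.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Register each given VM ID as an HA resource with target state "started".
add_ha_resources() {
    for vmid in "$@"; do
        run ha-manager add "vm:$vmid" --state started
    done
}

add_ha_resources 100 101
```

Afterwards, `ha-manager status` shows the state of the HA stack and of the registered resources.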

Step 6: Fencing and watchdog

To prevent a seemingly dead node from still running a VM while it starts elsewhere (split-brain), Proxmox uses watchdog-based self-fencing. By default the softdog kernel module is used: if a node loses quorum, its watchdog expires and the node reboots itself before the HA resources start elsewhere. For higher requirements, a hardware watchdog or out-of-band fencing (IPMI/iLO) can be added.
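A hardware watchdog is selected through the `WATCHDOG_MODULE` setting in `/etc/default/pve-ha-manager`. The following sketch reports which module a given file selects; the helper name `watchdog_module` is illustrative, and it assumes an unquoted `WATCHDOG_MODULE=<name>` line.

```shell
# Print the watchdog module selected in the given defaults file.
# Falls back to "softdog" when no uncommented WATCHDOG_MODULE line exists
# or the file cannot be read.
# $1: path to the file (on a node: /etc/default/pve-ha-manager)
watchdog_module() {
    module=$(sed -n 's/^WATCHDOG_MODULE=//p' "$1" 2>/dev/null | tail -n 1)
    echo "${module:-softdog}"
}

# Usage on a node:
#   watchdog_module /etc/default/pve-ha-manager
```

A changed watchdog module only takes effect after the node has been rebooted.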

Step 7: Test failover

An HA cluster is only as good as its tested failure case. Simulate a node failure (hard power-off or disconnect the node) and check whether the HA VMs restart on another node within the expected time. Also document how to fence a node manually.
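To put a number on "the expected time", the failover can be timed. This is a generic polling sketch: `wait_until` is an illustrative helper, and the check command passed to it (here the hypothetical `vm_is_running`) has to be supplied by you, for example by evaluating `ha-manager status` on a surviving node.

```shell
# Poll a check command once per second until it succeeds or the timeout passes.
# Prints the elapsed seconds on success; returns 1 on timeout.
# $1: timeout in seconds, remaining arguments: the check command
wait_until() {
    timeout=$1
    shift
    elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "$elapsed"
}

# Usage right after powering off the node (vm_is_running is hypothetical):
#   secs=$(wait_until 600 vm_is_running 100) && echo "VM 100 back after ${secs}s"
```

Recording this value for every test run gives a baseline that later tests can be compared against.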

Common mistakes

Two-node cluster without a QDevice. As soon as one node fails, the remaining node loses quorum and goes read-only, so HA cannot reliably recover anything.
Corosync over the storage or VM network. Latency spikes lead to quorum loss and unnecessary fencing actions. Corosync belongs on its own, stable network (ideally redundant).
Confusing HA with "zero downtime". HA means automatic restart, not seamless operation.
No fencing planned. Without a watchdog or fencing, you risk data corruption from split-brain.
Neglecting backups. HA replicates errors and ransomware too. A separate backup - for example with encrypted Proxmox backups on a Hetzner Storage Box - remains mandatory.

Operations and support

An HA cluster is manageable to set up but demanding to run: quorum design, the Corosync network, storage strategy, patch management and tested restores all have to fit together cleanly. If you are planning a Proxmox setup on Hetzner or would rather not run the cluster yourself, we handle design, build and operations - details on our Proxmox & Private Cloud page.


Frequently Asked Questions

Answers to the most important questions

What is the minimum number of nodes for an HA Proxmox cluster?

For stable quorum, at least three nodes are recommended. With only two nodes, the cluster loses the majority on a failure and goes read-only - which a QDevice (a third, lightweight vote on a separate system) can mitigate. For real production, three full nodes are the clean approach.

Does high availability mean zero downtime?

No. Proxmox HA automatically restarts a VM on another node after a node fails - so it is restart-based, not seamless. There is a short outage until the VM has rebooted. Seamless moving (live migration) is a separate feature and only works for planned maintenance, not for an unplanned failure.

Ceph or ZFS replication - which is better for HA?

Ceph is distributed, synchronous shared storage and the gold standard with three or more nodes and a fast network (10 GbE+); after a failure the data is current. ZFS replication is asynchronous, simpler and usable even with two nodes, but only replicates the VM disks on an interval - on failover the changes since the last replication are lost. Ceph for maximum data freshness, ZFS replication for lean setups with a tolerated RPO.

Do I still need backups despite HA?

Absolutely. HA protects against hardware and node failures, not against data loss from ransomware, operator error or corrupted data - those are replicated along with everything else by Ceph or ZFS replication. A separate, ideally offsite backup (e.g. with Proxmox Backup Server) remains mandatory.

What is fencing and why does it matter?

Fencing ensures that a node considered failed does not keep running a VM while it restarts elsewhere (split-brain with data corruption). Proxmox uses watchdog-based self-fencing for this (by default the softdog software watchdog; a hardware watchdog can be configured instead): if a node loses quorum, its watchdog expires and the node reboots itself before the HA resources start elsewhere.
