cv4pve-metrics-exporter is a Prometheus metrics exporter designed to provide comprehensive monitoring data for Proxmox VE virtualization infrastructure.
Key Features:

- Exports detailed Prometheus metrics for monitoring and alerting.
- Monitors node health, VM/container resource usage, storage capacity, replication status, and High Availability (HA) resources.
- Offers customizable endpoints and supports API token authentication for secure access.

Audience & Benefit:

Ideal for Proxmox VE administrators seeking to enhance infrastructure monitoring. This tool enables proactive management of virtualized environments by providing actionable insights into resource utilization, system health, and HA configurations.
The software is available via winget for Windows users.
```sh
cv4pve-metrics-exporter --host=YOUR_HOST --api-token=... run          # Standard (default)
cv4pve-metrics-exporter --host=YOUR_HOST --api-token=... run --fast   # Fast
cv4pve-metrics-exporter --host=YOUR_HOST --api-token=... run --full   # Full
```
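Once the exporter is running you can sanity-check the endpoint with a manual scrape. The port below is a placeholder, not a documented default (check the command's `--help` for the actual listen address and port):

```sh
# 9221 is an illustrative placeholder port
curl -s http://localhost:9221/metrics | grep '^cv4pve_' | head
```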
Profiles comparison
| Setting | Fast | Standard | Full |
|---|---|---|---|
| **Cluster** | | | |
| HA state | ✓ (no cache) | ✓ (no cache) | ✓ (cache 30s) |
| BackupInfo | ✓ (no cache) | ✓ (cache 10m) | ✓ (cache 10m) |
| **Node** | | | |
| Status (memory/swap/load/uptime + version) | — | ✓ | ✓ |
| Subscription | — | ✓ (cache 1h) | ✓ (cache 1h) |
| Replication | — | ✓ (no cache) | ✓ (cache 1m) |
| DiskSmart | — | — | ✓ (cache 10m) |
| **Guest** | | | |
| Balloon (1 RPC per running QEMU) | — | — | ✓ |
| **Other** | | | |
| API instrumentation | — | ✓ | ✓ |
> Fast profile skips all per-node calls — good for very large clusters where scrape latency matters more than per-node detail.
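To consume the metrics, add a scrape job to Prometheus. This is a minimal sketch; the target host, port, and interval are assumptions to adapt to your deployment:

```yaml
scrape_configs:
  - job_name: "cv4pve"
    scrape_interval: 60s   # a relaxed interval pairs well with cached collectors
    static_configs:
      - targets: ["exporter.example.local:9221"]   # placeholder host:port
```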
Features
- Lock visibility — `cv4pve_guest_lock{state}` exploded series for clean alerting (backup, snapshot, migrate, …)
- Subscription monitoring — info, exploded status and next due date
- HA state — exploded series for both guests (`cv4pve_ha_state`) and nodes (`cv4pve_ha_node_state`), plus `cv4pve_ha_quorate`
- Replication on every node — full cluster coverage
- SMART disk health — wearout and health per disk (opt-in, cached)
- Node version per node — `cv4pve_node_version_info` with version/release/repoid
- Overcommit detection — `cv4pve_node_cpu_assigned_cores` and `cv4pve_node_memory_assigned_bytes`
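As a sketch, the exploded series above keep alert expressions trivial (the selections here are illustrative, not shipped alert rules):

```promql
# Guest stuck in a backup lock
cv4pve_guest_lock{state="backup"} == 1

# Cluster has lost HA quorum
cv4pve_ha_quorate == 0
```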
```sh
# Create dedicated user (recommended)
pveum user add metrics@pve

# Grant read-only permissions
pveum aclmod / -user metrics@pve -role PVEAuditor

# Create token (save the secret — shown only once!)
pveum user token add metrics@pve metrics --privsep 0
```
Running as a Service
The binary is service-aware out of the box — it integrates natively with both systemd (Linux) and Windows SCM. Run it interactively during development, then promote the same binary to a managed service in production without any wrapper.
Linux (systemd)
Supports Type=notify — systemd is informed when the exporter is ready and gets proper graceful shutdown on systemctl stop.
```ini
# /etc/systemd/system/cv4pve-metrics-exporter.service
[Unit]
Description=Proxmox VE Metrics Exporter for Prometheus
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
User=prometheus
Group=prometheus
ExecStart=/usr/local/bin/cv4pve-metrics-exporter \
    --host=pve.local \
    --api-token=metrics@pve!metrics=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx \
    --settings-file=/etc/cv4pve/metrics-exporter.json \
    run
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target
```
Enable and start:
```sh
# Create a dedicated unprivileged user (optional)
sudo useradd -r -s /bin/false prometheus

# Place the settings file somewhere the service user can read
sudo install -d /etc/cv4pve
sudo install -m 640 -o root -g prometheus settings.json /etc/cv4pve/metrics-exporter.json

sudo systemctl daemon-reload
sudo systemctl enable --now cv4pve-metrics-exporter
sudo systemctl status cv4pve-metrics-exporter
sudo journalctl -u cv4pve-metrics-exporter -f
```
> Note: always use absolute paths for --settings-file — systemd does not set a working directory by default.
Windows (native service)
The binary integrates with Windows SCM directly.
# Install as a Windows service
```powershell
# Install as a Windows service
sc.exe create cv4pve-metrics-exporter `
    binPath= "C:\Tools\cv4pve-metrics-exporter\cv4pve-metrics-exporter.exe --host=pve.local --api-token=metrics@pve!metrics=xxx --settings-file=C:\ProgramData\cv4pve\metrics-exporter.json run" `
    start= auto `
    DisplayName= "Corsinvest Proxmox VE Metrics Exporter"

# Start / stop
sc.exe start cv4pve-metrics-exporter
sc.exe stop cv4pve-metrics-exporter

# View logs in Event Viewer → Windows Logs → Application
```
> Note on quoting: `sc.exe` is picky — the space after `binPath=` (and after every other `option=`) is required.
Docker
Run the binary in a minimal container — no special flags needed, SIGTERM is handled.
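A minimal invocation could look like the following; the image name, host port, and mount paths are placeholders, not an official image reference:

```sh
# Placeholders: registry/image, port mapping, and paths are illustrative
docker run -d --name cv4pve-metrics-exporter \
  -p 9221:9221 \
  -v /etc/cv4pve:/etc/cv4pve:ro \
  your-registry/cv4pve-metrics-exporter:latest \
  --host=pve.local --api-token=... \
  --settings-file=/etc/cv4pve/metrics-exporter.json run

# Graceful stop (SIGTERM is handled by the binary)
docker stop cv4pve-metrics-exporter
```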
Customize the exporter by creating and editing a settings.json file:
```sh
# Step 1 — generate a settings file (pick your starting profile)
cv4pve-metrics-exporter create-settings         # Standard (default)
cv4pve-metrics-exporter create-settings --fast  # Fast
cv4pve-metrics-exporter create-settings --full  # Full

# Step 2 — edit settings.json to your needs

# Step 3 — run with your custom settings
cv4pve-metrics-exporter --host=YOUR_HOST --api-token=... --settings-file=settings.json run
```
Each collector exposes two knobs:

- `Enabled` — turn the collector on/off
- `CacheSeconds` — TTL of the cached result (0 = always refresh)
Cache is the recommended way to keep slow-changing data (SMART, subscription, backup info) up-to-date without hammering the Proxmox API on every scrape.
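A collector entry in settings.json then looks roughly like this. The nesting shown is a sketch; generate the authoritative structure with create-settings and edit from there:

```json
{
  "Node": {
    "Subscription": { "Enabled": true, "CacheSeconds": 3600 },
    "DiskSmart":    { "Enabled": true, "CacheSeconds": 600 }
  }
}
```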
Response files

Options can also be loaded from a response file passed with `@`:

```sh
cv4pve-metrics-exporter @config.rsp run
cv4pve-metrics-exporter @config.rsp --settings-file=settings.json run
cv4pve-metrics-exporter @config.rsp run --full
```
- One token per line (option name and value on separate lines)
- Lines starting with `#` are comments
- Response files can be nested: a line starting with `@` references another file
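Putting those rules together, a hypothetical config.rsp could look like:

```text
# config.rsp: connection options shared by every invocation
--host
pve.local
--api-token
metrics@pve!metrics=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

# pull further defaults from a nested response file
@common.rsp
```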
Exported Metrics
All metrics are prefixed with `cv4pve_`.
Core (always on)
| Metric | Type | Labels | Description |
|---|---|---|---|
| `cv4pve_up` | gauge | `id`, `type` | 1 if resource is online/running/available |
| `cv4pve_cluster_info` | gauge | `name`, `version` | Cluster info (always 1) |
| `cv4pve_cluster_quorate` | gauge | `name` | 1 if cluster is quorate |
| `cv4pve_cluster_nodes` | gauge | `name` | Number of nodes in the cluster |
| `cv4pve_node_info` | gauge | `id`, `name`, `ip`, `level` | Node info (always 1) |
| `cv4pve_guest_info` | gauge | `id`, `vmid`, `node`, `name`, `type`, `tags`, `template` | VM/CT info (always 1, tags sorted to prevent churn) |
| `cv4pve_guest_lock` | gauge | `id`, `state` | 1 if guest matches lock state — backup/clone/create/migrate/rollback/snapshot/snapshot-delete/suspended/suspending |
| `cv4pve_storage_info` | gauge | `id`, `node`, `storage`, `content` | Storage info (always 1, content is sorted CSV) |
| `cv4pve_storage_shared` | gauge | `id` | 1 if storage is shared across nodes |
Self-monitoring (always on)
| Metric | Type | Labels | Description |
|---|---|---|---|
| `cv4pve_scrape_duration_seconds` | gauge | — | Duration of the last scrape |
| `cv4pve_scrape_last_success_timestamp_seconds` | gauge | — | Unix timestamp of the last successful scrape |
| `cv4pve_scrape_errors_total` | counter | `section` | Total number of errors per scrape section |
Guest (per VM/CT, always on)
| Metric | Type | Labels | Description |
|---|---|---|---|
| `cv4pve_guest_cpu_usage_ratio` | gauge | `id` | CPU usage (0..1) |
| `cv4pve_guest_cpu_cores` | gauge | `id` | CPU cores allocated |
| `cv4pve_guest_memory_size_bytes` | gauge | `id` | Configured memory |
| `cv4pve_guest_memory_usage_bytes` | gauge | `id` | Used memory |
| `cv4pve_guest_memory_host_ratio` | gauge | `id` | Guest memory usage over host total (0..1) |
| `cv4pve_guest_disk_size_bytes` | gauge | `id` | Disk total size |
| `cv4pve_guest_disk_usage_bytes` | gauge | `id` | Disk used |
| `cv4pve_guest_uptime_seconds` | gauge | `id` | Uptime |
| `cv4pve_guest_disk_read_bytes_total` | counter | `id` | Total bytes read |
| `cv4pve_guest_disk_write_bytes_total` | counter | `id` | Total bytes written |
| `cv4pve_guest_network_receive_bytes_total` | counter | `id` | Total bytes received |
| `cv4pve_guest_network_transmit_bytes_total` | counter | `id` | Total bytes transmitted |
Storage (per storage, always on)
| Metric | Type | Labels | Description |
|---|---|---|---|
| `cv4pve_storage_size_bytes` | gauge | `id` | Storage total size |
| `cv4pve_storage_usage_bytes` | gauge | `id` | Storage used |
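Storage fill level, for example, falls out directly (a label join with `cv4pve_storage_info` can add node/storage names if needed):

```promql
# Fraction of each storage used (0..1)
cv4pve_storage_usage_bytes / cv4pve_storage_size_bytes
```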
Node Status (if Node.Status enabled)
| Metric | Type | Labels | Description |
|---|---|---|---|
| `cv4pve_node_uptime_seconds` | gauge | `node` | Uptime |
| `cv4pve_node_load_avg1` | gauge | `node` | Load average 1 min |
| `cv4pve_node_load_avg5` | gauge | `node` | Load average 5 min |
| `cv4pve_node_load_avg15` | gauge | `node` | Load average 15 min |
| `cv4pve_node_memory_used_bytes` | gauge | `node` | Memory used |
| `cv4pve_node_memory_total_bytes` | gauge | `node` | Memory total |
| `cv4pve_node_memory_assigned_bytes` | gauge | `node` | Sum of configured memory of running guests on this node |
| `cv4pve_node_swap_used_bytes` | gauge | `node` | Swap used |
| `cv4pve_node_swap_total_bytes` | gauge | `node` | Swap total |
| `cv4pve_node_root_fs_used_bytes` | gauge | `node` | Root FS used |
| `cv4pve_node_root_fs_total_bytes` | gauge | `node` | Root FS total |
| `cv4pve_node_cpu_assigned_cores` | gauge | `node` | Sum of CPU cores allocated to running guests |
| `cv4pve_node_version_info` | gauge | `node`, `version`, `release`, `repoid` | Node Proxmox VE version (always 1) |
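The assigned-resource gauges make per-node capacity checks one-liners, for example:

```promql
# Memory overcommit ratio per node (> 1 means more memory promised than installed)
cv4pve_node_memory_assigned_bytes / cv4pve_node_memory_total_bytes

# Root filesystem usage fraction
cv4pve_node_root_fs_used_bytes / cv4pve_node_root_fs_total_bytes
```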
Subscription (if Node.Subscription enabled)
| Metric | Type | Labels | Description |
|---|---|---|---|
| `cv4pve_node_subscription_info` | gauge | `node`, `level` | Subscription info (always 1) |
| `cv4pve_node_subscription_status` | gauge | `node`, `status` | 1 if matches status — active/expired/new/notfound/invalid/suspended |
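The exploded status makes expiry alerting a plain equality, for example:

```promql
# Any node whose subscription is no longer active
cv4pve_node_subscription_status{status=~"expired|invalid|suspended"} == 1
```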
API Instrumentation (if ApiInstrumentation = true)
| Metric | Type | Labels | Description |
|---|---|---|---|
| `cv4pve_api_request_duration_seconds` | histogram | `method`, `endpoint` | Duration of Proxmox API requests (path is normalized: `{node}`, `{vmid}`, `{upid}`) |
| `cv4pve_api_request_errors_total` | counter | `method`, `endpoint` | Failed Proxmox API requests |
Performance Tuning
Increase parallelism
By default the exporter runs up to 5 parallel API requests per scrape (`MaxParallelRequests = 5`). Raise it in `settings.json`:

```json
"MaxParallelRequests": 10
```
> Don't go too high. Each parallel request is a real HTTP call to Proxmox. Values between 5 and 15 are a reasonable range. On very large clusters, prefer the Fast profile or aggressive cache TTL over increasing parallelism.
Cache TTL for slow-changing data
Set CacheSeconds per collector to avoid hammering Proxmox on every scrape:
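For instance (a fragment sketch; the exact nesting comes from your generated settings.json):

```json
"BackupInfo":   { "Enabled": true, "CacheSeconds": 600 },
"Subscription": { "Enabled": true, "CacheSeconds": 3600 }
```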
The metric remains visible to Prometheus between refreshes — only the underlying API call is skipped.
Minimize API calls
The Fast profile skips all per-node toggles (Status, Subscription, Replication) — only the cluster-wide bulk calls plus HA. Ideal for very large clusters or high scrape frequencies.
Debug API endpoints
Enable `ApiInstrumentation` (default on) and look at `cv4pve_api_request_duration_seconds` to identify which endpoints are slowest:

```promql
# Average latency per endpoint over last 5 min
rate(cv4pve_api_request_duration_seconds_sum[5m])
  / rate(cv4pve_api_request_duration_seconds_count[5m])

# p99 latency
histogram_quantile(0.99, rate(cv4pve_api_request_duration_seconds_bucket[5m]))
```
Summary
| Setting | Effect | Default |
|---|---|---|
| `MaxParallelRequests` ↑ | Faster, but more load on Proxmox | 5 |
| `*.CacheSeconds` ↑ | Fewer API calls for slow data; metrics held between refreshes | 0 / 600 / 3600 (varies) |
| `ApiInstrumentation` | Per-endpoint latency histograms — handy for tuning | on |
| Fast profile | Skips all per-node calls | off |
Support
Professional support and consulting available through Corsinvest.