Voltage Park is a story of turnaround and redemption. Had we done this review in 2023 or 2024, the story would have been very different. We are rating the current Voltage Park, and the current Voltage Park is a meaningfully improved provider focused exclusively on H100 GPUs. As of our testing, their on-demand capacity appears to be regularly sold out. The company is shipping features at a rapid pace, including a recently launched SLURM service and OIDC integration for Kubernetes. Voltage Park offers one of the lowest prices in the industry.
Our initial experience with SLURM involved a lot of provisioning challenges, with multiple attempts required to spin up our test cluster. Once provisioned, we would be load balanced to different login nodes and hit the now-classic SonK issue of not being able to run code: no git, vim, or nano, and no sudo permissions. However, Voltage Park is the only provider with these SonK issues that seemed to be aware of them, suggesting a kubectl exec command to access the login pod instead of the original ssh via public IP. While this container-first approach let us work around the initial root permission issue, it still takes time to install software, and if your connection gets reset, all software installs go away. In other words, login pods are stateless. The Voltage Park engineering team committed to building a new container image for login pods that included the software needed to run SLURM jobs and edit code, and they delivered just that in under 24 hours. We were impressed by the commitment to customer support.
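For readers unfamiliar with this workaround, here is a minimal sketch of the kind of command involved. We do not reproduce the exact command Voltage Park suggested; the pod name `slurm-login-0` and the namespace `slurm` are hypothetical placeholders, and only the standard `kubectl exec -it -n <namespace> <pod> -- <shell>` form is assumed.

```python
import shlex


def login_pod_exec_argv(pod: str, namespace: str, shell: str = "/bin/bash") -> list[str]:
    """Build the argv for an interactive shell inside a login pod.

    Everything after "--" is the command executed inside the container.
    """
    return ["kubectl", "exec", "-it", "-n", namespace, pod, "--", shell]


# Hypothetical pod and namespace names, for illustration only.
argv = login_pod_exec_argv("slurm-login-0", "slurm")

# Print the command rather than running it, so the sketch has no side effects.
print(shlex.join(argv))
```

Because the login pod is stateless, anything installed from such a shell lives only as long as the container does, which is why a rebuilt image with the tools baked in is the more durable fix.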
Source: Spinning up a SonK cluster in Voltage Park, right from the console
Inside the intended container environment, the setup is more robust. We found a correctly configured topology.conf for network-aware scheduling, SLURM prolog and epilog scripts in place, and a modern container toolkit with pyxis and enroot installed. Interconnect performance was strong, running collectives at expected bandwidth. We also saw good download speeds, and a reasonably fast shared filesystem.
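As a rough illustration of what "expected bandwidth" means for a collective, the sketch below converts a measured all-reduce time into bus bandwidth using the convention from NVIDIA's nccl-tests, where bus bandwidth equals algorithm bandwidth scaled by 2(n-1)/n. The message size, timing, and rank count are hypothetical numbers chosen for the example, not measurements from our Voltage Park cluster.

```python
def all_reduce_busbw_gbps(size_bytes: float, seconds: float, n_ranks: int) -> float:
    """Return all-reduce bus bandwidth in GB/s (nccl-tests convention).

    algbw = size / time
    busbw = algbw * 2 * (n - 1) / n
    """
    if n_ranks < 2:
        raise ValueError("all-reduce bus bandwidth needs at least 2 ranks")
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    algbw = size_bytes / seconds / 1e9  # algorithm bandwidth in GB/s
    return algbw * 2 * (n_ranks - 1) / n_ranks


# Hypothetical measurement: an 8 GB all-reduce across 16 ranks taking 50 ms.
busbw = all_reduce_busbw_gbps(size_bytes=8e9, seconds=0.05, n_ranks=16)
print(f"bus bandwidth: {busbw:.1f} GB/s")
```

Bus bandwidth is the figure to compare against the hardware's link speed, since the 2(n-1)/n factor removes the dependence on the number of ranks.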
Operationally, we encountered two major points of concern. First, Voltage Park’s dashboard has a “Shutdown” function that is distinct from “Terminate”. “Shutdown” halts the instances but continues to bill for the reserved capacity, a nuance that is not made sufficiently clear in the UI and that we expect is a disaster waiting to happen. Notably, not a single other provider offers distinct “Shutdown” and “Terminate” options, and even after discussing the purpose of the “Shutdown” button with the Voltage Park team, we still find the intended use case very confusing. We recommend
Second, their process for handling hardware failures in on-demand clusters is manual, requiring operator intervention to cycle nodes out of a user’s cluster. This is a far cry from the automated, resilient systems offered by top-tier providers. The same gap shows up in security patching: the cluster came pre-installed with NVIDIA Container Toolkit version 1.17.4, which was 9 months out of date and, as discussed previously in this article, vulnerable to CVE-2025-23266 (NVIDIAScape) and CVE-2025-23267, with CVSS scores of 9.0 (“Critical”) and 8.5 (“High”) out of 10, respectively.
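A check like this is straightforward to automate. The sketch below compares an installed toolkit version against a minimum patched version. The threshold of 1.17.8 is our assumption of the first release containing the fixes and should be verified against NVIDIA's security bulletin; the example version string format is also an assumption.

```python
import re

# Assumption: first NVIDIA Container Toolkit release with the fixes.
# Verify against NVIDIA's security bulletin before relying on it.
PATCHED_VERSION = (1, 17, 8)


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first MAJOR.MINOR.PATCH triple found in a version string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_vulnerable(version_text: str, patched: tuple[int, ...] = PATCHED_VERSION) -> bool:
    """Return True if the reported version is older than the patched release."""
    return parse_version(version_text) < patched


# Version reported on the cluster we tested; the surrounding text is illustrative.
reported = "NVIDIA Container Toolkit CLI version 1.17.4"
print(f"{parse_version(reported)} vulnerable: {is_vulnerable(reported)}")
```

Comparing integer tuples rather than raw strings avoids the classic mistake of treating 1.17.10 as older than 1.17.8.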
In conclusion, we believe that Voltage Park now has a solid technical foundation to carry forward and recover from reputational issues. We are encouraged by the execution of the technical team, and look forward to seeing more improvements in the future.