# CoreWeave (Platinum) — ClusterMAX GPU Cloud Review
> CoreWeave earns a ClusterMAX 2.0 Platinum rating from SemiAnalysis. Since our last article, CoreWeave has made some significant announcements. They raised $1.5B in an IPO on the NASDAQ, trading under $CRWV, and their share price is up over 200% in 6 months. They have announced three expansions with…
- **Provider**: CoreWeave
- **ClusterMAX Tier**: Platinum
- **Tier definition**: Best in class. Consistently excels across all evaluation criteria and commands a pricing premium.
- **Authors**: Jordan Nanos, Daniel Nishball, Dylan Patel (SemiAnalysis)
- **Published**: 2025-11-06 (Nov 06, 2025)
- **Last updated**: 2025-11-06 (Nov 06, 2025)
- **Source**: ClusterMAX 2.0
- **Canonical URL**: https://www.clustermax.ai/cloudreview/coreweave
- **Source article**: https://newsletter.semianalysis.com/p/clustermax-20-the-industry-standard
- **Topics**: CoreWeave review, CoreWeave GPU cloud, CoreWeave ClusterMAX rating, CoreWeave Platinum, Platinum tier GPU cloud, GPU cloud review, neocloud review, CoreWeave GB200 NVL72, CoreWeave GB200, CoreWeave GB300, CoreWeave B200, CoreWeave H100, GB200 NVL72 cloud, GB200 cloud, GB300 cloud, B200 cloud, H100 cloud, InfiniBand, RoCE, Kubernetes, Slurm, NCCL, DCGM, bare metal, managed Slurm, ClusterMAX 2.0, SemiAnalysis
---
Since our last article, CoreWeave has made some significant announcements. They raised $1.5B in an IPO on the NASDAQ, trading under $CRWV, and their share price is up over 200% in 6 months.
They have announced three expansions with OpenAI: $11.9B in March, $4B in May, and $6.5B in September, bringing the total commitment to $22.4B. These deals are targeting training. CoreWeave also landed Meta as a new customer in September, signing a $14.2B deal over 6 years to 2031.
They have also announced an all-stock acquisition of Core Scientific in July for $9B, which represents a 1.3GW expansion of datacenter footprint. They also committed £1.5B to the UK, and $6B in Lancaster, Pennsylvania.
Note: for a blow-by-blow tracker of CoreWeave bringing capacity online, readers can see our datacenter industry model:
They acquired Weights & Biases in May, and OpenPipe in September. We will discuss the W&B integration below.
They also launched CoreWeave Ventures, a fund to invest in AI companies. We believe this is in direct response to strategic investment practices such as AWS Activate (up to $300k in credits), Microsoft for Startups (up to $150k in credits), Google for Startups Cloud Program (up to $350k in credits), and the NVIDIA Inception + NVentures program.
CoreWeave has announced some of the world’s first GB200 NVL72 and GB300 NVL72 deployments, as well as some RTX 6000 Pro Blackwell instances, setting the stage for future diversification with Rubin CPX.
In terms of our testing, CoreWeave’s clusters meet all criteria, and set the bar for other Neoclouds to follow. This is why we are aware of multiple examples where CoreWeave is able to command a higher price for managed slurm or Kubernetes clusters (by roughly 10-15%, per GPU-hr) vs their direct competition such as Nebius, Fluidstack, Crusoe, Lambda, and Together.ai. Indeed, CoreWeave’s pricing is closer to the pricing of the big 4 hyperscalers. The main challenge that we perceive for CoreWeave going forward is to continue innovating and differentiating vs their competition, so as to maintain this pricing power. We believe they will be successful doing so for the GB200 and GB300 NVL72 generation.
In this section we will talk about what is new at CoreWeave, and how they keep setting the bar for others in the industry to follow.
* Slurm-on-Kubernetes
* Bare Metal Provisioning
* Use of DPUs
* Security
* Monitoring and Health Checks
* W&B Integration
* Storage
* TCO
* Services
* Customers
#### Slurm-on-Kubernetes
At this point CoreWeave has consolidated to three offerings:
1. CoreWeave Managed SUNK (Slurm on Kubernetes)
2. CoreWeave Managed Kubernetes
3. CoreWeave Bare Metal without any managed scheduler
Since the original release of SUNK in October of 2023, uptake has been strong for new clusters, which has led to the deprecation of their slurm-on-bare metal service. All new customers that prefer slurm are pushed to SUNK. Earlier in this article, we discussed the slurm-on-Kubernetes trend at large, where SUNK is the most mature option vs Soperator and Slinky. CoreWeave developed SUNK from scratch, controls the roadmap, and has built deep integration with their underlying Kubernetes runtime CKS, as well as their monitoring dashboards (mainly Grafana, branded as CoreWeave Observe™), health check system, and provisioning system (branded as CoreWeave Mission Control).
In addition, CoreWeave has gone beyond open-source slurm to develop their own custom fork of the popular job scheduler. Specifically, in open source Slinky, there are memory leaks in the REST API of the slurm controller, which lead to issues if, for example, some user is trying to queue 100,000 jobs.
Queues of that size arise because slurm works at scale through the concept of priorities. In other words, there is no way for a big pretraining job with top priority to be auto-scaled down, and there should always be spare nodes available to be added to this job in the event of an interruption. But while that job is running, users can tag smaller research/experiment slurm jobs as preemptible, and run them on the medium or low-priority partition. Functionally, this means that the slurm scheduler hands off information to kubernetes, which labels the kubernetes job to associate it to the partition. In effect, frontier labs have a giant backlog of low priority sweeps that can always take up extra nodes: trying out the latest learning rate, data mix etc. to see if it improves their research.
As a result, CoreWeave re-wrote the logic in slurm’s REST API in Go, and is now using an RPC-based login pod controller for SUNK that is more performant at scale.
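The priority scheme described above can be pictured with a toy Python sketch. This is not Slurm or SUNK code: the `Job` type, the priority numbers, and the `place` function are hypothetical names chosen for illustration. It shows a high-priority job reclaiming nodes from jobs tagged as preemptible, lowest priority first, while non-preemptible jobs are never evicted.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Job:
    name: str
    priority: int              # higher number wins; values are illustrative
    nodes_needed: int
    preemptible: bool = False
    nodes: list = field(default_factory=list)


def place(job, free_nodes, running):
    """Give `job` its nodes, evicting lower-priority preemptible jobs if needed.

    Returns the list of evicted jobs (a real scheduler would requeue them).
    Raises RuntimeError, without changing any state, if the job cannot fit.
    """
    # Eviction candidates: preemptible and strictly lower priority, lowest first.
    victims = sorted(
        (j for j in running if j.preemptible and j.priority < job.priority),
        key=lambda j: j.priority,
    )
    reclaimable = sum(len(v.nodes) for v in victims)
    if len(free_nodes) + reclaimable < job.nodes_needed:
        raise RuntimeError(f"not enough nodes for {job.name}")

    evicted = []
    while len(free_nodes) < job.nodes_needed:
        victim = victims.pop(0)
        running.remove(victim)
        free_nodes.extend(victim.nodes)
        victim.nodes = []
        evicted.append(victim)

    job.nodes = [free_nodes.pop() for _ in range(job.nodes_needed)]
    running.append(job)
    return evicted
```

In this sketch a two-node learning-rate sweep marked `preemptible=True` would be evicted as soon as a higher-priority pretraining job asks for those two nodes, whereas a later preemptible job could not displace the pretraining job.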
Interestingly, we are aware of direct licensing deals that CoreWeave has done with end users who want to run SUNK on managed Kubernetes clusters outside of CoreWeave. While we don’t believe SUNK licensing is a meaningful revenue driver for CoreWeave, it is an indication of the quality of the customer experience when using SUNK and a testament to their engineering effort.
#### Bare Metal Provisioning
A significant difference between CoreWeave and other cloud providers is their use of bare metal machines for both control plane and worker node services. Since basically all CoreWeave customers use whole 8-way machines in standard HGX (or 4-way machines in NVL72 racks), there is no need to virtualize a machine into multiple 1, 2, or 4-way GPU instances.
However, other providers like Nebius and Crusoe who also don’t split up GPU machines continue to use kubevirt and cloud-hypervisor respectively in order to realize other benefits of VMs: shared block storage (resizing, quick provisioning, PXE boots, backup, clone, restore, etc.) and network isolation.
Since VMs are easier/quicker to spin up and down than bare metal machines (i.e. the underlying OS doesn’t need to change between tenants) CoreWeave has a challenge to address: how to quickly replace and repair a broken machine in a large cluster. To address this, CoreWeave has developed the Fleet Lifecycle Controller and Node Lifecycle Controller which are used by their FleetOps and CloudOps teams to provision machines through their service CoreWeave Mission Control.
This custom stack is actually using a CRD called a NodeSet on kubernetes that defines bare metal nodes as a kubernetes resource before they are even online. Furthermore, CoreWeave has developed custom operators, similar to a mix of a DaemonSet and ReplicaSet, for functions like idle checking and debugging. This customization extends to Protected Rolling Updates, which are aware of the slurm state and wait for nodes to drain before rolling out updated pods. Effectively, nodes that have been repaired after a hardware failure sit in a queue, waiting to be added to a logical cluster on the multi-tenant backend network fabric.
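The drain-aware behaviour of Protected Rolling Updates can be sketched in a few lines of Python. This is not CoreWeave's controller: the node records, the `max_unavailable` budget, and the function name are assumptions for illustration, and the state strings simply borrow common Slurm node state names. The idea is that only nodes that have finished draining are eligible, and only a bounded number are updated at once.

```python
def next_update_batch(nodes, max_unavailable=1):
    """Pick nodes that are safe to update right now.

    `nodes` maps node name -> dict with keys:
      "slurm_state": e.g. "ALLOCATED", "DRAINING", "DRAINED", "IDLE"
      "updated":     True once the new pod version is running
      "updating":    True while an update is in flight
    Only DRAINED nodes are eligible, and at most `max_unavailable`
    nodes may be updating at the same time.
    """
    in_flight = sum(1 for n in nodes.values() if n["updating"])
    budget = max(0, max_unavailable - in_flight)
    ready = sorted(
        name
        for name, n in nodes.items()
        if n["slurm_state"] == "DRAINED" and not n["updated"] and not n["updating"]
    )
    return ready[:budget]
```

A node in `DRAINING` still has work running, so it is skipped until it reaches `DRAINED`; the update therefore never interrupts a job.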
#### Use of DPUs
Due to the use of bare metal provisioning, CoreWeave must contend with the same challenges as AWS with Nitro or Azure with Boost: how to implement secure multi-tenant isolation for both the frontend and backend networks. It is important to note that these tenancy challenges exist even in the scenario where a customer has rented an entire datacenter in a wholesale bare metal manner. Tenants still hire and fire interns, consultants, partner companies, academic collaborators, and have general interest in implementing isolation between groups of users on the same underlying hardware.
As a result, CoreWeave uses Nvidia BlueField DPUs on every node to offload functions traditionally handled by a hypervisor on the host CPU (VPCs, encryption, network isolation, NAT gateways, etc.). Using a distributed NAT gateway and distributed storage gateway architecture eliminates a common central performance choke point. AI workloads are “bursty” since individual research jobs or autoscaling inference endpoints can randomly start pulling massive model weight files simultaneously. Switching from centralized to distributed gateway services on DPUs guarantees line-rate performance to WAN or storage.
On Kubernetes, this actually gets implemented via CRDs and custom controllers, which allows bare metal nodes to move between tenants while preserving network isolation and policy. The DPU becomes the enforced boundary, not the hypervisor. Effectively, the API for programming this layer is exposed to CoreWeave’s team via `kubectl get dpuconfigurations.nimbus.infra.coreweave.com`.
On the backend network, InfiniBand network isolation is implemented by changing PKeys (Partition Keys) per tenant, a hardware-enforced mechanism similar to an Ethernet VLAN. CoreWeave is one of the only providers to offer SHARP, which can significantly improve performance for certain collectives. However, for the highest security customers, CoreWeave does not allow multi-tenant InfiniBand with SHARP enabled.
[](https://substackcdn.com/image/fetch/$s_!Qhje!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64268219-bec7-4ebe-91be-5e7fbc0a293c_935x534.png)Source: CoreWeave
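To make the PKey mechanism concrete, here is a minimal Python sketch of the InfiniBand partition match rule. A P_Key is a 16-bit value whose top bit marks full versus limited membership and whose low 15 bits identify the partition; two ports can exchange traffic only if they share a partition and are not both limited members. The tenant names and key values below are made up, and how CoreWeave actually assigns keys is not described here.

```python
FULL_MEMBER_BIT = 0x8000   # top bit of the 16-bit P_Key: 1 = full, 0 = limited
KEY_MASK = 0x7FFF          # low 15 bits identify the partition


def can_communicate(pkey_a: int, pkey_b: int) -> bool:
    """P_Key match rule: same partition, and not both limited members."""
    same_partition = (pkey_a & KEY_MASK) == (pkey_b & KEY_MASK)
    both_limited = not (pkey_a & FULL_MEMBER_BIT) and not (pkey_b & FULL_MEMBER_BIT)
    return same_partition and not both_limited


# Hypothetical per-tenant partition assignment (values are illustrative only).
TENANT_PKEYS = {"tenant-a": 0x8001, "tenant-b": 0x8002}
```

With one partition per tenant, `can_communicate(TENANT_PKEYS["tenant-a"], TENANT_PKEYS["tenant-b"])` is false, which is the VLAN-like separation described above.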
#### Security
When it comes to security, CoreWeave is the only Neocloud with a hyperscaler mentality. This means zero-trust policies, strict isolation between tenants, and continuous audits.
For example, while some other Neoclouds provide direct access to server BMCs, CoreWeave treats the BMC, [which is a huge attack surface](https://developer.nvidia.com/blog/analyzing-baseboard-management-controllers-to-secure-data-center-infrastructure/), with extreme caution. Host-to-BMC access is disabled: both the KCS (Keyboard Controller Style) and NIC (RNDIS Ethernet over USB) interfaces from the host OS to the BMC are turned off. Effectively, the BMC management network is isolated to only communicate with the control plane (N/S side), with Layer 2 isolation enforced to prevent lateral BMC-to-BMC movement (E/W side).
For very large customers, BMC access is via a dynamic jumpbox setup that utilizes the dedicated BMC network. An ACL on the DPU is updated by the Fleet Lifecycle Controller via NIMBUS and ensures they can only access BMC IPs in their specific tenant.
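The effect of such a per-tenant ACL can be illustrated with a short Python sketch. The tenant names, the subnet layout (taken from a documentation address range), and the function name are all hypothetical; the real rule is programmed on the DPU, not in Python. The point is the default-deny check: a request is allowed only when the target BMC address falls inside the requesting tenant's own range.

```python
import ipaddress

# Hypothetical tenant -> BMC subnet allocation, using a documentation range.
TENANT_BMC_SUBNETS = {
    "tenant-a": ipaddress.ip_network("192.0.2.0/25"),
    "tenant-b": ipaddress.ip_network("192.0.2.128/25"),
}


def bmc_access_allowed(tenant: str, bmc_ip: str) -> bool:
    """Allow only BMC addresses inside the tenant's own subnet (default deny)."""
    subnet = TENANT_BMC_SUBNETS.get(tenant)
    if subnet is None:
        return False
    return ipaddress.ip_address(bmc_ip) in subnet
```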
Another critical decision is avoiding scenarios where multiple customers run on the same machine. Effectively, customers can attack their own machine as much as they want, as seen in the container escape scenario, but if a node goes down for repair or is moved between tenants, it gets PXE booted to a clean state before another tenant can run on it. This is handled by the Fleet Lifecycle Controller (FLCC) and Node Lifecycle Controller (NLCC).
In terms of container escapes, future exploits are expected in upstream nightly branches from projects like pytorch, transformers, vllm, sglang, etc. and CoreWeave is in the process of switching to using Chainguard images as the basis for all customer images. We’ve seen a common trend in this space following Broadcom’s acquisition of VMware, and the subsequent price increase to Bitnami. Notably, CoreWeave’s container images have become a standard for many other neoclouds. From login pods to nccl-test examples, many other providers are building FROM the coreweave image, or just providing scripts that pull the image from CoreWeave’s gcr registry directly:
More on security:
* Defense-in-Depth and Risk Modeling: Security is built on a comprehensive threat model that drives a defense-in-depth strategy, informing a secure-by-default application development lifecycle and deep runtime and infrastructure hardening.
* Application and Code Security: The Application Security Maturity Framework mandates that all code changes undergo secret scanning, SAST, SCA, and DAST with strict remediation SLAs. Risk is blocked pre-production via pre-commit hooks, policy-as-code, CI/CD enforcement, Chainguard base images, and golden service templates.
* Production Infrastructure and Access: Teleport governs privileged access with customer approvals, RBAC, TPM-backed node joins, and session logging (including keystrokes/syscalls), actively eliminating traditional SSH.
* Fleet Integrity and Attestation: SPDM-based firmware attestation, Secure Boot, and Measured Boot validate fleet integrity from power-on, ensuring only cryptographically verified firmware runs and enabling remote attestation prior to workload scheduling.
* Data and Workload Encryption/Identity: SecVault PKI infrastructure supports encryption for data (object storage, databases, APIs), while mTLS adoption will bind endpoint identity to trusted firmware. Cross-cluster JWT authentication and SPIFFE integration secure workload-to-workload communication.
* Continuous Monitoring and Posture: Eclypsium and Wiz provide continuous security posture management, including firmware vulnerability scanning and cloud workload posture, while telemetry pipelines ensure policy compliance and deviation detection.
* Enterprise Identity and Data: Security mandates phishing-resistant MFA and device trust for all users, with Kolide enforcing device posture checks. Policy-as-code Okta rules govern access, and systems like Cyera, Proofpoint, and Netskope manage data governance and DLP controls.
The feedback here is that CoreWeave’s security restrictions are far too limiting for a lot of power users. For example, by default systemd is not available, and a lot of CPU and GPU profiling tooling is also unavailable because of these restrictions. This has led to some strange design choices from end users, such as running background processes inside tmux shells instead of using systemd.
#### Monitoring and Health Checks
CoreWeave’s monitoring and health check systems are a key differentiator, and the primary line item being quantified in TCO discussions with large customers who question the CoreWeave pricing premium. In other words, users running at 10k+ GPU scale understand the impact that interruptions can have on their training jobs.
CoreWeave recognizes that standard Nvidia (DCGM) exporters do not expose all critical metrics, such as certain thermal sensors vital for diagnosing subtle hardware issues like failing thermal paste. CoreWeave developed proprietary exporters using the lower-level NVML library. This provides the necessary granularity for robust node-level health validation.
In addition, they have also built exporters for the interconnect fabric. To identify transient physical-layer problems like signal integrity degradation between compute trays and switches, CoreWeave developed a sophisticated correlation engine. We have noticed that other vendors do not run sustained multi-node jobs during burn-in and during active scheduled health checks, instead running single node jobs, collectives, and GPU stress tests ([GPU burn](https://github.com/wilicc/gpu-burn), [GPU fryer](https://github.com/huggingface/gpu-fryer), or [Multi-node ubergemm](https://docs.nvidia.com/datacenter/dcgm/latest/user-guide/dcgm-multinode-diagnostics.html#supported-tests-for-multi-node-diagnostics)) individually. The point is that failures are often caused by the **simultaneous** thermal expansion and contraction of the GPUs and the interconnect. This is especially important for NVL72 rack-scale architectures.
By tracking events like simultaneous XID or SXID errors across the fabric, CoreWeave can automatically root-cause many failure types. For instance, if multiple compute trays connected to a single switch port report errors, the switch is flagged, while if an error follows a specific tray after it has been moved, the tray or its cabling is flagged. Simple intuition like this builds with experience, which CoreWeave has now been building for months with the GB200 and GB300 NVL72 rack-scale systems that have proved so challenging for others to operate.
[](https://substackcdn.com/image/fetch/$s_!Ok0z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ddb3f17-6379-4995-8c9e-d3bea778e4f9_936x636.png)Source: CoreWeave simulating some errors on our cluster
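The two root-cause heuristics described above can be written down as a small Python sketch. This is an illustration only, not CoreWeave's correlation engine: the event format of `(tray_id, switch_port)` pairs, the `min_trays` threshold, and the function name are assumptions.

```python
from collections import defaultdict


def flag_suspects(error_events, min_trays=2):
    """Classify faults from (tray_id, switch_port) error events.

    - A tray is flagged when it reports errors on more than one port,
      i.e. the error followed the tray after it was moved.
    - A switch port is flagged when at least `min_trays` distinct trays,
      not themselves flagged, report errors through it.
    """
    trays_by_port = defaultdict(set)
    ports_by_tray = defaultdict(set)
    for tray, port in error_events:
        trays_by_port[port].add(tray)
        ports_by_tray[tray].add(port)

    flagged_trays = {t for t, ports in ports_by_tray.items() if len(ports) > 1}
    flagged_ports = {
        p
        for p, trays in trays_by_port.items()
        if len(trays - flagged_trays) >= min_trays
    }
    return {"switch_ports": flagged_ports, "trays": flagged_trays}
```

Two different trays erroring behind the same port point at the switch, while one tray erroring behind two different ports points at the tray or its cabling.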
Meta’s Llama 3 paper is the clearest description of where reliability issues can manifest in a training campaign. Over a 54 day period when training Llama 3 405B, using 8 pods of 3,072 H100s, with 16k of 24k GPUs in use at a given time, there were 466 job interruptions (47 of which were planned upgrades), leaving 419 unexpected interruptions, and three instances of heavy manual intervention.
This is an implied MTBF of 2,111 H100-days, and we can assume that “heavy manual intervention” means restarting the job from the last checkpoint.
This example highlights that even small improvements in hardware stability and interconnect performance can shave days or even weeks off of a multi-month training schedule for a large model. Hardware failures also stand out as the #1 source of frustration amongst researchers that we talk to.
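The implied MTBF quoted above can be reproduced from the Llama 3 numbers with a few lines of Python. The only assumption is that "16k" means 16,384 GPUs; the last line additionally derives how often the job as a whole was interrupted, which is not a figure stated in the text.

```python
gpus_in_use = 16_384          # "16k" GPUs active at a time (assumed to be 16,384)
days = 54
total_interruptions = 466
planned = 47

unplanned = total_interruptions - planned            # 419
gpu_days = gpus_in_use * days                        # 884,736 H100-days
mtbf_gpu_days = gpu_days / unplanned                 # about 2,111.5 H100-days

# Average wall-clock time between unplanned interruptions of the whole job.
job_hours_between_interruptions = days * 24 / unplanned   # about 3.1 hours
```

Put differently, a single H100 looks very reliable in isolation, but at this scale the job still gets interrupted roughly every three hours.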
[](https://substackcdn.com/image/fetch/$s_!WKIC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d9eb732-36a5-4a60-8cbe-9b929f2a182b_936x549.png)Source: The Llama 3 Herd of Models
This section of this paper has also become infamous for two other reasons: a mention of the “diurnal 1-2% throughput variation based on time-of-day” (i.e. GPUs get hotter in the middle of the day and perform worse) and a comment on how “tens of thousands of GPUs may increase or decrease power consumption at the same time (…) can result in instant fluctuations of power consumption across the data center on the order of tens of megawatts, stretching the limits of the power grid” which resulted in the accidental upstream of the PYTORCH_NO_POWERPLANT_BLOWUP=1 environment variable at some point by a Meta engineer.
[](https://substackcdn.com/image/fetch/$s_!ETih!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03b0c4ca-8a5f-41c4-9d6d-76dab48edf85_930x308.png)Source: The Llama 3 Herd of Models
In summary, we continue to treat CoreWeave’s monitoring dashboard, passive health check approach, and suite of active health checks as the standard when working with other Neoclouds on their reliability challenges. To read more about these details, see:
#### W&B Integration
Prior to the acquisition, Weights & Biases (W&B) had become the industry leader in user-facing metrics for job scheduling, but pricing was quite high and the company seemed to be innovating at a loss. Clearly, CoreWeave noticed a big overlap between their customers and W&B’s enterprise customers. Specifically, some of these customers have been called out for having absolutely massive logging infrastructure behind their W&B deployments, effectively abusing the system.
The obvious first step in W&B integration was to integrate infrastructure level metrics, like OOMEs or AERs, into the W&B dashboard so that users can see them. Generally, individual researchers don’t get access to the cluster-level grafana dashboard, but there is still lots of useful information that they can use, specifically from the underlying Nvidia DCGM.
#### Storage
CoreWeave’s storage offerings have matured over time to include a native Object Storage offering, “CAIOS” (CoreWeave AI Object Storage) and a local cache, “LOTA” (Local Object Transfer Accelerator).
LOTA is a transparent, distributed cache that lives directly on the local NVMe of every GPU node. Public benchmarks show it reaching a sustained throughput of over 7GB/s per GPU on Blackwell.
Effectively, with LOTA, the user doesn’t need to cp or rsync anything. They simply point their S3-compatible application to the cwlota.com endpoint instead of the primary CAIOS endpoint, cwobject.com. LOTA then manages the caching of data onto the local NVMe in a distributed manner across the cluster.
According to list prices, for example, a 1,024 GPU cluster at $3/hr for 1 year will cost $26.9M.
However, for storage, 1PB of active data at $0.11/GB per month costs $1.3M per year, or 4.6% of the total BOM.
In general, we rarely see storage costs exceed 5% of the total cluster cost.
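The cost comparison above can be checked with a short Python calculation. It takes the cluster as 1,024 GPUs, which is the size that reproduces the $26.9M figure, treats 1PB as 1,000,000 GB, and assumes the storage share is measured against compute plus storage; these are our reading of the numbers rather than stated definitions.

```python
HOURS_PER_YEAR = 24 * 365                  # 8,760

gpus = 1_024
gpu_price_per_hour = 3.00                  # USD per GPU-hr
compute_cost = gpus * gpu_price_per_hour * HOURS_PER_YEAR      # 26,910,720

storage_gb = 1_000_000                     # 1 PB, taken as decimal (10^6 GB)
storage_price_per_gb_month = 0.11          # USD
storage_cost = storage_gb * storage_price_per_gb_month * 12    # about 1,320,000

# Storage as a share of the combined compute + storage bill.
storage_share = storage_cost / (compute_cost + storage_cost)   # about 0.047
```

Unrounded, the share comes out near 4.7%; using the rounded $1.3M and $26.9M gives the 4.6% quoted above. Either way it sits under the 5% rule of thumb.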
Over time, we expect hyperscalers like AWS, Azure, GCP, and Oracle to follow the current trend of reducing their price per GB per month on their Object Storage offerings (driven by competition from CloudFlare R2 in many cases), reduce or remove punitive egress fees, and reduce or remove costs associated with storage operations like data access.
#### Support
In our experience, and when talking directly to users of CoreWeave services, we get the impression that team members are empowered, excited to help customers, and proud to work at the company. Notably, all datacenter technicians are CoreWeave employees, go through standard company training, and have equity in the company. CoreWeave is one of the only Neoclouds that has not augmented their capacity with someone else’s GPUs, maintaining vertical integration for all their facilities.
In addition, CoreWeave’s “direct to expert” support model means that all customers get quick responses, at no additional cost. But this isn’t always the case! Recently, CoreWeave actually sent us a ClearFeed notification in Slack because their annual company offsite might result in some delayed support responses.
[](https://substackcdn.com/image/fetch/$s_!xqlJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd98fcd18-383d-40fe-b664-de6d617bab40_578x280.png)Source: CoreWeave
Last time, we recommended that CoreWeave work on a UI console flow for deploying their managed slurm solution, ideally with less than four button clicks. CoreWeave has basically achieved this, though it does seem like the 30+ datacenters on screen are mostly greyed-out.
[](https://substackcdn.com/image/fetch/$s_!Kh7_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84800a5c-6b9a-4832-865f-9bcfb71e8f05_936x508.png)Source: CoreWeave
#### Total Cost of Ownership (TCO)
In summary, the result of CoreWeave’s execution for Slurm-on-Kubernetes, Bare Metal Provisioning, Use of DPUs, Security, Monitoring and Health Checks, W&B Integration, Storage, and Support provides CoreWeave an ability to argue for lower TCO vs their competition, and command a pricing premium on a per GPU-hr basis.
Our feedback to CoreWeave as they scale is to continue to work on the on-demand, self-serve cluster experience, continue to develop autoscaling features for supporting inference at scale, and to maintain their lead in reliability for the NVL72 rack-scale deployments. From the customer perspective, a downside of CoreWeave continues to be that they do not offer on-demand instances or autoscaling, and rarely accept short-term rentals. This is different from Nebius and Crusoe, and limits the potential upside associated with high margin “spot instance” markets.
---
Other Platinum tier providers:
Full ClusterMAX 2.0 + 2.1 index: https://www.clustermax.ai/cloudreview
Full LLM dump of all reviews: https://www.clustermax.ai/llms-full.txt