Orchestration

Container and workload management features including Slurm scheduling and Kubernetes support.

Key Requirements

  • Easy process for adding new cluster users
  • RBAC and SSO implementation
  • No SSH key copying required
  • Storage RBAC enforcement
  • Cluster sharing and metering with chargeback/showback
  • CUDA_VISIBLE_DEVICES properly configured
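The CUDA_VISIBLE_DEVICES requirement above can be spot-checked from inside a job. The following is a minimal sketch, not a definitive test: the function names and the idea of comparing against an expected GPU count are assumptions made for illustration. It relies only on the documented behavior that the variable holds a comma-separated list of device indices or UUIDs, and that leaving it unset exposes every GPU on the node.

```python
import os


def visible_gpu_ids(env=None):
    """Return the entries listed in CUDA_VISIBLE_DEVICES, or None if it is unset."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        # Unset: CUDA applications see every GPU on the node.
        return None
    # Entries may be integer indices or GPU UUIDs; keep them as strings.
    return [entry.strip() for entry in raw.split(",") if entry.strip()]


def check_allocation(expected_count, env=None):
    """Hypothetical check: the job sees exactly the number of GPUs it requested."""
    ids = visible_gpu_ids(env)
    if ids is None:
        return False  # scheduler did not restrict the job to its allocation
    no_duplicates = len(set(ids)) == len(ids)
    return no_duplicates and len(ids) == expected_count
```

Running `check_allocation(8)` at the start of a job that requested eight GPUs is one way to catch a scheduler that leaves the variable unset or hands out overlapping devices.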

Slurm

Requirements specific to managed and self-service Slurm clusters.

  • Automated setup process for Slurm
  • Automated managed Slurm service
  • Self-service capabilities (e.g., Lambda 1CC, Nebius managed Soperator)
  • Head node provisioning
  • Out-of-the-box Slurm topology configuration
  • Slurm modules availability
  • Pyxis container plugin support
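To make the topology and Pyxis items above concrete, here is a small sketch of what a provider's automation might generate. The switch names, node ranges, image name, and helper function names are all hypothetical. The `SwitchName=... Nodes=...` and `SwitchName=... Switches=...` line format is that of Slurm's `topology.conf` for the tree topology plugin, and `--container-image` and `--container-mounts` are the `srun` options that Pyxis adds.

```python
def render_topology(leaves, spine="spine"):
    """Render topology.conf lines for a two-level tree.

    `leaves` maps a leaf switch name to a list of Slurm node expressions.
    """
    lines = [
        f"SwitchName={name} Nodes={','.join(nodes)}"
        for name, nodes in leaves.items()
    ]
    # The spine switch connects all leaf switches.
    lines.append(f"SwitchName={spine} Switches={','.join(leaves)}")
    return "\n".join(lines) + "\n"


def pyxis_srun(image, command, mounts=(), gpus=None):
    """Build an srun argument list that runs `command` in a Pyxis container."""
    argv = ["srun", f"--container-image={image}"]
    if mounts:
        # Each mount is written as SRC:DST.
        argv.append("--container-mounts=" + ",".join(mounts))
    if gpus is not None:
        argv.append(f"--gpus={gpus}")
    argv.extend(command)
    return argv
```

For example, `pyxis_srun("ubuntu:22.04", ["nvidia-smi"], mounts=["/data:/data"], gpus=8)` yields an argument list that can be passed to `subprocess.run` on a login node where Pyxis is installed.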

Kubernetes

Requirements specific to managed and self-service Kubernetes clusters.

  • Automated setup process for Kubernetes
  • Automated managed Kubernetes service
  • kubectl access or KUBECONFIG provided
  • Easy access to Kubernetes Dashboard, Lens, etc.
  • Storage accessible via PVC + hostPath + S3
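As an illustration of the storage item above, the sketch below builds a Pod manifest that mounts both a PersistentVolumeClaim and a hostPath directory. The function name, volume names, mount paths, and default values are assumptions chosen for the example; the manifest fields themselves are standard Kubernetes Pod fields. S3 is not shown as a volume, since it is commonly reached through a client library or a CSI driver rather than a built-in volume type.

```python
import json


def pod_manifest(name, image, pvc_claim, host_path):
    """Return a Pod manifest (as a dict) with one PVC and one hostPath volume."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": "main",
                    "image": image,
                    "volumeMounts": [
                        {"name": "shared", "mountPath": "/mnt/shared"},
                        {"name": "scratch", "mountPath": "/mnt/scratch"},
                    ],
                }
            ],
            "volumes": [
                # Shared storage requested through an existing claim.
                {"name": "shared", "persistentVolumeClaim": {"claimName": pvc_claim}},
                # Node-local directory; it must already exist on the host.
                {"name": "scratch", "hostPath": {"path": host_path, "type": "Directory"}},
            ],
        },
    }


def to_json(manifest):
    """Serialize the manifest; kubectl accepts JSON as well as YAML."""
    return json.dumps(manifest, indent=2)
```

Writing the output of `to_json(...)` to a file and running `kubectl apply -f` on it is a quick way to confirm that both storage paths are usable with the KUBECONFIG the provider hands out.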
