Cluster Technical Requirements

Any individual spec that a particular cluster does not meet must be reviewed manually.

Node

  • Bare metal, no VM solutions

  • System: 8x Nvidia H100 SXM5 from an approved OEM

    • Allowlist

      • Dell PowerEdge XE9680

      • Supermicro AS-8125GS-TNHR

      • Supermicro SYS-821GER-TNHR

      • Other OEMs upon review

    • Denylist

      • HPE/Gigabyte

  • GPU

    • 8x Nvidia H100 GPU 80GB SXM5 (no PCIe)

    • H200, B200, etc.

  • CPU

    • 2x CPU per node

    • AMD EPYC 9004 series

    • 5th gen Intel Xeon (Emerald Rapids)

    • 4th gen Intel Xeon (Sapphire Rapids)

    • Denylist

      • 13th and 14th generation Intel Core series

  • RAM - 1.5TiB to 2.0TiB ECC

  • Cooling - air or liquid, no immersion or anything out-of-spec for HGX systems

  • Network

    • In-band - Nvidia ConnectX-6, ConnectX-7, or BlueField-3 DPU

    • GPU - ConnectX-7 400Gbps, BlueField-3 SmartNIC

    • Management - 1 Gbps integrated NIC, ideally with BMC pass-through and visible to OS, otherwise direct BMC access

  • Storage

    • Dual boot drives in RAID-1 configuration

    • ≥2TiB NVMe SSD (no SATA)
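
Many of the node-level items above can be spot-checked automatically during intake. The following is a minimal sketch, assuming standard Linux tooling (nvidia-smi, lscpu, /proc/meminfo) is available on the node; it covers only GPU model/count, CPU socket count, and the RAM window, and is not a substitute for manual review.

```python
#!/usr/bin/env python3
"""Minimal node intake spot-check -- a sketch, not a full audit.

Assumes standard Linux tooling (nvidia-smi, lscpu) is on PATH and that the
thresholds mirror the Node requirements above (8x H100 SXM5, 2 CPU sockets,
1.5-2.0 TiB ECC RAM). Boot-drive RAID and NVMe checks are left out.
"""
import subprocess

def run(cmd):
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

def check_gpus():
    # One line per GPU, e.g. "NVIDIA H100 80GB HBM3"
    names = run(["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"]).splitlines()
    assert len(names) == 8, f"expected 8 GPUs, found {len(names)}"
    assert all("H100" in n and "PCIe" not in n for n in names), f"unexpected GPU models: {names}"

def check_cpu_sockets():
    sockets = next(l for l in run(["lscpu"]).splitlines() if l.startswith("Socket(s):"))
    assert int(sockets.split(":")[1]) == 2, f"expected 2 CPU sockets, got: {sockets}"

def check_ram():
    # /proc/meminfo reports MemTotal in kiB; allow a little headroom below
    # 1.5 TiB because the kernel reports less than the installed capacity.
    with open("/proc/meminfo") as f:
        mem_kib = int(next(l for l in f if l.startswith("MemTotal")).split()[1])
    tib = mem_kib / 2**30
    assert 1.4 <= tib <= 2.1, f"RAM outside the 1.5-2.0 TiB window: {tib:.2f} TiB"

if __name__ == "__main__":
    for check in (check_gpus, check_cpu_sockets, check_ram):
        check()
        print(f"{check.__name__}: OK")
```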

Cluster

  • Network

    • The cluster must have at least three networks

      • Primary in-band network used for customer access and optionally for orchestration, monitoring, and storage. At least 100Gbps, ideally 400Gbps, in a fat-tree topology

      • Management network isolated from the primary network, used to access IPMI/BMC and optionally for orchestration and monitoring if BMC pass-through is possible. 1Gbps is sufficient

      • High-bandwidth RDMA GPU interconnect fabric, typically fully non-blocking NDR Infiniband, but 400Gbps RoCE is acceptable (fat-tree or rail-optimized topology)

      • [optional] Storage network used for high-bandwidth access to network-attached storage (Weka, Vast, DDN, Pure Storage, etc.), or enough room for us to add our own

    • Approved list of in-band network switch vendors: Arista, Dell, Juniper, Nvidia

    • At least 2x100Gbps interfaces for ≥1,024 GPUs or 2x200Gbps interfaces for ≥2,048 GPUs

    • In-band network must be in a non-blocking fat-tree or Clos topology

    • Public IP address per node

    • Infiniband design must be in accordance with Nvidia SuperPOD architecture, leaf-and-spine with a max of 2,048 GPUs (no super-spine)

  • Compute

    • Minimum 2x head nodes for DHCP, PXE boot, logging, monitoring, management

  • Storage

    • At least 4TiB of network attached NVMe storage per GPU node

      • 512TiB for a 1,024 GPU system

      • 1,024TiB for a 2,048 GPU system

    • High-availability architecture or quick failover required, with multi-tenancy support and high IOPS

    • Approved vendors: DDN, Pure Storage, Vast, Weka
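
The network-attached storage figures above follow directly from the 4TiB-per-GPU-node floor at 8 GPUs per node. A small sketch of that arithmetic:

```python
GPUS_PER_NODE = 8
MIN_NAS_TIB_PER_NODE = 4  # from the Storage requirement above

def min_nas_tib(total_gpus: int) -> int:
    """Minimum network-attached NVMe capacity for a cluster of `total_gpus`."""
    nodes = total_gpus // GPUS_PER_NODE
    return nodes * MIN_NAS_TIB_PER_NODE

for gpus in (1024, 2048):
    print(f"{gpus} GPUs -> {min_nas_tib(gpus)} TiB minimum")  # 512 TiB, 1024 TiB
```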

Access/Permissions

  • Ability to PXE boot system images

  • Access to rPDUs

  • Access to network switch management interfaces

  • BMC access on GPU and head nodes

  • Bare-metal access (no hypervisor or cPanel-type management interface)

  • If using Infiniband, must provide UFM access

  • Access to firewall if applicable
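
A quick reachability sweep can confirm that the access listed above actually exists before onboarding starts. The sketch below attempts TCP connections to BMC web UIs, rPDUs, and switch management interfaces; the names, addresses, and ports are hypothetical placeholders to be replaced with the operator-provided inventory.

```python
#!/usr/bin/env python3
"""Sketch: confirm the out-of-band access listed above is reachable.

The names, addresses, and ports below are hypothetical placeholders --
substitute the BMC, rPDU, and switch management inventory provided by
the operator.
"""
import socket

TARGETS = [
    ("gpu-node-01 BMC", "10.0.10.11", 443),  # BMC web UI
    ("rPDU A",          "10.0.20.5",  443),
    ("leaf-switch-01",  "10.0.30.2",  22),   # switch management SSH
]

def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for name, host, port in TARGETS:
    status = "reachable" if tcp_reachable(host, port) else "UNREACHABLE"
    print(f"{name} ({host}:{port}): {status}")
```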

Connectivity

  • The cluster must have at least 3 ISP connections

    • 100Gbps primary customer access Internet connection

    • 100Gbps redundant customer access Internet connection

    • Single blended IP solution is okay for customer access

    • 1Gbps management access Internet connection

  • Connections should have sufficient route diversity to ensure fault tolerance

  • HA firewall pair with support for multiple 100Gbps connections
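
Uplink redundancy can be sanity-checked by pushing a request out each ISP-facing interface independently. The sketch below assumes each uplink has a locally assigned source address (the IPs shown are documentation placeholders) and uses example.com only as an arbitrary external target.

```python
#!/usr/bin/env python3
"""Sketch: confirm each customer-access uplink passes traffic on its own.

The source addresses are documentation placeholders; replace them with the
host addresses assigned on each ISP-facing interface.
"""
import http.client

UPLINKS = {
    "primary 100Gbps":   "203.0.113.10",
    "redundant 100Gbps": "198.51.100.10",
}

def can_reach(source_ip: str, host: str = "example.com") -> bool:
    # Bind the outgoing connection to the given uplink's source address.
    conn = http.client.HTTPSConnection(host, timeout=5, source_address=(source_ip, 0))
    try:
        conn.request("HEAD", "/")
        return conn.getresponse().status < 500
    except OSError:
        return False
    finally:
        conn.close()

for name, src in UPLINKS.items():
    print(f"{name} via {src}: {'OK' if can_reach(src) else 'FAILED'}")
```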

Datacenter/Environment

  • Physical or virtual inspection of the DC

  • Hosted in a minimum Tier 3 data center with at least N+1 generator redundancy and 24 hours of diesel fuel with a continuous replenishment contract

  • Ensure physical security measures are in place to prevent unauthorized entry into the DC hall and unauthorized access to hardware

  • Verify adequate power capacity at max power load

  • Verify adequate cooling capacity at max thermal output of cluster

    • Minimum 6 ft (a two-tile gap of 3 ft x 3 ft floor tiles) between racks across the cold/hot aisles

  • Each GPU node serviced by a redundant pair of rPDUs

  • Verify cable runs are neatly bundled to minimize restriction of airflow

  • Verify proper hot/cold air separation measures are taken including blanking panels and airflow baffles

    • Ensure reverse-airflow Infiniband switches are not co-located in the same rack as forward-airflow Ethernet equipment

  • Verify space is clean and free of dust and debris
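
Power and cooling verification at max load reduces to simple arithmetic once a per-node draw is assumed. The sketch below assumes roughly 10.2 kW peak per 8x H100 HGX node (confirm against the OEM spec sheet) and a 15% allowance for switches, head nodes, and storage; both figures are assumptions for illustration, not requirements from this document.

```python
ASSUMED_NODE_PEAK_KW = 10.2  # assumed peak draw of an 8x H100 HGX node; confirm with the OEM spec sheet
OVERHEAD_FACTOR = 1.15       # assumed allowance for switches, head nodes, and storage
GPUS_PER_NODE = 8

def required_power_kw(gpu_count: int) -> float:
    nodes = gpu_count / GPUS_PER_NODE
    return nodes * ASSUMED_NODE_PEAK_KW * OVERHEAD_FACTOR

def required_cooling_tons(power_kw: float) -> float:
    # 1 ton of refrigeration rejects roughly 3.517 kW of heat
    return power_kw / 3.517

for gpus in (1024, 2048):
    kw = required_power_kw(gpus)
    print(f"{gpus} GPUs: ~{kw:,.0f} kW power, ~{required_cooling_tons(kw):,.0f} tons cooling")
```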

Support

  • 24/7 remote hands support

  • Initial response time of 15 minutes or better

  • SLA terms per contract

  • Provide documentation for spare parts strategy

Performance

  • Must meet the minimum 48-hour burn-in requirements
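
The burn-in tooling is left to the operator; the shape of a 48-hour soak loop is sketched below. The stress command shown (gpu_burn) is a placeholder for whatever GPU stress workload or NCCL loop is actually used for acceptance testing.

```python
#!/usr/bin/env python3
"""Sketch of a 48-hour burn-in loop.

STRESS_CMD is a placeholder (gpu_burn here) -- swap in the actual GPU
stress workload used for acceptance testing.
"""
import subprocess
import time

BURN_IN_HOURS = 48
STRESS_CMD = ["./gpu_burn", "3600"]  # hypothetical: one-hour stress pass per iteration

deadline = time.time() + BURN_IN_HOURS * 3600
passes = failures = 0

while time.time() < deadline:
    result = subprocess.run(STRESS_CMD, capture_output=True, text=True)
    passes += 1
    if result.returncode != 0:
        failures += 1
        print(f"pass {passes}: FAILED\n{result.stderr[-500:]}")
    else:
        print(f"pass {passes}: OK")

print(f"burn-in complete: {passes} passes, {failures} failures")
```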

Documentation

  • Full BOM

  • Rack elevation diagrams

  • Network architecture diagram