Instance Types
Cluster Requirements
San Francisco Compute lists clusters from providers that have passed our vetting process and signed a contract with us. That contract includes a strong SLA, which we pass on to you.
For example, here are some of the requirements for listing a training cluster under the h100i instance type. If you're interested in listing a cluster, please contact us for the full requirements.
- 1 TB of RAM minimum
- 2x AMD EPYC 9004 or Intel 4th gen Xeon CPU
- Three types of networks
  - 100 Gbps primary in-band network for orchestration
  - 1 Gbps out-of-band IPMI management network isolated from the primary network
  - 400 Gbps RDMA compute fabric, typically InfiniBand
- At least 1 Gbps per node of internet bandwidth
- Managed switches from an approved list of vendors
- Air or water cooling, no immersion cooling
- Proper 48 hour burn-in
- At least 2 TB of high IOPS local NVMe storage per GPU node
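As a rough illustration, the numeric requirements above could be checked against a node description like this. This is only a sketch: `NodeSpec`, its field names, and `unmet_requirements` are hypothetical and not part of any SFC tooling. It also assumes every figure in the list is a minimum, and it leaves out the CPU model and switch vendor checks because the approved lists are not given here.

```python
from dataclasses import dataclass


@dataclass
class NodeSpec:
    """Hypothetical description of one GPU node; field names are illustrative."""
    ram_tb: float
    in_band_gbps: float
    ipmi_gbps: float
    rdma_gbps: float
    internet_gbps: float
    local_nvme_tb: float
    cooling: str          # e.g. "air", "water", or "immersion"
    burn_in_hours: float


def unmet_requirements(node: NodeSpec) -> list[str]:
    """Return a label for each listed requirement the node fails."""
    # Assumption: every numeric figure in the list is treated as a minimum.
    checks = [
        (node.ram_tb >= 1, "1 TB of RAM"),
        (node.in_band_gbps >= 100, "100 Gbps in-band network"),
        (node.ipmi_gbps >= 1, "1 Gbps out-of-band IPMI network"),
        (node.rdma_gbps >= 400, "400 Gbps RDMA compute fabric"),
        (node.internet_gbps >= 1, "1 Gbps internet bandwidth per node"),
        (node.local_nvme_tb >= 2, "2 TB local NVMe storage"),
        (node.cooling in ("air", "water"), "air or water cooling"),
        (node.burn_in_hours >= 48, "48 hour burn-in"),
    ]
    return [label for passed, label in checks if not passed]


# Made-up example node: too little NVMe storage and immersion cooling.
candidate = NodeSpec(ram_tb=2, in_band_gbps=100, ipmi_gbps=1, rdma_gbps=400,
                     internet_gbps=1, local_nvme_tb=1, cooling="immersion",
                     burn_in_hours=48)
print(unmet_requirements(candidate))
```

An empty list means the node meets every requirement covered by the sketch. The full requirements are only available on request, so a check like this is a first filter at best.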
Getting a test node
In rare cases, we can provide test nodes before you purchase from San Francisco Compute. However, the advantage of SFC is that you can buy whatever configuration you'd like for a short period in order to run tests. We encourage you to try this first, knowing that you're covered by an SLA.
Instance Types
Today, we only support one instance type: h100i. These are clusters of NVIDIA H100s with 3.2 Tb/s InfiniBand, fully interconnected on a single RDMA fabric.
You can purchase one by running:
sf buy -t h100i -n 1 -s 'tomorrow at 10am' -d '1d'
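If you place orders from a script, one way to assemble the same command is sketched below. `build_buy_command` is a hypothetical helper, not part of the `sf` CLI. It uses only the flags shown above, and the parameter names reflect an assumption about what those flags mean (`-t` instance type, `-n` node count, `-s` start time, `-d` duration). The sketch only builds and prints the command; it does not run it.

```python
import shlex


def build_buy_command(instance_type: str, nodes: int, start: str, duration: str) -> list[str]:
    """Build the argument list for the `sf buy` command shown above.

    Parameter names are assumptions about what each flag means.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return ["sf", "buy", "-t", instance_type, "-n", str(nodes), "-s", start, "-d", duration]


argv = build_buy_command("h100i", 1, "tomorrow at 10am", "1d")
# shlex.join quotes arguments that contain spaces, so the result is safe to paste into a shell.
print(shlex.join(argv))
```

Keeping the command as a list of arguments means the start time never needs manual quoting, and the list can be passed directly to `subprocess.run` when you are ready to place a real order.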
Hardware Failures and Refunds
Expect your cluster to break. Failure rates on large-scale GPU clusters are far higher than what you may be used to on web servers. Fear not! When (not if) a portion of your cluster breaks, we will attempt to hot-swap in a replacement. If we can’t, we’ll refund the purchase.
On a normal GPU cloud, that would be it: you would simply have one fewer node. However, SFC is a market, which means that in many cases you can just buy another node with your refund. The price is not guaranteed to be the same as when you bought it, but we think this is a better experience than simply being out of luck.
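To make the price caveat concrete, here is a small arithmetic sketch of how far a refund goes at a new market price. The function name, the dollar amounts, and the per-node-hour pricing unit are all made up for illustration. The sketch says nothing about how SFC actually calculates refunds or quotes prices.

```python
def node_hours_covered(refund_usd: float, price_per_node_hour_usd: float) -> float:
    """Return how many node-hours a refund buys at the given market price."""
    if refund_usd < 0:
        raise ValueError("refund_usd must not be negative")
    if price_per_node_hour_usd <= 0:
        raise ValueError("price_per_node_hour_usd must be positive")
    return refund_usd / price_per_node_hour_usd


# Hypothetical numbers: a 48 USD refund for 24 node-hours bought at 2 USD each.
refund = 48
print(node_hours_covered(refund, 2))  # same price as the original purchase
print(node_hours_covered(refund, 3))  # the market price has risen since then
```

With these made-up numbers, the refund replaces all 24 node-hours at the original price but only 16 if the price has risen to 3 USD.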