Pool GPU capacity -- including spot -- across every AWS account and region into one elastic Kubernetes pool. Cross-account Karpenter provisioning, a free same-AZ east-west fabric, and EFA training islands. It all runs inside your own AWS accounts -- nothing phones home.
us-east-1 is exhausted while spot sits idle in us-west-2 or eu-west-1. Capacity is real -- it is just in a region your cluster cannot reach.
GPU quotas are per-account and per-region. One team hits a wall while another account has headroom -- and Karpenter cannot cross the boundary to use it.
Spot GPUs are up to ~70% cheaper, but availability swings by AZ, region, and instance type. Without pooling across your estate, you pay on-demand or you wait.
Karpenter natively provisions only within one account and region. The CloudSpectra Karpenter provider extends it across accounts and regions -- so a pending GPU pod gets a node wherever capacity actually exists.
Discover and schedule onto spot GPU capacity across every enrolled account, region, AZ, and instance type. Higher fill rate, lower cost, with on-demand fallback for the workloads that need it.
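The spot-first selection with on-demand fallback described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the product's implementation: the `Offer` record, the `pick_offer` function, the account IDs, and every price are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Offer:
    """One hypothetical capacity offer discovered in an enrolled account."""
    account: str
    region: str
    az_id: str
    instance_type: str
    capacity_type: str   # "spot" or "on-demand"
    hourly_usd: float    # illustrative price, not a real quote
    available: int       # nodes that could be launched right now


def pick_offer(offers: Iterable[Offer], nodes_needed: int,
               allow_on_demand: bool = True) -> Optional[Offer]:
    """Prefer the cheapest spot offer that fits; fall back to on-demand."""
    fits = [o for o in offers if o.available >= nodes_needed]
    spot = [o for o in fits if o.capacity_type == "spot"]
    if spot:
        return min(spot, key=lambda o: o.hourly_usd)
    if allow_on_demand:
        on_demand = [o for o in fits if o.capacity_type == "on-demand"]
        if on_demand:
            return min(on_demand, key=lambda o: o.hourly_usd)
    return None  # nothing fits, so the pod would stay pending


# Hypothetical estate: spot is exhausted in us-east-1 but idle in us-west-2.
offers = [
    Offer("111111111111", "us-east-1", "use1-az1", "p4d.24xlarge", "spot", 11.0, 0),
    Offer("222222222222", "us-west-2", "usw2-az2", "p4d.24xlarge", "spot", 12.5, 4),
    Offer("111111111111", "us-east-1", "use1-az1", "p4d.24xlarge", "on-demand", 32.0, 8),
]
choice = pick_offer(offers, nodes_needed=2)
print(choice.region, choice.capacity_type)  # us-west-2 spot
```

A request for six nodes would exceed the spot offer in this toy data and fall back to on-demand, unless `allow_on_demand=False` is passed, in which case the function returns `None`.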
One per-workload intent -- throughput, tightly-coupled, or balanced -- compiles to the right Kubernetes labels and affinity. AZ-IDs are normalized across accounts so locality is physically true, not just name-matched.
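To make the intent compilation and AZ-ID normalization concrete, here is a hedged sketch. AWS maps AZ names such as `us-east-1a` to physical AZ-IDs such as `use1-az1` independently per account, which is why name matching alone can be misleading. Everything else here is an assumption for illustration: the mapping table, the `fabric.example.com/regime` label key, the choice of selectors for each intent, and the presence of a `topology.k8s.aws/zone-id` label on nodes.

```python
from typing import Dict, Optional, Tuple

# Hypothetical per-account mapping from AZ name to physical AZ-ID.
AZ_ID_BY_ACCOUNT: Dict[Tuple[str, str], str] = {
    ("111111111111", "us-east-1a"): "use1-az1",
    ("222222222222", "us-east-1a"): "use1-az2",
    ("222222222222", "us-east-1b"): "use1-az1",
}

ZONE_ID_LABEL = "topology.k8s.aws/zone-id"   # assumed to be set on nodes
REGIME_LABEL = "fabric.example.com/regime"   # hypothetical label key


def normalize_az(account: str, az_name: str) -> str:
    """Translate an account-local AZ name into its physical AZ-ID."""
    return AZ_ID_BY_ACCOUNT[(account, az_name)]


def compile_intent(intent: str, az_id: Optional[str] = None) -> dict:
    """Turn a workload intent into a pod-spec fragment (illustrative only)."""
    if intent == "tightly-coupled":
        if az_id is None:
            raise ValueError("tightly-coupled intent needs a target AZ-ID")
        # Hard pin: every pod must land in the same physical AZ.
        return {"nodeSelector": {REGIME_LABEL: "efa-island",
                                 ZONE_ID_LABEL: az_id}}
    if intent == "throughput":
        # No locality constraint: take capacity wherever it exists.
        return {"nodeSelector": {REGIME_LABEL: "peering"}}
    if intent == "balanced":
        spec: dict = {"nodeSelector": {REGIME_LABEL: "peering"}}
        if az_id is not None:
            # Soft preference for one AZ, without blocking scheduling.
            spec["affinity"] = {"nodeAffinity": {
                "preferredDuringSchedulingIgnoredDuringExecution": [{
                    "weight": 50,
                    "preference": {"matchExpressions": [{
                        "key": ZONE_ID_LABEL,
                        "operator": "In",
                        "values": [az_id],
                    }]},
                }],
            }}
        return spec
    raise ValueError(f"unknown intent: {intent!r}")


# Different AZ names in two accounts, same physical zone.
a = normalize_az("111111111111", "us-east-1a")
b = normalize_az("222222222222", "us-east-1b")
print(a == b)  # True
```

In this toy mapping, `us-east-1a` in both accounts would look co-located by name but resolves to two different AZ-IDs, which is the mismatch that normalization is meant to catch.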
Tightly-coupled training lands in a single-AZ shared VPC with a cluster placement group and EFA, so NCCL all-reduce runs over RDMA at full fabric speed. Multi-account GPU nodes join one shared subnet via AWS RAM -- not peering.
Inference, batch, sweeps, and data movement ride an automated full-mesh VPC-peering fabric -- with route propagation across accounts and regions. Same-AZ east-west traffic is $0/GB, versus the $0.02/GB Transit Gateway processing fee.
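As a quick worked example of the pricing comparison above, the sketch below applies the two per-GB rates quoted in the text to a hypothetical monthly volume. The 500 TB figure is an assumption chosen for illustration. Cross-AZ and cross-region transfer charges are deliberately not modeled.

```python
# Rates as quoted in the text above; actual pricing may vary by region.
TGW_PROCESSING_USD_PER_GB = 0.02
SAME_AZ_PEERING_USD_PER_GB = 0.0


def transfer_cost_usd(gigabytes: float, usd_per_gb: float) -> float:
    """Data-processing cost for a given volume at a flat per-GB rate."""
    return gigabytes * usd_per_gb


monthly_gb = 500_000  # hypothetical: 500 TB of same-AZ east-west traffic

tgw = transfer_cost_usd(monthly_gb, TGW_PROCESSING_USD_PER_GB)
peering = transfer_cost_usd(monthly_gb, SAME_AZ_PEERING_USD_PER_GB)

print(f"Transit Gateway processing: ${tgw:,.2f}")   # $10,000.00
print(f"Same-AZ VPC peering:        ${peering:,.2f}")  # $0.00
```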
Fabric pools your own GPUs in your own accounts -- your quotas, your reservations, your spend. Not a third-party GPU cloud. No training data, model weights, or traffic ever leave your trust boundary.
+----------------------------------------------------------------------+
|                 CloudSpectra Fabric -- Control Plane                 |
|       Cross-Account Karpenter | Transit Manager                      |
|       K8s Control Plane       | Placement Policy (AZ-ID aware)       |
+----------------------------------------------------------------------+
                |                                    |
    tightly-coupled training            loosely-coupled / inference
                |                                    |
+---------------+---------------+  +-----------------+-----------------+
| EFA Training Island           |  | Free Peering Capacity Fabric      |
| Shared VPC (AWS RAM), 1 AZ    |  | Routed TCP / ENA over peering     |
| Cluster placement group + EFA |  | Accounts x Regions, $0/GB same-AZ |
|                               |  |                                   |
| GPU nodes: Acct A + Acct B    |  | Inference / batch / HPO sweeps    |
| NCCL all-reduce over SRD/RDMA |  | Spot pooled where available       |
+-------------------------------+  +-----------------------------------+
CloudSpectra Fabric gives your workloads reach and placement: it finds GPU capacity across your accounts and regions, wires the network, and lands each job in the right regime. It does not -- and an orchestration layer should not pretend to -- set your spot bidding strategy, rightsize your fleet, or manage Savings Plans and Reserved Instances. And to be precise about the hardware: EFA traffic is non-routable and cannot cross VPCs, AZs, or regions, so the highest-end tightly-coupled training runs inside a single-AZ shared-VPC island -- never over the cross-region peering fabric. The fabric carries everything else. That honesty is the point: it is why the behavior matches the diagram.
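The EFA constraint stated above can be expressed as a simple eligibility check. This is a sketch only: the `Node` record, the `route_regime` function, and the regime names are hypothetical, and the rule encoded is just the one in the text (one VPC, one AZ, with accounts free to differ because the VPC is shared).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Node:
    """Hypothetical view of a GPU node for placement purposes."""
    name: str
    account: str
    vpc_id: str
    az_id: str   # physical AZ-ID, which also implies the region


def route_regime(nodes: Sequence[Node]) -> str:
    """Return 'efa-island' only if all nodes share one VPC and one AZ-ID."""
    if not nodes:
        raise ValueError("at least one node is required")
    same_vpc = len({n.vpc_id for n in nodes}) == 1
    same_az = len({n.az_id for n in nodes}) == 1
    return "efa-island" if same_vpc and same_az else "peering"


# Two accounts, one shared VPC, one physical AZ: eligible for EFA.
island = [
    Node("gpu-a1", "111111111111", "vpc-shared", "use1-az1"),
    Node("gpu-b1", "222222222222", "vpc-shared", "use1-az1"),
]
# Add a node in another region and VPC: the job falls back to the peering fabric.
spread = island + [Node("gpu-c1", "222222222222", "vpc-west", "usw2-az2")]

print(route_regime(island))  # efa-island
print(route_regime(spread))  # peering
```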