The Helios supercomputer is a heterogeneous system with three partitions: CPU, GPU, and INT. Each partition uses AMD or Nvidia chips to meet specific workload requirements. The system has a total of 75,264 Zen 4 cores, 440 Nvidia GH200 Grace Hopper Superchips, and 24 Nvidia H100 Tensor Core GPUs, providing a theoretical peak performance of 35 PFLOPS.
In our first look at the Helios supercomputer, we thought it looked like a big (Epyc)–little (Arm) design. That wasn't entirely accurate, and HPE and Nvidia have since clarified: it is not a big-little CPU arrangement but a CPU-only partition (Epyc) alongside an accelerated partition built on GH200 Superchips, each of which combines a Grace CPU and a Hopper GPU.
The system was designed to be heterogeneous, with three partitions (two large ones, CPU and GPU, plus a smaller third, INT) to better meet the specific requirements of different workloads. AMD and Nvidia chips power the partitions independently.
The CPU partition (HPE Cray Supercomputing EX) packs 75,264 Zen 4 cores from 4th-generation AMD Epyc processors and 200 TB of DDR5 memory. That partition targets modeling and simulation workloads such as computational fluid dynamics for wind turbines and propellers, car crash simulation testing, and medical workloads such as proton therapy beam simulation for cancer treatment, among others.
The GPU partition (HPE Cray Supercomputing EX) is equipped with 440 Nvidia GH200 Grace Hopper Superchips, targeted at image-intensive computer simulations in materials engineering, solid-state physics, drug discovery, and large-scale AI training, such as generative AI.
The INT partition (HPE Cray Supercomputing XD665) is equipped with 24 Nvidia H100 Tensor Core GPUs and high-speed local NVMe storage. It targets interactive work with big data, AI model tuning, and applications that use AI for inference. In each GH200 Superchip of the GPU partition, the Grace CPU, with 72 of Arm's highest-performing Neoverse V2 cores, is connected to the Hopper GPU over the coherent, high-bandwidth 900 GB/s NVLink-C2C. This link delivers 7× the bandwidth of an x16 PCIe Gen 5 connection and gives the GPU access to almost 600 GB of memory.
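The 7× figure is easy to sanity-check from nominal link rates. A back-of-envelope sketch, assuming PCIe Gen 5 signals at 32 GT/s per lane and that the 900 GB/s NVLink-C2C figure counts both directions (encoding and protocol overheads ignored):

```python
# Back-of-envelope check of the NVLink-C2C vs. PCIe Gen 5 bandwidth claim.
# Nominal values only; real links lose a little to encoding/protocol overhead.

PCIE_GEN5_GT_PER_LANE = 32          # GT/s per lane (raw signaling rate)
LANES = 16                          # x16 connection
BITS_PER_BYTE = 8

# Per-direction bandwidth in GB/s, doubled for bidirectional traffic
pcie_x16_bidir = PCIE_GEN5_GT_PER_LANE * LANES / BITS_PER_BYTE * 2  # 128 GB/s

nvlink_c2c = 900                    # GB/s, bidirectional (published figure)

print(f"PCIe Gen 5 x16 (bidirectional): {pcie_x16_bidir:.0f} GB/s")
print(f"NVLink-C2C advantage: {nvlink_c2c / pcie_x16_bidir:.1f}x")
```

The ratio comes out at roughly 7, matching the quoted figure.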
Workloads are not strictly tied to these partitions: some scientific computing workloads run on the GPU-accelerated partitions, and AI runs very well on both accelerated partitions.
The operating system used on Helios is Red Hat OpenShift (RHOS). Memory is not shared between nodes and partitions, but all nodes/partitions use the same shared parallel file system storage (for both scratch and long-term purposes).
The new system provides a theoretical peak performance of 35 PFLOPS, making it the fastest supercomputer in Poland among systems on the TOP500 list.
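The 35 PFLOPS figure can be roughly reconstructed from public per-chip peaks. A hedged sketch, assuming a 2.4 GHz base clock and 16 FP64 FLOPs/cycle per Zen 4 core, and the published 67 TFLOPS FP64 Tensor Core peak for the H100 (also the Hopper die in each GH200); actual SKUs and clocks in this installation may differ:

```python
# Rough reconstruction of Helios's ~35 PFLOPS theoretical peak.
# All per-chip figures below are assumptions from public spec sheets,
# not confirmed details of this installation.

ZEN4_CORES = 75_264
ZEN4_CLOCK_HZ = 2.4e9               # assumed base clock
ZEN4_FP64_FLOPS_PER_CYCLE = 16      # assumed AVX-512 FMA throughput per core

H100_FP64_TENSOR_TFLOPS = 67        # published FP64 Tensor Core peak
GH200_COUNT = 440                   # GPU partition
H100_COUNT = 24                     # INT partition

cpu_pf = ZEN4_CORES * ZEN4_CLOCK_HZ * ZEN4_FP64_FLOPS_PER_CYCLE / 1e15
gpu_pf = GH200_COUNT * H100_FP64_TENSOR_TFLOPS / 1e3
int_pf = H100_COUNT * H100_FP64_TENSOR_TFLOPS / 1e3

total = cpu_pf + gpu_pf + int_pf
print(f"CPU {cpu_pf:.1f} + GPU {gpu_pf:.1f} + INT {int_pf:.1f} "
      f"= {total:.1f} PFLOPS")      # lands close to the quoted 35 PFLOPS
```

Under these assumptions the total comes out near 34 PFLOPS, within a few percent of the quoted peak, with the GH200 partition contributing the large majority.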
FYI: in 2012, another Helios supercomputer went online at the International Fusion Energy Research Centre (IFERC), hosted by the Japan Atomic Energy Agency (JAEA) in Rokkasho, Japan. It achieved 1.132 PFLOPS of LINPACK performance with 8,820 Intel Xeon E5-2600 processors and 380 Intel Xeon Phi coprocessors.