SiFive has introduced two RISC-V CPU cores aimed at high-performance and AI/ML applications. The Performance P870, a 64-bit out-of-order superscalar processor, implements the open-standard RISC-V instruction set architecture and is Linux- and Android-compatible. SiFive says the P870 offers a 50% performance boost over its predecessor and scores 18 per GHz in the SPECint2K6 benchmark, with clock frequencies expected to exceed 3GHz. The second core, the X390, is tailored for AI and ML tasks and delivers four times the vector performance of the previous version. The announcement comes as US lawmakers discuss extending semiconductor export controls to RISC-V, a move that the CEO of RISC-V International has warned could lead to incompatible solutions.
What do we think? SiFive’s co-founders are the inventors of RISC-V and early supporters of RISC-V International, founded in 2015. The ISA came out of Professor David Patterson’s group at the University of California, Berkeley, in 2010 and has been adopted by several organizations, but as of yet, it is not being employed by DHL (Dell, HP, and Lenovo) in consumer or workstation systems. If AMD or Intel were to offer a RISC-V processor, that might change, but it’s unlikely they will as long as the design is an open standard and they can’t add proprietary touches to it for market differentiation.
SiFive announces pair of RISC-V offerings targeting high-performance compute and AI/ML
SiFive unveiled two RISC-V CPU cores designed for high-performance and AI/ML applications. The first core, named the Performance P870, is a 64-bit out-of-order superscalar processor that implements the open RISC-V instruction set. It is fully compatible with operating systems like Linux and Android. It also features an I/O memory management unit (IOMMU) and hardware support for hypervisors, and it includes SiFive’s WorldGuard, which provides critical code and data isolation within a system-on-chip (SoC).
Compared to the P670, its predecessor, the P870 delivers a substantial 50% boost in performance. SiFive claims this translates to a score of 18 per GHz in the SPECint2K6 benchmark. Although SiFive hasn’t disclosed the core’s maximum clock frequency, Brad Burgess, a SiFive representative, suggested during the presentation at Hot Chips that it could reach “well in the 3GHz range.”
The performance gain appears to stem primarily from the core’s six-wide out-of-order dispatch, up from the four-wide configuration used in the P670. In terms of vector extensions, the P870 retains dual 128-bit units.
Similar to previous SiFive Performance cores, the P870 is intended for deployment in clusters that share an L2 cache pool. This design feature allows the core to support SoCs with up to 32 cores, doubling the capacity compared to its predecessor. During the Hot Chips presentation, SiFive illustrated a possible SoC configuration with eight four-core clusters interconnected through SiFive’s multi-cluster architecture.
These clusters don’t have to be homogeneous, so the P870 can be paired with a cluster of lower-power, efficiency-focused cores, like SiFive’s previously introduced P470 design. Alternatively, they can be combined with SiFive’s Intelligence family of vector-optimized cores.
The second core introduced, named the X390, is the successor to SiFive’s X280 core. It belongs to the company’s Intelligence series, which specializes in accelerating the large-vector instructions commonly used in AI and machine learning applications. The X280, for example, supports 512-bit-wide vector registers, comparable in width to the AVX-512 registers on Intel and AMD processors. However, SiFive has opted to implement these wide vector units in a separate core rather than integrating them into its general-purpose cores.
The newly unveiled X390 offers a significant improvement, quadrupling the vector performance of its predecessor. This achievement is primarily the result of doubling the vector length, enabling the core to manage up to 1,024-bit-long vector registers (VLEN) with 512-bit-long data paths (DLEN). While the specific data types supported by the X390 were not disclosed in advance of the launch, the X280 was known to be compatible with data types such as Int8, Int16, Int32, FP16, FP32, and FP64, along with Q8.8 to Q15 fixed-point data types, rendering it highly suitable for AI and ML workloads.
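To make the VLEN and DLEN figures concrete, the short sketch below works out how many elements of each width fit in one 1,024-bit vector register, and how many passes a 512-bit data path would need to cover a full register. It is illustrative arithmetic only: the element widths are taken from the X280 data types listed above (the X390's supported types were not disclosed), the function names are hypothetical, and the pass count assumes a simple model in which one register is processed DLEN bits at a time, ignoring register grouping and other microarchitectural details.

```python
# Illustrative arithmetic only; not a model of SiFive's actual hardware.
# VLEN and DLEN are the figures quoted in the article for the X390.
VLEN_BITS = 1024  # vector register length
DLEN_BITS = 512   # data path width

# Element widths for the X280 data types named in the article
# (assumption: used here only as examples; X390 types were not disclosed).
ELEMENT_WIDTHS = {
    "Int8": 8,
    "Int16": 16,
    "Int32": 32,
    "FP16": 16,
    "FP32": 32,
    "FP64": 64,
}


def elements_per_register(vlen_bits: int, element_bits: int) -> int:
    """Number of elements of the given width that fit in one vector register."""
    return vlen_bits // element_bits


def datapath_passes(vlen_bits: int, dlen_bits: int) -> int:
    """Passes needed to cover one full register, assuming DLEN bits per pass."""
    return -(-vlen_bits // dlen_bits)  # ceiling division


def lanes_table(vlen_bits: int = VLEN_BITS) -> dict:
    """Map each data type name to its element count per vector register."""
    return {
        name: elements_per_register(vlen_bits, width)
        for name, width in ELEMENT_WIDTHS.items()
    }


if __name__ == "__main__":
    for name, count in lanes_table().items():
        print(f"{name}: {count} elements per {VLEN_BITS}-bit register")
    passes = datapath_passes(VLEN_BITS, DLEN_BITS)
    print(f"Passes through a {DLEN_BITS}-bit data path: {passes}")
```

Calling `lanes_table(512)` gives the same breakdown for the X280's 512-bit registers, which shows how doubling the vector length doubles the number of elements each register holds.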
The core’s design accommodates either a single vector arithmetic logic unit (ALU) or two, and it integrates SiFive’s Vector Coprocessor Interface Extension (VCIX), which allows developers to implement their own vector instructions and acceleration hardware.
Separately, a bipartisan group of US lawmakers has called for export controls on semiconductors bound for China to be extended to cover the open-standard RISC-V ISA. In response, Calista Redmond, CEO of RISC-V International, has cautioned politicians that such measures could lead to the development of incompatible solutions and potentially stifle innovation within the industry.