Article information

2026, Volume 31, No. 2, p. 104-119

Lylova S.S., Vlasov A.A., Latkin E.I., Znamenskiy I.I.

Performance study of dense matrix multiplication in the OpenBLAS library on the RISC-V architecture using vector instructions

Purpose. The purpose of the work is to evaluate the effectiveness of the OpenBLAS library when performing dense matrix multiplication (GEMM) on RISC-V processors that support the vector extension.

Methodology. The research was conducted through numerical experiments on two single-board computers with RISC-V processors (Lichee Pi 4A with a T-Head TH1520 CPU and Banana Pi BPI-F3 with a SpacemiT K1 CPU) and a laptop with an x86-64 processor (AMD Ryzen 7 5800H). The performance of the OpenBLAS sgemm function was compared with minigemm, a custom optimized matrix multiplication routine designed specifically for the RISC-V vector extension (RVV) that handles matrices with non-unit strides in both dimensions.
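To illustrate what multiplying matrices with a non-unit step (stride) in both dimensions means, here is a minimal scalar reference sketch. It is not the authors' minigemm and uses no vector instructions; the function name `gemm_strided` and its parameter layout are hypothetical, chosen only for this illustration. Each matrix is a flat buffer in which element (i, j) sits at offset `i * row_stride + j * col_stride`.

```python
def gemm_strided(m, n, k, a, a_rs, a_cs, b, b_rs, b_cs, c, c_rs, c_cs):
    """Reference C += A * B on flat buffers with arbitrary row/column strides.

    Element (i, j) of a matrix X is stored at x[i * x_rs + j * x_cs].
    Hypothetical helper for illustration; not an OpenBLAS or minigemm API.
    """
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i * a_rs + p * a_cs] * b[p * b_rs + j * b_cs]
            c[i * c_rs + j * c_cs] += acc
    return c


# A is 2x3, embedded in a wider buffer: row stride 6, column stride 2
# (non-unit in both dimensions); the zeros are padding that is never read.
a = [1.0, 0.0, 2.0, 0.0, 3.0, 0.0,
     4.0, 0.0, 5.0, 0.0, 6.0, 0.0]
# B is 3x2 stored column-major: row stride 1, column stride 3.
b = [7.0, 9.0, 11.0,
     8.0, 10.0, 12.0]
# C is 2x2 stored row-major and zero-initialized.
c = gemm_strided(2, 2, 3, a, 6, 2, b, 1, 3, [0.0] * 4, 2, 1)
print(c)  # [58.0, 64.0, 139.0, 154.0]
```

A scalar loop like this is useful mainly as a correctness baseline against which an optimized kernel can be checked.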

Findings. The results revealed a significant performance gap. While the x86-64 implementation reached 80–90 % or more of its theoretical peak performance, the efficiency of sgemm from OpenBLAS on the RISC-V boards was only 30–40 %. Moreover, the custom minigemm implementation outperformed OpenBLAS by almost a factor of two for certain matrix sizes. The main bottleneck is assumed to be the memory subsystem, whose bandwidth on the RISC-V processors was lower than on the x86-64 system; this seriously limited the achievable performance despite the computing potential of the processor cores.
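The efficiency figures above are fractions of theoretical peak. As a hedged sketch of how such a figure is commonly derived, the snippet below counts 2·M·N·K floating-point operations per GEMM call and divides the achieved rate by the peak. The roofline-style bound is a standard model brought in here only to illustrate how memory bandwidth can cap performance; it is not stated to be the authors' method. All function names and numeric inputs are hypothetical and are not measurements from the paper.

```python
def gemm_flops(m, n, k):
    """Floating-point operations in C = A * B: one multiply and one add per inner step."""
    return 2 * m * n * k


def efficiency(m, n, k, seconds, peak_gflops):
    """Fraction of the theoretical peak reached by one timed GEMM call."""
    achieved_gflops = gemm_flops(m, n, k) / seconds / 1e9
    return achieved_gflops / peak_gflops


def roofline_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Roofline upper bound: the lower of the compute limit and the bandwidth limit."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)


# Made-up inputs for illustration only; they are not results from the paper.
print(efficiency(1000, 1000, 1000, seconds=0.5, peak_gflops=16.0))  # 0.25
print(roofline_gflops(peak_gflops=16.0, bandwidth_gbs=4.0, flops_per_byte=2.0))  # 8.0
```

In the second call the bandwidth term is the smaller one, so under these assumed numbers the kernel would be memory-bound no matter how fast the vector units are.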

Originality/value. This work is a critical analysis of the performance of a fundamental computing operation on available RISC-V hardware. It emphasizes that the current OpenBLAS parameters are not optimal for these specific RISC-V processors and demonstrates that the memory subsystem, rather than the vector computing units, is currently the main limiting factor.


Keywords: RISC-V, vector instructions, matrix multiplication, GEMM, OpenBLAS, high performance

doi: 10.25743/ICT.2026.31.2.008

Author(s):
Lylova Sofia Sergeevna
Position: Master student
Office: Novosibirsk State University
Address: 630090, Russia, Novosibirsk, Pirogova str., 2

Vlasov Alexander Alexandrovich
PhD, Associate Professor
Office: Novosibirsk State University, Institute of Automation and Electrometry SB RAS
Address: 630090, Russia, Novosibirsk, Pirogova str., 2
E-mail: a.vlasov@nsu.ru

Latkin Evgeny Ivanovich
Position: Leading Engineer
Office: KNS Group LLC
Address: 123376, Russia, Moscow, Rochdelskaya str., 15, building 15
E-mail: eugene_latkin@mail.ru

Znamenskiy Ilya Igorevich
Position: Leading Engineer
Office: KNS Group LLC
Address: 123376, Russia, Moscow, Rochdelskaya str., 15, building 15
E-mail: mulz@mail.ru


Bibliography link:
Lylova S.S., Vlasov A.A., Latkin E.I., Znamenskiy I.I. Performance study of dense matrix multiplication in the OpenBLAS library on the RISC-V architecture using vector instructions // Computational technologies. 2026. V. 31. No. 2. P. 104-119
ISSN 1560-7534
© 2026 FRC ICT