Evaluating the performance of Java Vector API in vector embedding operations

Information Technologies
Authors:
Abstract:

Hardware vector (SIMD) instructions are widely used to speed up computations. The Java Vector API, introduced as an incubating feature in Java 16, allows using them portably on any platform supported by the Java Virtual Machine (JVM). In this paper, we evaluate the performance benefits of rewriting typical vector search operations, such as computing the distance between two vector embeddings, with the Java Vector API. We compare these vectorized implementations with semantically equivalent scalar code. We also compare the Java Vector API with native C++ implementations called from Java through different Java-to-native interfaces: the Java Native Interface (JNI), Project Panama (the Foreign Function and Memory API), and direct manipulation of the JIT compiler through the JVM Compiler Interface (JVMCI) and the Nalim library. The benchmarking results suggest that in certain situations the Vector API yields a measurable speedup of low-level operations, which translates into a speedup of high-level algorithms such as Product Quantization. In some scenarios, however, the Vector API is slower than the automatic vectorization performed by the JVM, and most benchmarks suggest that invoking computations implemented in C++ is faster even after accounting for the overhead of native code invocations. Techniques that lower this overhead, for example avoiding memory copy operations, can reduce the execution time by a factor of five compared to the Vector API and by a factor of ten compared to plain Java code. Nevertheless, where the use of native code is prohibited, the Vector API can still deliver a noticeable performance uplift, which benefits vector-related computations in Java applications.