A page-based approach for storing vector embeddings

Intelligent Systems and Technologies, Artificial Intelligence
Authors:
Abstract:

This study proposes a page-based approach to organizing the storage of vector embeddings, combined with general-purpose lossless compression algorithms. The approach groups vector embeddings into pages with a configurable number of entries, each page holding the embeddings together with all necessary metainformation, and then compresses the page files using a general-purpose compression algorithm. Both the page size and the specific compression algorithm are configurable, which makes it possible to balance retrieval speed against storage efficiency. Experiments on three datasets, including PyEmb-50GB with more than 28 million dense vector embeddings, showed that the proposed solution reduces the occupied disk space by 14–40% compared to existing storage formats such as ORC and Parquet, and by up to a factor of two compared to SQLite and H2. In addition, the proposed approach achieves vector retrieval times comparable to those of SQLite and H2, and a hundred times faster than ORC and Parquet. The results indicate that the storage size decreases logarithmically as the page size increases, while the retrieval time increases linearly. The proposed storage format supports thread-safe vector access and reduces both the required disk space and the retrieval time, making it a robust solution for large-scale vector data management. It can also be used in approximate nearest neighbor search, provided that the vector embeddings are sharded appropriately between pages.
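
To make the page-based idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a hypothetical page layout (a length-prefixed JSON header holding the entry ids and dimensionality, followed by packed float32 values) and uses zlib as a stand-in for whichever general-purpose lossless compressor is selected. The names write_pages and read_vector, the file naming scheme, and the default page_size are illustrative assumptions.

```python
import json
import struct
import zlib
from pathlib import Path


def write_pages(vectors, out_dir, page_size=1000, level=6):
    """Split (id, vector) pairs into pages and compress each page file.

    Hypothetical layout per page (before compression):
    4-byte header length | JSON header (ids, dim) | packed float32 values.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for start in range(0, len(vectors), page_size):
        chunk = vectors[start:start + page_size]
        dim = len(chunk[0][1])
        # Metainformation travels with the page, so a page is self-describing.
        header = json.dumps(
            {"ids": [vid for vid, _ in chunk], "dim": dim}
        ).encode("utf-8")
        payload = b"".join(struct.pack(f"<{dim}f", *vec) for _, vec in chunk)
        raw = struct.pack("<I", len(header)) + header + payload
        path = out_dir / f"page_{start // page_size:06d}.bin"
        # zlib is an assumption here; any lossless compressor could be swapped in.
        path.write_bytes(zlib.compress(raw, level))
        paths.append(path)
    return paths


def read_vector(out_dir, index, page_size=1000):
    """Fetch one vector by global index: locate its page, decompress, slice."""
    path = Path(out_dir) / f"page_{index // page_size:06d}.bin"
    raw = zlib.decompress(path.read_bytes())
    (header_len,) = struct.unpack_from("<I", raw, 0)
    header = json.loads(raw[4:4 + header_len])
    dim = header["dim"]
    slot = index % page_size
    # Each float32 occupies 4 bytes, so entries sit at fixed offsets.
    offset = 4 + header_len + slot * dim * 4
    vec = struct.unpack_from(f"<{dim}f", raw, offset)
    return header["ids"][slot], list(vec)
```

In this sketch every read decompresses a whole page, which illustrates why a larger page_size can be expected to improve the compression ratio while increasing per-vector retrieval time. Because read_vector keeps no shared mutable state, concurrent reads from several threads do not interfere with each other.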