Managing the Vector Index Cache
The vector retrieval engine is written in C++ and uses off-heap memory. Use the following APIs to manage the index cache.
View cache statistics.
GET /_vector/stats
The vector plug-in handles a vector index like any other Lucene index: each segment builds and stores its own index file. At query time, the index file is loaded into off-heap memory, which the plug-in manages through a cache. Use this API to check the off-heap memory usage, the number of cache hits, and the number of times index files have been loaded.
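As a rough illustration, the statistics request could be issued from Python with only the standard library. This is a sketch under stated assumptions, not part of the plug-in: the address http://localhost:9200, the lack of authentication, and the helper names stats_request and fetch_stats are all hypothetical, and the response is assumed to be a JSON document whose field names are not interpreted here.

```python
import json
import urllib.request

# Assumption: the cluster is reachable here without authentication.
BASE_URL = "http://localhost:9200"


def stats_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the GET /_vector/stats request without sending it."""
    return urllib.request.Request(f"{base_url}/_vector/stats", method="GET")


def fetch_stats(base_url: str = BASE_URL) -> dict:
    """Send the request and decode the body, assuming it is JSON.

    The response field names are not assumed; the decoded document
    is returned unchanged for the caller to inspect.
    """
    with urllib.request.urlopen(stats_request(base_url), timeout=10) as resp:
        return json.load(resp)
```

Calling fetch_stats() against a live cluster and printing the result with json.dumps(result, indent=2) is one way to watch memory usage and cache hits change over time.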
Preload the vector index.
PUT /_vector/warmup/{index_name}
Use this API to preload the vector index specified by index_name into off-heap memory so that it is already cached when queries arrive.
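The warm-up call might be scripted as in the following sketch, for example to preload an index right after a cluster restart. The address http://localhost:9200 and the helper name warmup_request are assumptions for illustration, and my_index in the usage note is a hypothetical index name.

```python
import urllib.parse
import urllib.request

# Assumption: the cluster is reachable here without authentication.
BASE_URL = "http://localhost:9200"


def warmup_request(index_name: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the PUT /_vector/warmup/{index_name} request without sending it."""
    if not index_name:
        raise ValueError("index_name must not be empty")
    # Percent-encode the name so unusual characters cannot alter the path.
    encoded = urllib.parse.quote(index_name, safe="")
    return urllib.request.Request(f"{base_url}/_vector/warmup/{encoded}", method="PUT")
```

To send it, pass the result to urllib.request.urlopen, for example urllib.request.urlopen(warmup_request("my_index"), timeout=60). A generous timeout is a cautious choice here, since loading a large index may take a while.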
Clear the cache.
PUT /_vector/clear/cache
PUT /_vector/clear/cache/{index_name}
The cache limits how much off-heap memory vector indexes can use. When the total index size exceeds the cache size limit, index entries are swapped in and out of the cache, which degrades query performance. Use this API to clear cached indexes that are no longer needed, either for all indexes or for the index specified by index_name, to protect query performance for indexes that hold hot data.
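Both forms of the clear-cache request can be covered by one small helper, sketched below. The address http://localhost:9200 and the helper name clear_cache_request are assumptions for illustration, and cold_index in the usage note is a hypothetical index name.

```python
import urllib.parse
import urllib.request
from typing import Optional

# Assumption: the cluster is reachable here without authentication.
BASE_URL = "http://localhost:9200"


def clear_cache_request(
    index_name: Optional[str] = None, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build a PUT /_vector/clear/cache request without sending it.

    With no index_name the request targets the path without an index;
    otherwise the percent-encoded index name is appended to the path.
    """
    path = "/_vector/clear/cache"
    if index_name:
        path += "/" + urllib.parse.quote(index_name, safe="")
    return urllib.request.Request(base_url + path, method="PUT")
```

For example, urllib.request.urlopen(clear_cache_request("cold_index")) would send the per-index form, while clear_cache_request() builds the form without an index name.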