Configuring Cache

After data is dumped to OBS, some of it is cached to reduce the number of OBS requests and improve Elasticsearch query performance. The first time data is requested, it is read from OBS and then placed in the cache. Subsequent queries look in the cache first and go to OBS only if the data is not there. The cache can be held in memory or in files on disk.
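The lookup order described above can be sketched as a small read-through cache. This is only an illustration of the idea, not the actual cache implementation: the class name ReadThroughCache, the fake_obs_fetch function, and the (file name, block index) key are all hypothetical.

```python
from typing import Callable, Dict, Tuple


class ReadThroughCache:
    """Illustrative read-through cache: check the cache first, then the backing store."""

    def __init__(self, fetch: Callable[[str, int], bytes]) -> None:
        self._fetch = fetch  # stands in for a read from OBS (hypothetical)
        self._blocks: Dict[Tuple[str, int], bytes] = {}
        self.hits = 0
        self.misses = 0

    def read(self, file_name: str, block_index: int) -> bytes:
        key = (file_name, block_index)
        if key in self._blocks:
            # Subsequent requests are served from the cache.
            self.hits += 1
            return self._blocks[key]
        # First request: fetch from the backing store, then keep a copy.
        self.misses += 1
        data = self._fetch(file_name, block_index)
        self._blocks[key] = data
        return data


def fake_obs_fetch(file_name: str, block_index: int) -> bytes:
    # Hypothetical stand-in for an OBS read; returns deterministic dummy bytes.
    return f"{file_name}:{block_index}".encode()


cache = ReadThroughCache(fake_obs_fetch)
first = cache.read("_0.fdt", 0)   # miss: fetched from the stand-in store
second = cache.read("_0.fdt", 0)  # hit: served from the cache
print(cache.hits, cache.misses)
```

A real cache also needs an eviction policy and a bounded size, which this sketch leaves out.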

Elasticsearch reads different file types with different access patterns. The cache system therefore supports multi-level caching and uses blocks of different sizes for different files. For example, a large number of small blocks are used to cache .fdx and .tip files, and a small number of large blocks are used to cache .fdt files.

Table 1 Cache configurations

| Parameter | Type | Description |
|---|---|---|
| low_cost.obs.blockcache.names | Array | The cache system supports multi-level caching for data with different access granularities. This parameter lists the names of all caches. If it is not set, the system uses a single cache named default. A customized list must still include a cache named default. Default value: default |
| low_cost.obs.blockcache.<NAME>.type | ENUM | Cache type, which can be memory or file. If set to memory, the cache occupies memory. If set to file, the cache is stored on disk. You are advised to use ultra-high I/O disks to improve cache performance. Default value: memory |
| low_cost.obs.blockcache.<NAME>.blockshift | Integer | Size of each block in the cache. The value is the number of bits by which 1 is shifted left, so the block size is 2^blockshift bytes. For example, if this parameter is set to 16, the block size is 2^16 bytes, that is, 65536 bytes (64 KB). Default value: 13 (8 KB) |
| low_cost.obs.blockcache.<NAME>.bank.count | Integer | Number of cache partitions. Default value: 1 |
| low_cost.obs.blockcache.<NAME>.number.blocks.perbank | Integer | Number of blocks in each cache partition. Default value: 8192 |
| low_cost.obs.blockcache.<NAME>.exclude.file.types | Array | Extensions of files that are not cached. Files whose extensions are in neither the exclude list nor the include list are stored in the default cache. |
| low_cost.obs.blockcache.<NAME>.file.types | Array | Extensions of cached files. Files whose extensions are in neither the exclude list nor the include list are stored in the default cache. |
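The relationship between blockshift, bank.count, and number.blocks.perbank can be checked with a few lines of arithmetic. The sketch below assumes that the total capacity of a cache is block size x partitions x blocks per partition, and that a byte offset maps to a block by a right shift. Both are assumptions made for illustration, and the function names are hypothetical.

```python
def block_size(blockshift: int) -> int:
    """Block size in bytes: 1 shifted left by blockshift bits."""
    return 1 << blockshift


def cache_capacity(blockshift: int, bank_count: int, blocks_per_bank: int) -> int:
    """Assumed total capacity in bytes: block size x partitions x blocks per partition."""
    return block_size(blockshift) * bank_count * blocks_per_bank


def block_index(offset: int, blockshift: int) -> int:
    """Assumed mapping from a byte offset in a file to the block that holds it."""
    return offset >> blockshift


print(block_size(16))               # 65536 bytes (64 KB)
print(block_size(13))               # 8192 bytes (8 KB), the default
print(cache_capacity(13, 1, 8192))  # 67108864 bytes (64 MB) with all defaults
print(block_index(200000, 16))      # 3
```

With all three parameters left at their defaults, this arithmetic gives a 64 MB cache made of 8 KB blocks.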

The following is a common cache configuration. It uses two levels of caches, default and large. The default cache uses 64 KB blocks and has a total of 30 x 4096 blocks. It is used to cache files except .fdt files. The large cache uses 2 MB blocks and contains 5 x 1000 blocks. It is used to cache .fdx, .dvd, and .tip files.

low_cost.obs.blockcache.names: ["default", "large"]
low_cost.obs.blockcache.default.type: file
low_cost.obs.blockcache.default.blockshift: 16
low_cost.obs.blockcache.default.number.blocks.perbank: 4096
low_cost.obs.blockcache.default.bank.count: 30
low_cost.obs.blockcache.default.exclude.file.types: ["fdt"]

low_cost.obs.blockcache.large.type: file
low_cost.obs.blockcache.large.blockshift: 21
low_cost.obs.blockcache.large.number.blocks.perbank: 1000
low_cost.obs.blockcache.large.bank.count: 5
low_cost.obs.blockcache.large.file.types: ["fdx", "dvd", "tip"]
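
To estimate how much disk space the configuration above needs, and to see which cache a given file would use, the settings can be modeled in a short script. This is a sketch under stated assumptions, not the service's routing logic: it assumes that a file goes to the first cache whose include list contains its extension, that a file excluded from the default cache and not included elsewhere is not cached, and that everything else goes to the default cache. The names CACHES, capacity_bytes, and pick_cache are hypothetical.

```python
import os
from typing import Optional

# Values copied from the sample configuration; the dictionary layout is illustrative.
CACHES = {
    "default": {"blockshift": 16, "bank_count": 30, "blocks_per_bank": 4096,
                "include": set(), "exclude": {"fdt"}},
    "large": {"blockshift": 21, "bank_count": 5, "blocks_per_bank": 1000,
              "include": {"fdx", "dvd", "tip"}, "exclude": set()},
}


def capacity_bytes(cfg: dict) -> int:
    """Assumed capacity: block size x partitions x blocks per partition."""
    return (1 << cfg["blockshift"]) * cfg["bank_count"] * cfg["blocks_per_bank"]


def pick_cache(file_name: str) -> Optional[str]:
    """Return the cache name for a file, or None if the file is not cached (assumed rule)."""
    ext = os.path.splitext(file_name)[1].lstrip(".")
    for name, cfg in CACHES.items():
        if ext in cfg["include"]:
            return name
    if ext in CACHES["default"]["exclude"]:
        return None
    return "default"


for name, cfg in CACHES.items():
    print(name, capacity_bytes(cfg))
# default 8053063680   (7.5 GiB)
# large 10485760000    (about 9.77 GiB)

print(pick_cache("_0.tip"))  # large
print(pick_cache("_0.tim"))  # default
print(pick_cache("_0.fdt"))  # None
```

Because both caches in the sample use the file type, roughly 17.3 GiB of disk space would be needed on each node under these assumptions.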

Table 2 Other parameters

| Parameter | Type | Description |
|---|---|---|
| index.frozen.obs.max_bytes_per_sec | String | Maximum rate at which files are uploaded to OBS during freezing. A change takes effect immediately. Default value: 150MB |
| low_cost.obs.index.upload.threshold.use.multipart | String | If a file is larger than this value during freezing, it is uploaded using the OBS multipart upload function. Default value: 1GB |
| index.frozen.reader.cache.expire.duration.seconds | Integer | Timeout duration, in seconds. To reduce the heap memory occupied by frozen indexes, the reader caches data for a period of time after the index shard is started and stops caching after the timeout expires. Default value: 300 (seconds) |
| index.frozen.reader.cache.max.size | Integer | Maximum cache size. Default value: 100 |
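
The effect of the upload threshold and the rate limit can be illustrated with a small calculation. The sketch below assumes that size strings such as 1GB and 150MB use binary units (1 MB = 1024 x 1024 bytes, as in Elasticsearch byte-size settings), that multipart upload is chosen only when the file is strictly larger than the threshold, and that the rate limit is per second. The helper names are hypothetical.

```python
import re

# Binary units, as an assumption about how the size strings are interpreted.
_UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3, "tb": 1024 ** 4}


def parse_size(text: str) -> int:
    """Convert a string such as '150MB' or '1GB' to a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([a-zA-Z]+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    unit = unit.lower()
    if unit not in _UNITS:
        raise ValueError(f"unrecognized unit: {unit!r}")
    return int(number) * _UNITS[unit]


def use_multipart(file_size_bytes: int, threshold: str = "1GB") -> bool:
    """True if the file exceeds the multipart threshold (strict comparison assumed)."""
    return file_size_bytes > parse_size(threshold)


def min_upload_seconds(file_size_bytes: int, max_rate: str = "150MB") -> float:
    """Lower bound on upload time implied by the rate limit alone."""
    return file_size_bytes / parse_size(max_rate)


print(parse_size("1GB"))                     # 1073741824
print(use_multipart(1024 ** 3))              # False (equal to the threshold, not above it)
print(use_multipart(2 * 1024 ** 3))          # True
print(min_upload_seconds(300 * 1024 ** 2))   # 2.0
```

The time estimate is only a lower bound; actual upload time also depends on network conditions and OBS response times.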