Best Practices

This section provides recommendations on configuring common parameters for Kafka producers and consumers.

Table 1 Producer parameters

Parameter

Default Value

Recommended Value

Description

acks

1

all or -1 (if high reliability mode is selected)

1 (if high throughput mode is selected)

Number of acknowledgments the producer requires the server to return before considering a request complete. This controls the durability of records that are sent. Options:

0: The producer will not wait for any acknowledgment from the server at all. The record will be immediately added to the socket buffer and considered sent. No guarantee can be made that the server has received the record, and the retries configuration will not take effect (as the client generally does not know of any failures). The offset given back for each record will always be set to -1.

1: The leader writes the record to its local log and responds without waiting for acknowledgement from the followers. If the leader fails immediately after acknowledging the record but before the followers have replicated it, the record is lost.

all or -1: The leader waits for the full set of in-sync replicas to acknowledge the record. This is the strongest available guarantee: the record is not lost as long as at least one in-sync replica remains alive. min.insync.replicas specifies the minimum number of in-sync replicas that must acknowledge a write for the write to be considered successful.

retries

0

Set as required.

Number of times the client resends a record whose send fails. Setting this parameter to a value greater than zero causes the client to resend any record that fails with a potentially transient error.

Note that this retry is no different than if the client resends the record upon receiving the error. Allowing retries can change the ordering of records when more than one request is in flight to a broker (max.in.flight.requests.per.connection greater than 1): if two batches are sent to the same partition, and the first fails and is retried but the second succeeds, the records in the second batch may appear first.

You are advised to configure producers to retry after network disconnections. Set retries to 3 and the retry interval retry.backoff.ms to 1000.

request.timeout.ms

30000

Set as required.

Maximum amount of time (in ms) the client will wait for the response of a request. If the response is not received before the timeout elapses, the client will throw a timeout exception.

Setting this parameter to a large value, for example, 127000 (127s), can prevent records from failing to be sent in high-concurrency scenarios.

block.on.buffer.full

TRUE

TRUE

When buffer memory is exhausted, the producer must either stop accepting new records (block) or throw an exception. Setting this parameter to TRUE makes the producer block until buffer space becomes available.

By default, this parameter is set to TRUE. However, in some cases, non-blocking usage is desired and it is better to throw an exception immediately. Setting this parameter to FALSE will cause the producer to instead throw "BufferExhaustedException" when buffer memory is exhausted.

batch.size

16384

262144

Default batch size in bytes, that is, the maximum size of a batch of records bound for a single partition. The producer will attempt to batch records together into fewer requests whenever multiple records are being sent to the same partition. This improves performance of both the client and the server. No attempt will be made to batch records larger than this size.

Requests sent to brokers will contain multiple batches, one for each partition with data available to be sent.

A smaller batch size will make batching less common and may reduce throughput (a batch size of zero will disable batching entirely). A larger batch size may use more memory as a buffer of the specified batch size will always be allocated in anticipation of additional records.

buffer.memory

33554432

67108864

Total bytes of memory the producer can use to buffer records waiting to be sent to the server. If records are generated faster than they can be sent to the broker, the producer either blocks or throws an exception, depending on the block.on.buffer.full setting.

This setting should correspond roughly to the total memory the producer will use, but is not a rigid bound since not all memory the producer uses is used for buffering. Some additional memory will be used for compression (if compression is enabled) as well as for maintaining in-flight requests.
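
The recommendations in Table 1 can be collected into one set of producer properties. The following is a minimal sketch, not a definitive configuration: the helper names (producer_settings, to_properties, full_batches_in_buffer, max_retry_wait_ms) are hypothetical, the keys are the Java client property names used in the table, and it assumes a client version that still accepts block.on.buffer.full. It also works out two figures implied by the table: how many full batches fit in the buffer, and the total back-off time if every retry is used.

```python
# Recommended producer settings from Table 1, keyed by Java client property name.
# All helper names in this sketch are illustrative, not part of any Kafka API.

def producer_settings(mode):
    """Return Table 1's recommended values for 'reliability' or 'throughput' mode."""
    if mode not in ("reliability", "throughput"):
        raise ValueError("mode must be 'reliability' or 'throughput'")
    return {
        "acks": "all" if mode == "reliability" else "1",
        "retries": 3,
        "retry.backoff.ms": 1000,
        "request.timeout.ms": 30000,  # default; raise it (e.g. 127000) under high concurrency
        "block.on.buffer.full": "true",
        "batch.size": 262144,
        "buffer.memory": 67108864,
    }


def to_properties(settings):
    """Render the settings as the lines of a .properties file."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())


def full_batches_in_buffer(settings):
    """Upper bound on full batches that fit in buffer.memory (ignores overhead)."""
    return settings["buffer.memory"] // settings["batch.size"]


def max_retry_wait_ms(settings):
    """Total back-off time if every retry is used (excludes request time)."""
    return settings["retries"] * settings["retry.backoff.ms"]


if __name__ == "__main__":
    cfg = producer_settings("reliability")
    print(to_properties(cfg))
    print(full_batches_in_buffer(cfg))  # 67108864 // 262144
    print(max_retry_wait_ms(cfg))       # 3 * 1000
```

With the recommended values, the buffer holds at most 256 full batches, so a producer writing to more partitions than that cannot keep a full batch buffered for each of them. Treat this as a rough sizing check only, since not all producer memory is used for buffering.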

Table 2 Consumer parameters

Parameter

Default Value

Recommended Value

Description

auto.commit.enable

TRUE

FALSE

If this parameter is set to TRUE, the offset of messages already fetched by the consumer is periodically committed to ZooKeeper. If the process fails, the new consumer resumes from this committed offset.

Constraints: If this parameter is set to FALSE, the offset must be committed to ZooKeeper manually, and only after the messages have been successfully consumed, to avoid message loss.

auto.offset.reset

latest

earliest

Indicates what to do when there is no initial offset in ZooKeeper or the current offset no longer exists (for example, because the data has been deleted). Options:

earliest: Automatically reset to the smallest offset.

latest: Automatically reset to the largest offset.

none: The system throws an exception to the consumer if no offset is available.

anything else: The system throws an exception to the consumer.

connections.max.idle.ms

600000

30000

Timeout interval (in ms) for an idle connection. The server closes a connection after it has been idle for this period. Setting this parameter to 30000 can reduce server response failures under poor network conditions.
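
The two consumer behaviors described in Table 2 can be illustrated with a small simulation. This is a sketch under stated assumptions rather than client code: it does not connect to Kafka or ZooKeeper, and the names resolve_start_offset, consume, and NoOffsetError are hypothetical. The first function applies the auto.offset.reset options; the second shows why, with auto.commit.enable set to FALSE, the offset should be committed only after a message has been consumed successfully.

```python
class NoOffsetError(Exception):
    """Hypothetical stand-in for the exception thrown to the consumer."""


def resolve_start_offset(committed, log_start, log_end, policy):
    """Pick the starting offset, following the auto.offset.reset options.

    committed is None when no offset has been stored. Offsets from log_start
    to log_end are still available (log_end is the next offset to be written).
    """
    if committed is not None and log_start <= committed <= log_end:
        return committed          # a valid committed offset always wins
    if policy == "earliest":
        return log_start          # smallest available offset
    if policy == "latest":
        return log_end            # largest offset: only new messages are read
    raise NoOffsetError(f"no usable offset and policy is {policy!r}")


def consume(messages, start, handler):
    """Process messages from start, committing only after the handler succeeds.

    Returns the committed offset, which is where a restarted consumer resumes.
    """
    committed = start
    for offset in range(start, len(messages)):
        try:
            handler(messages[offset])
        except Exception:
            break                 # the failed message stays uncommitted
        committed = offset + 1    # manual commit after successful consumption
    return committed


if __name__ == "__main__":
    def handler(message):
        if message == "bad":
            raise RuntimeError("processing failed")

    start = resolve_start_offset(None, 0, 4, "earliest")
    print(consume(["a", "b", "bad", "c"], start, handler))
```

Because the commit happens after processing, a consumer that restarts resumes at the failed message instead of skipping it. The trade-off is that a message may be processed more than once, so handlers should tolerate duplicates.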