Storm Performance Tuning¶
Scenario¶
You can modify Storm parameters to improve Storm performance in specific service scenarios.
This section applies to MRS 3.x or later.
Modify the service configuration parameters. For details, see Modifying Cluster Service Configuration Parameters.
Topology Tuning¶
This task enables you to optimize topologies to improve efficiency for Storm to process data. It is recommended that topologies be optimized in scenarios with lower reliability requirements.
Parameter | Default Value | Scenario |
---|---|---|
topology.acker.executors | null | Specifies the number of acker executors. If a service application has lower reliability requirements and certain data does not need to be processed, this parameter can be set to null or 0 so that you can set acker off, flow control is weakened, and message delay is not calculated. This improves performance. |
topology.max.spout.pending | null | Specifies the number of messages cached by spout. The parameter value takes effect only when acker is not 0 or null. Spout adds each message sent to downstream bolt into the pending queue. The message is removed from the queue after downstream bolt processes the message and the processing is confirmed. When the pending queue is full, spout stops sending messages. Increasing the pending value improves the message throughput of spout per second but prolongs the delay. |
topology.transfer.buffer.size | 32 | Specifies the size of the Distuptor message queue for each worker process. It is recommended that the size be between 4 to 32. Increasing the queue size improves the throughput but may prolong the delay. |
RES_CPUSET_PERCENTAGE | 80 | Specifies the percentage of physical CPU resources used by the supervisor role instance (including startup and management worker processes) on each node. Adjust the parameter value based on service volume requirements of the node on which the supervisor exists, to optimize CPU usage. |
JVM Tuning¶
If an application must occupy more memory resources to process a large volume of data and the size of worker memory is greater than 2 GB, the G1 garbage collection algorithm is recommended.
Parameter | Default Value | Scenario |
---|---|---|
WORKER_GC_OPTS | -Xms1G -Xmx1G -XX:+UseG1GC -XX:+PrintGCDetails -Xloggc:artifacts/gc.log -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1M -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=artifacts/heapdump | If an application must occupy more memory resources to process a large volume of data and the size of worker memory is greater than 2 GB, the G1 garbage collection algorithm is recommended. In this case, change the parameter value to -Xms2G -Xmx5G -XX:+UseG1GC. |