ALM-45177 Success Rate of Calling OBS Data Read APIs Is Lower than the Threshold

Description

Every 30 seconds, the system checks whether the success rate of calling APIs for reading OBS data is lower than the threshold. This alarm is generated when the success rate is lower than the threshold.

This alarm is automatically cleared when the success rate of calling APIs for reading OBS data is greater than the threshold.
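
The raise-and-clear behavior described above can be sketched as follows. This is only an illustration, not the product's implementation: the function names, the 99.0 threshold, and the call counts are hypothetical, and the handling of a period with no calls or of a rate exactly equal to the threshold is an assumption, since the text only says "lower than" and "greater than".

```python
def success_rate(succeeded: int, total: int) -> float:
    """Return the success percentage of API calls in one check period."""
    if total == 0:
        # Assumption: a period with no calls counts as fully successful.
        return 100.0
    return 100.0 * succeeded / total


def next_alarm_state(active: bool, rate: float, threshold: float) -> bool:
    """Return whether the alarm is active after one periodic check."""
    if rate < threshold:
        return True    # alarm is generated (or stays active)
    if rate > threshold:
        return False   # alarm is cleared automatically
    return active      # assumption: a rate equal to the threshold changes nothing


THRESHOLD = 99.0  # hypothetical threshold, in percent
active = False
# Hypothetical (succeeded, total) call counts for three consecutive checks.
for succeeded, total in [(1000, 1000), (950, 1000), (998, 1000)]:
    active = next_alarm_state(active, success_rate(succeeded, total), THRESHOLD)
    print(f"rate={success_rate(succeeded, total):.1f}% alarm_active={active}")
```

In this run the alarm would be raised on the second check (95.0%) and cleared again on the third (99.8%).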

Attribute

Alarm ID    Alarm Severity    Auto Clear
45177       Minor             Yes

Parameters

Name               Meaning
Source             Specifies the cluster for which the alarm is generated.
ServiceName        Specifies the service for which the alarm is generated.
RoleName           Specifies the role for which the alarm is generated.
HostName           Specifies the host for which the alarm is generated.
Trigger Condition  Specifies the threshold for triggering the alarm.

Impact on the System

If the success rate of calling the OBS APIs for reading data is lower than the threshold, the upper-layer big data computing services may be affected. Specifically, some computing tasks may fail.

Possible Causes

An execution exception or severe timeout occurs on the OBS server.

Procedure

Check the success rate of OBS data read API calls.

  1. On the FusionInsight Manager homepage, choose O&M > Alarm > Alarms > Success Rate for Calling the OBS Data Read API Is Lower Than the Threshold, view the role name in Location, and check the instance IP address.

  2. Choose Cluster > Name of the desired cluster > Services > meta > Instance > meta (IP address of the instance for which the alarm is generated). Click the drop-down list in the upper right corner of the chart area and choose Customize. In the dialog box that is displayed, select Success percent of OBS data read operation interface calls from OBS data read operation, and click OK. Check whether the success rate of OBS data read API calls is lower than the threshold.

    • If yes, go to 3.

    • If no, go to 5.

  3. Choose Cluster > Name of the desired cluster > O&M > Alarm > Thresholds > meta > Success Rate for Calling the OBS Data Read API. Lower the threshold or increase the smoothing times as required.

  4. Check whether the alarm is cleared.

    • If yes, no further action is required.

    • If no, go to 5.
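
How the threshold and the smoothing times interact can be illustrated with a short sketch. It assumes that "smoothing times" means the number of consecutive checks that must fall below the threshold before the alarm is generated; that interpretation, the function name, and all sample values are assumptions made for illustration only.

```python
def alarm_raised(rates, threshold, smoothing_times):
    """Return True if `smoothing_times` consecutive samples are below `threshold`."""
    consecutive = 0
    for rate in rates:
        # Count consecutive violations; any healthy sample resets the count.
        consecutive = consecutive + 1 if rate < threshold else 0
        if consecutive >= smoothing_times:
            return True
    return False


# Hypothetical success rates (percent) from six consecutive checks.
samples = [99.5, 98.0, 99.6, 98.5, 98.2, 99.7]

print(alarm_raised(samples, threshold=99.0, smoothing_times=1))
print(alarm_raised(samples, threshold=99.0, smoothing_times=3))
print(alarm_raised(samples, threshold=97.0, smoothing_times=1))
```

With these samples, a single dip triggers the alarm when the smoothing times value is 1, whereas a value of 3 or a threshold of 97.0 does not trigger it. Under the stated assumption, short transient dips can therefore be filtered without hiding a sustained drop.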

Collect the fault information.

  1. On FusionInsight Manager, choose O&M. In the navigation pane on the left, choose Log > Download.

  2. In the Services area, select NodeAgent, NodeMetricAgent, OmmServer, and OmmAgent under OMS.

  3. Click the icon in the upper right corner, and set Start Date and End Date for log collection to 30 minutes before and 30 minutes after the alarm generation time, respectively. Then, click Download.

  4. Contact O&M personnel and provide the collected logs.
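
The log collection window in the steps above is simple date arithmetic. The following sketch shows one way to compute the Start Date and End Date values before entering them; the function name and the sample alarm time are hypothetical.

```python
from datetime import datetime, timedelta


def log_window(alarm_time: datetime, margin_minutes: int = 30):
    """Return (start, end) covering `margin_minutes` before and after the alarm."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


# Hypothetical alarm generation time, as shown in the alarm details.
alarm_time = datetime(2024, 1, 15, 10, 20)
start, end = log_window(alarm_time)
print("Start Date:", start.isoformat(sep=" "))
print("End Date:  ", end.isoformat(sep=" "))
```

For an alarm generated at 10:20, the window runs from 09:50 to 10:50 on the same day.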

Alarm Clearing

This alarm is automatically cleared after the fault is rectified.