ALM-24005 Data Transmission by Flume Is Abnormal

Description

The alarm module monitors the capacity of Flume channels. This alarm is generated when the length of time for which a channel stays full, or the number of times that a source fails to send data to the channel, exceeds the threshold.

Users can set the threshold as required by modifying the channelfullcount parameter.

This alarm is cleared when the channel space is released.
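The trigger condition described above can be illustrated with a minimal sketch. It models only the failure-count case, and it assumes the count is compared strictly against the value configured through channelfullcount. The function name `channel_full_alarm` and its inputs are hypothetical and are not part of MRS or Flume.

```shell
# Illustrative only: returns 0 when the alarm condition is met, 1 otherwise.
channel_full_alarm() {
  observed_count=$1   # hypothetical input: failed attempts to put data into the channel
  threshold=$2        # value configured through the channelfullcount parameter
  # "Exceeds the threshold" is read here as strictly greater than.
  [ "$observed_count" -gt "$threshold" ]
}

if channel_full_alarm 12 10; then
  echo "alarm condition met"
else
  echo "within threshold"
fi
```

Raising the threshold in this sketch makes the condition harder to meet, which mirrors the effect of increasing channelfullcount.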

Attribute

Alarm ID    Alarm Severity    Automatically Cleared
24005       Major             Yes

Parameters

Parameter        Description
ServiceName      Specifies the service for which the alarm is generated.
HostName         Specifies the host for which the alarm is generated.
ComponentType    Specifies the component type for which the alarm is generated.
ComponentName    Specifies the component name for which the alarm is generated.

Impact on the System

If the usage of the Flume channel continues to grow, the data transmission time increases. When the usage reaches 100%, the Flume agent process is suspended.
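As a rough illustration of what channel usage means here, the sketch below computes usage as the number of events currently held in the channel divided by the channel capacity. It assumes you already have both numbers, for example from channel counters such as ChannelSize and ChannelCapacity if metrics reporting is enabled in your deployment. The function name `channel_usage_percent` is hypothetical.

```shell
# Illustrative only: print channel usage as an integer percentage.
channel_usage_percent() {
  size=$1       # events currently held in the channel
  capacity=$2   # maximum number of events the channel can hold
  if [ "$capacity" -le 0 ]; then
    echo "capacity must be positive" >&2
    return 1
  fi
  # Integer arithmetic, so the result is rounded down.
  echo $(( size * 100 / capacity ))
}

channel_usage_percent 7500 10000   # prints 75
```

A value that keeps climbing toward 100 indicates that the sink is draining the channel more slowly than the source fills it.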

Possible Causes

  • The Flume sink is faulty.
  • The network is faulty.

Procedure

  1. Check whether the Flume sink is normal.

    a. Check whether the Flume sink is the HDFS type.
      • If yes, go to 1.b.
      • If no, go to 1.c.
    b. On MRS Manager, check whether alarm ALM-14000 HDFS Service Unavailable is reported and whether the HDFS service is stopped.
      • If the alarm is reported, clear it according to the handling suggestions of ALM-14000 HDFS Service Unavailable; if the HDFS service is stopped, start it. Then go to 1.g.
      • If the alarm is not reported and the HDFS service is running properly, go to 1.g.
    c. Check whether the Flume sink is the HBase type.
      • If yes, go to 1.d.
      • If no, go to 1.e.
    d. On MRS Manager, check whether alarm ALM-19000 HBase Service Unavailable is reported and whether the HBase service is stopped.
      • If the alarm is reported, clear it according to the handling suggestions of ALM-19000 HBase Service Unavailable; if the HBase service is stopped, start it. Then go to 1.g.
      • If the alarm is not reported and the HBase service is running properly, go to 1.g.
    e. Check whether the Flume sink is the Kafka type.
      • If yes, go to 1.f.
      • If no, go to 1.g.
    f. On MRS Manager, check whether alarm ALM-38000 Kafka Service Unavailable is reported and whether the Kafka service is stopped.
      • If the alarm is reported, clear it according to the handling suggestions of ALM-38000 Kafka Service Unavailable; if the Kafka service is stopped, start it. Then go to 1.g.
      • If the alarm is not reported and the Kafka service is running properly, go to 1.g.
    g. On MRS Manager, choose Service > Flume > Instance.
    h. Click the Flume instance of the faulty node and check whether the Sink Speed Metrics value is 0.
      • If yes, go to 2.a.
      • If no, no further action is required.

  2. Check the status of the network between the Flume sink and the faulty node.

    a. Check whether the Flume sink is the Avro type.
      • If yes, go to 2.b.
      • If no, go to 3.a.
    b. Log in to the host where the faulty node resides. Run the following command to switch to user root:

      sudo su - root

    c. Run the ping command against the Flume sink IP address to check whether the Flume sink can be pinged.
      • If yes, go to 3.a.
      • If no, go to 2.d.
    d. Contact the network administrator to repair the network.
    e. Wait for a while and check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 3.a.

  3. Collect fault information.

    a. On MRS Manager, choose System > Export Log.
    b. Contact technical support engineers for help. For details, see technical support.
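The two decision points in the procedure that lend themselves to scripting can be sketched as follows. This is a minimal illustration, not an MRS tool. The function names `dependency_alarm`, `sink_reachable`, and `next_step_after_ping`, and the `PING_CMD` override, are hypothetical. The first function maps a sink type to the dependent-service alarm that step 1 tells you to check on MRS Manager. The second pair runs the ping check from step 2 and prints the step the procedure sends you to next.

```shell
# Step 1: map a Flume sink type to the alarm to look for on MRS Manager.
# Returns 1 and prints nothing for sink types that step 1 does not cover.
dependency_alarm() {
  # Lowercase the argument so "HDFS" and "hdfs" are treated alike.
  sink_type=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$sink_type" in
    hdfs)  echo "ALM-14000 HDFS Service Unavailable" ;;
    hbase) echo "ALM-19000 HBase Service Unavailable" ;;
    kafka) echo "ALM-38000 Kafka Service Unavailable" ;;
    *)     return 1 ;;
  esac
}

# Step 2: check whether the sink host answers ping.
# PING_CMD is a hypothetical override so the logic can be exercised
# without network access; it defaults to the system ping command.
sink_reachable() {
  "${PING_CMD:-ping}" -c 3 "$1" >/dev/null 2>&1
}

# Print the step that the procedure directs you to after the ping check.
next_step_after_ping() {
  if sink_reachable "$1"; then
    echo "3.a"   # network is fine: collect fault information
  else
    echo "2.d"   # network problem: contact the network administrator
  fi
}

dependency_alarm "hdfs" || echo "no dependent service alarm to check"
```

On the faulty node you might then run `next_step_after_ping 192.0.2.10`, where 192.0.2.10 is a placeholder for the actual Flume sink IP address. The alarm status itself still has to be checked on MRS Manager; the sketch only tells you which alarm to look for.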

Related Information

N/A