ALM-28001 Spark Service Unavailable¶
Description¶
The system checks the Spark service status every 30 seconds. This alarm is generated when the Spark service is unavailable.
This alarm is cleared when the Spark service recovers.
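The raise-and-clear behavior described above can be pictured as a small state machine. The following is only an illustrative sketch, not the MRS implementation: the `AlarmState` class and `apply_check` function are hypothetical names, and the availability results fed into it are made-up sample data. Only the 30-second interval and alarm ID 28001 come from this page.

```python
from dataclasses import dataclass

# Check interval stated in the description; shown for reference only.
CHECK_INTERVAL_SECONDS = 30


@dataclass
class AlarmState:
    """Hypothetical in-memory record of whether ALM-28001 is active."""
    alarm_id: int = 28001
    active: bool = False


def apply_check(state: AlarmState, service_available: bool) -> str:
    """Apply the result of one periodic check and return the resulting event."""
    if not service_available and not state.active:
        state.active = True       # service went down: generate the alarm
        return "raised"
    if service_available and state.active:
        state.active = False      # service recovered: clear the alarm automatically
        return "cleared"
    return "unchanged"            # no state transition on this check


# Sample sequence of check results (assumed data, one entry per 30-second check).
state = AlarmState()
events = [apply_check(state, ok) for ok in [True, False, False, True]]
print(events)  # ['unchanged', 'raised', 'unchanged', 'cleared']
```

The alarm is raised once on the first failed check and stays active until a later check succeeds, which matches the Auto Clear attribute of this alarm.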
Attribute¶
Alarm ID | Alarm Severity | Auto Clear |
---|---|---|
28001 | Critical | Yes |
Parameters¶
Parameter | Description |
---|---|
ServiceName | Specifies the service for which the alarm is generated. |
RoleName | Specifies the role for which the alarm is generated. |
HostName | Specifies the host for which the alarm is generated. |
Impact on the System¶
Spark tasks submitted by users cannot be executed.
Possible Causes¶
The KrbServer service is abnormal.
The LdapServer service is abnormal.
The ZooKeeper service is abnormal.
The HDFS service is abnormal.
The Yarn service is abnormal.
The corresponding Hive service is abnormal.
Procedure¶
1. Check whether service unavailability alarms exist for the services that Spark depends on.
   a. Go to the MRS cluster details page and choose Alarms.
   b. Check whether any of the following alarms exist in the alarm list:
      - ALM-25500 KrbServer Service Unavailable
      - ALM-25000 LdapServer Service Unavailable
      - ALM-13000 ZooKeeper Service Unavailable
      - ALM-14000 HDFS Service Unavailable
      - ALM-18000 Yarn Service Unavailable
      - ALM-16004 Hive Service Unavailable
   c. Handle any alarms found based on the troubleshooting methods provided in the alarm help.
   d. After those alarms are cleared, wait a few minutes and check whether this alarm (ALM-28001 Spark Service Unavailable) is cleared.
      - If yes, no further action is required.
      - If no, go to 2.
2. Collect fault information.
   a. On MRS Manager, choose System > Export Log.
   b. Contact technical support engineers for help. For details, see technical support.
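The dependency check in the procedure can be sketched as a small lookup. This is a minimal illustration that assumes you already have the IDs of the currently active alarms as plain strings, for example copied by hand from the Alarms page. It does not call any MRS API, and `find_dependency_alarms` is a hypothetical helper name. The alarm IDs and names are the ones listed in this procedure.

```python
# Dependency alarms listed in the procedure, in the same order.
DEPENDENCY_ALARMS = {
    "ALM-25500": "KrbServer Service Unavailable",
    "ALM-25000": "LdapServer Service Unavailable",
    "ALM-13000": "ZooKeeper Service Unavailable",
    "ALM-14000": "HDFS Service Unavailable",
    "ALM-18000": "Yarn Service Unavailable",
    "ALM-16004": "Hive Service Unavailable",
}


def find_dependency_alarms(active_alarm_ids):
    """Return (id, name) pairs for dependency alarms present in the active list."""
    # Normalize case and whitespace so hand-copied IDs still match.
    active = {alarm_id.strip().upper() for alarm_id in active_alarm_ids}
    return [
        (alarm_id, name)
        for alarm_id, name in DEPENDENCY_ALARMS.items()
        if alarm_id in active
    ]


# Assumed sample input: the Spark alarm plus two dependency alarms.
found = find_dependency_alarms(["ALM-28001", "ALM-14000", " alm-13000 "])
for alarm_id, name in found:
    print(alarm_id, name)
```

With this sample input, the ZooKeeper and HDFS alarms are reported, so those would be handled first before rechecking ALM-28001. An empty result means none of the listed dependency alarms is active.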
Reference¶
None