Configuring an Auto Scaling Rule

Background

In big data applications, especially real-time data analysis and processing, the number of cluster nodes needs to be adjusted dynamically as the data volume changes so that the cluster always has the resources it requires. The auto scaling function of MRS automatically scales the task nodes of a cluster to match cluster loads. If the data volume changes periodically, you can also configure auto scaling so that the number of task nodes is adjusted within a fixed time range, before the data volume changes.

  • Auto scaling rules: You can increase or decrease task nodes based on real-time cluster loads. Auto scaling will be triggered with a certain delay when the data volume changes.

  • Resource plans: You can set the number of task nodes by time range. If the data volume changes periodically, you can create resource plans to resize the cluster before the data volume changes, thereby avoiding delays in increasing or decreasing resources.

You can configure auto scaling rules, resource plans, or both to trigger auto scaling. Configuring both improves cluster node scalability and helps the cluster cope with occasional unexpected peaks in data volume.

In some service scenarios, resources need to be reallocated or service logic needs to be modified after cluster scale-out or scale-in. If you manually scale out or scale in a cluster, you can log in to cluster nodes to reallocate resources or modify service logic. If you use auto scaling, MRS enables you to customize automation scripts for resource reallocation and service logic modification. Automation scripts can be executed before and after auto scaling and adapt automatically to service load changes, which eliminates manual operations. Automation scripts are also fully customizable and can be executed at different points in the scaling process, which makes auto scaling more flexible.

  • Auto scaling rules:

    • You can set a maximum of five scale-out rules and five scale-in rules for a cluster.

    • The system evaluates scale-out rules first and then scale-in rules, each in the order in which you configured them. Place important rules first so that they take precedence over the others. This prevents repeated triggering when a scale-out or scale-in does not achieve the expected effect.

    • Supported comparison operators are greater than, greater than or equal to, less than, and less than or equal to.

    • A cluster scale-out or scale-in is triggered only after the configured metric threshold has been met for 5n consecutive minutes (n defaults to 1).

    • Each scale-out or scale-in is followed by a cooldown period. The cooldown period must be greater than 0 and is 20 minutes by default.

    • Each cluster scale-out or scale-in adds or removes at least 1 node and at most 100 nodes.

  • Resource plans (setting the number of Task nodes by time range):

    • You can specify a Task node range (minimum number to maximum number) for a time range. If the number of Task nodes falls outside the Task node range of a resource plan, the system triggers a cluster scale-out or scale-in.

    • You can set a maximum of five resource plans for a cluster.

    • Resource plans run on a daily cycle. The start time and end time can be any time between 00:00 and 23:59. The start time must be at least 30 minutes earlier than the end time. The time ranges of different resource plans cannot overlap.

    • After a resource plan triggers a cluster scale-out or scale-in, there is a 10-minute cooldown period. Auto scaling is not triggered again during the cooldown period.

    • When a resource plan is enabled, the number of Task nodes is limited to the default node range you configured during all time periods outside the one configured in the resource plan.

    • If the resource plan is not enabled, the number of Task nodes is not limited to the default node range.

  • Automation scripts:

    • You can set an automation script so that it can automatically run on cluster nodes when auto scaling is triggered.

    • You can set a maximum of 10 automation scripts for a cluster.

    • You can specify an automation script to be executed on one or more types of nodes.

    • Automation scripts can be executed before or after scale-out or scale-in.

    • Before using automation scripts, upload them to a cluster VM or OBS file system in the same region as the cluster. The automation scripts uploaded to the cluster VM can be executed only on the existing nodes. If you want to make the automation scripts run on the new nodes, upload them to the OBS file system.
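The resource plan constraints listed above (at most five plans, a start time at least 30 minutes before the end time, and no overlapping time ranges) can be checked before you enter the plans in the console. The following is a minimal sketch, not part of MRS: the tuple layout and the function names are hypothetical, and it assumes that two plans may share an endpoint (one ends at 13:00 and the next starts at 13:00) without counting as overlapping.

```python
from datetime import time

MAX_PLANS = 5          # at most five resource plans per cluster
MIN_DURATION_MIN = 30  # start must be at least 30 minutes before end


def minutes(t: time) -> int:
    """Return the number of minutes since midnight."""
    return t.hour * 60 + t.minute


def validate_plans(plans):
    """Return a list of problems found in (start, end, min_nodes, max_nodes) tuples."""
    problems = []
    if len(plans) > MAX_PLANS:
        problems.append(f"too many plans: {len(plans)} > {MAX_PLANS}")
    for start, end, _lo, _hi in plans:
        if minutes(end) - minutes(start) < MIN_DURATION_MIN:
            problems.append(
                f"{start:%H:%M}-{end:%H:%M}: start is not {MIN_DURATION_MIN} minutes before end"
            )
    # Sort by start time so that only neighbouring plans need to be compared.
    ordered = sorted(plans, key=lambda p: minutes(p[0]))
    for a, b in zip(ordered, ordered[1:]):
        # Assumption: a shared endpoint is not treated as an overlap.
        if minutes(b[0]) < minutes(a[1]):
            problems.append(
                f"{a[0]:%H:%M}-{a[1]:%H:%M} overlaps {b[0]:%H:%M}-{b[1]:%H:%M}"
            )
    return problems


if __name__ == "__main__":
    for problem in validate_plans([
        (time(7, 0), time(7, 20), 5, 5),   # too short
        (time(7, 10), time(9, 0), 3, 4),   # overlaps the first plan
    ]):
        print(problem)
```

An empty list means the plans satisfy the three constraints checked here. The console may enforce additional rules that this sketch does not cover.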

Accessing the Auto Scaling Configuration Page

You can configure an auto scaling rule on the Set Advanced Options page during cluster creation or on the Nodes page after the cluster is created.

Configuring an auto scaling rule when creating a cluster

  1. Log in to the MRS console.

  2. When you create a cluster containing task nodes, configure the cluster software and hardware information by referring to Creating a Custom Cluster. Then, on the Set Advanced Options page, enable Analysis Task and configure or modify auto scaling rules and resource plans.

    To configure the auto scaling rules, see Scenario 1, Scenario 2, and Scenario 3 below.

Configuring an auto scaling rule for an existing cluster

  1. Log in to the MRS console.

  2. In the navigation pane on the left, choose Clusters > Active Clusters and click the name of a running cluster to go to the cluster details page.

  3. Click the Nodes tab and then Auto Scaling in the Operation column of the task node group. The Auto Scaling page is displayed.

    Note

    • If no task node exists in the cluster, click Configure Task Node to add one and then configure the auto scaling rules.

    • For MRS 3.x or later, Configure Task Node is available only for analysis clusters, streaming clusters, and hybrid clusters. For details about how to add a task node for a custom cluster of MRS 3.x or later, see Adding a Task Node.

  4. Enable Auto Scaling and configure or modify auto scaling rules and resource plans.

    To configure the auto scaling rules, see Scenario 1, Scenario 2, and Scenario 3 below.

Scenario 1: Using Auto Scaling Rules Alone

The following is an example scenario:

The number of nodes needs to be dynamically adjusted based on the Yarn resource usage. When the memory available for Yarn is less than 20% of the total memory, five nodes need to be added. When the memory available for Yarn is greater than 70% of the total memory, five nodes need to be removed. The number of nodes in a task node group ranges from 1 to 10.
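To make the example concrete, the decision it describes can be modelled in a few lines. This is only an illustrative sketch, not how MRS is implemented: the Rule class and decide function are hypothetical names, metric samples are assumed to arrive once every 5 minutes, a result outside the node range is assumed to be clamped to the nearest limit, and the cooldown period is not modelled.

```python
import operator
from dataclasses import dataclass

# The four comparison operators supported by auto scaling rules.
OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


@dataclass
class Rule:
    name: str
    op: str            # key into OPS
    threshold: float   # percentage of Yarn memory available
    delta: int         # nodes to add (positive) or remove (negative)
    periods: int = 1   # n: the condition must hold for 5n consecutive minutes


def decide(rules, samples, current, lo, hi):
    """Return the new task node count for 5-minute metric samples (newest last)."""
    for rule in rules:  # rules are checked in the configured order
        recent = samples[-rule.periods:]
        sustained = len(recent) == rule.periods and all(
            OPS[rule.op](s, rule.threshold) for s in recent
        )
        if sustained:
            # Assumption: the result is clamped to the configured node range.
            return max(lo, min(hi, current + rule.delta))
    return current


rules = [
    Rule("scale-out", "<", 20, +5),  # available memory below 20%: add five nodes
    Rule("scale-in", ">", 70, -5),   # available memory above 70%: remove five nodes
]

print(decide(rules, [15.0], current=3, lo=1, hi=10))  # 8
print(decide(rules, [80.0], current=8, lo=1, hi=10))  # 3
```

With a node range of 1 to 10, a scale-out from 8 nodes stops at 10 and a scale-in from 3 nodes stops at 1 under the clamping assumption above.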

  1. Go to the Auto Scaling page to configure auto scaling rules.

    • Configure the Default Range parameter.

      Enter a task node range, in which auto scaling is performed. This constraint applies to all scale-in and scale-out rules. The maximum value range allowed is 0 to 500.

      The value range in this example is 1 to 10.

    • Configure an auto scaling rule.

      To enable Auto Scaling, you must configure a scale-out or scale-in rule.

      1. Select Scale-Out or Scale-In.

      2. Click Add Rule.

      3. Configure the Rule Name, If, Last for, Add, and Cooldown Period parameters.

      4. Click OK.

        You can view, edit, or delete the rules you configured in the Scale-out or Scale-in area on the Auto Scaling page. You can click Add Rule to configure multiple rules.

  2. (Optional) Configure automation scripts.

    Set Advanced Settings to Configure and click Create, or click Add Automation Script to go to the Automation Script page.

    MRS 3.x does not support this operation.

    1. Set the following parameters: Name, Script Path, Execution Node, Parameter, Executed, and Action upon Failure. For details about the parameters, see Table 4.

    2. Click OK to save the automation script configurations.

  3. Click OK.

    Note

    If you want to configure an auto scaling rule for an existing cluster, select I agree to authorize MRS to scale out or in nodes based on the above rule.

Scenario 2: Using Resource Plans Alone

If the data volume changes regularly every day and you want to scale out or in a cluster before the data volume changes, you can create resource plans to adjust the number of Task nodes as planned in the specified time range.

Example:

A real-time processing service sees a sharp increase in data volume from 7:00 to 13:00 every day. Assume that an MRS streaming cluster is used to process the service data. Five task nodes are required from 7:00 to 13:00, while only two are required at other times.
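The effect of this example can be pictured as a lookup from the time of day to a node range. The following sketch is illustrative only: the names are hypothetical, and it assumes that the start time of a plan is inclusive and the end time is exclusive, which this document does not specify.

```python
from datetime import time

DEFAULT_RANGE = (2, 2)                        # two task nodes outside the plan
PLANS = [(time(7, 0), time(13, 0), (5, 5))]   # five task nodes from 07:00 to 13:00


def node_range_at(now, plans=PLANS, default=DEFAULT_RANGE):
    """Return the (min, max) task node range that applies at the given time."""
    for start, end, node_range in plans:
        # Assumption: start is inclusive, end is exclusive.
        if start <= now < end:
            return node_range
    return default


print(node_range_at(time(9, 30)))   # (5, 5)
print(node_range_at(time(18, 0)))   # (2, 2)
```

Because both ranges have equal lower and upper limits, the number of task nodes is fixed in each period rather than allowed to vary.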

  1. Go to the Auto Scaling page to configure a resource plan.

    1. Configure Default Range. For example, set Default Range to 2-2 so that the number of Task nodes is fixed at 2 outside the time range specified in the resource plan.

    2. Click Configure Node Range for Specific Time Range under Default Range or Add Resource Plan.

    3. Configure Time Range and Node Range.

      For example, set Time Range to 07:00-13:00, and Node Range to 5-5. This indicates that the number of task nodes is fixed at 5 from 07:00 to 13:00.

      For details about parameter configurations, see Table 3. You can click Configure Node Range for Specific Time Range to configure multiple resource plans.

      Note

      • If you do not set Node Range, its default value will be used.

      • If you set both Node Range and Time Range, the node range you set is used during that time range, and the default node range is used at all other times.

  2. (Optional) Configure automation scripts.

    Set Advanced Settings to Configure and click Create, or click Add Automation Script to go to the Automation Script page.

    MRS 3.x does not support this operation.

    1. Set the following parameters: Name, Script Path, Execution Node, Parameter, Executed, and Action upon Failure. For details about the parameters, see Table 4.

    2. Click OK to save the automation script configurations.

  3. Click OK.

    Note

    If you want to configure an auto scaling rule for an existing cluster, select I agree to authorize MRS to scale out or in nodes based on the above rule.

Scenario 3: Using Both Auto Scaling Rules and Resource Plans

If the data volume is unstable and may fluctuate beyond what is expected, a fixed Task node range cannot guarantee that the requirements of some service scenarios are met. In this case, adjust the number of Task nodes based on both real-time loads and resource plans.

The following is an example scenario:

A real-time processing service sees an unstable increase in data volume from 7:00 to 13:00 every day. For example, 5 to 8 task nodes are required from 7:00 to 13:00, and 2 to 4 are required outside this period. You can therefore set auto scaling rules on top of a resource plan. When the data volume exceeds the expected value, the number of Task nodes is adjusted as resource loads change, without exceeding the node range specified in the resource plan. When a resource plan is triggered, the number of nodes is brought into the specified node range with the smallest possible change. That is, nodes are added only up to the lower limit of the range or removed only down to the upper limit.
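The interaction between the two mechanisms in this example can be sketched as follows. This is a simplified model under stated assumptions, not the MRS implementation: the function names are hypothetical, the rule step of two nodes is made up for illustration, and "minimum effect" is assumed to mean that the node count moves only as far as the nearest limit of the range that has just taken effect.

```python
def clamp(count, lo, hi):
    """Move count into [lo, hi] by the smallest possible change."""
    return max(lo, min(hi, count))


def apply_plan(count, node_range):
    """A node range takes effect: snap the count to the nearest limit if needed."""
    return clamp(count, *node_range)


def apply_rule(count, delta, node_range):
    """An auto scaling rule fires: change the count without leaving the range."""
    return clamp(count + delta, *node_range)


PLAN_RANGE = (5, 8)      # 07:00 to 13:00
DEFAULT_RANGE = (2, 4)   # rest of the day

nodes = 3
nodes = apply_plan(nodes, PLAN_RANGE)       # 07:00, plan starts: 3 -> 5
nodes = apply_rule(nodes, +2, PLAN_RANGE)   # load rises: 5 -> 7
nodes = apply_rule(nodes, +2, PLAN_RANGE)   # load rises again, capped: 7 -> 8
nodes = apply_plan(nodes, DEFAULT_RANGE)    # 13:00, default range applies: 8 -> 4
print(nodes)  # 4
```

The resource plan sets the limits for each period, and the auto scaling rules move the node count between those limits as the load changes.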

  1. Go to the Auto Scaling page to configure auto scaling rules.

    • Default Range

      Enter a task node range, in which auto scaling is performed. This constraint applies to all scale-in and scale-out rules.

      For example, this parameter is set to 2-4 in this scenario.

    • Auto Scaling

      To enable Auto Scaling, you must configure a scale-out or scale-in rule.

      1. Select Scale-Out or Scale-In.

      2. Click Add Rule. The Add Rule page is displayed.

      3. Configure the Rule Name, If, Last for, Add, and Cooldown Period parameters.

      4. Click OK.

        You can view, edit, or delete the rules you configured in the Scale-out or Scale-in area on the Auto Scaling page.

  2. Configure a resource plan.

    1. Click Configure Node Range for Specific Time Range under Default Range or Add Resource Plan.

    2. Configure Time Range and Node Range.

      For example, Time Range is set to 07:00-13:00 and Node Range to 5-8.

      For details about parameter configurations, see Table 3. You can click Configure Node Range for Specific Time Range or Add Resource Plan to configure multiple resource plans.

      Note

      • If you do not set Node Range, its default value will be used.

      • If you set both Node Range and Time Range, the node range you set is used during that time range, and the default node range is used at all other times.

  3. (Optional) Configure automation scripts.

    Set Advanced Settings to Configure and click Create, or click Add Automation Script to go to the Automation Script page.

    MRS 3.x does not support this operation.

    1. Set the following parameters: Name, Script Path, Execution Node, Parameter, Executed, and Action upon Failure. For details about the parameters, see Table 4.

    2. Click OK to save the automation script configurations.

  4. Click OK.

    Note

    If you want to configure an auto scaling rule for an existing cluster, select I agree to authorize MRS to scale out or in nodes based on the above rule.