Priority-based Scheduling¶
A pod priority indicates the importance of a pod relative to other pods. Volcano supports pod PriorityClasses in Kubernetes. After PriorityClasses are configured, the scheduler preferentially schedules high-priority pods.
Prerequisites¶
A cluster of v1.19 or later is available. For details, see Creating a CCE Standard/Turbo Cluster.
The Volcano add-on has been installed. For details, see Volcano Scheduler.
Overview¶
The services running in a cluster are diversified, including core services, non-core services, online services, and offline services. You can configure priorities for different services based on service importance and SLA requirements. For example, configure a high priority for core services and online services so that such services preferentially obtain cluster resources.
Table 1 lists the priority-based scheduling supported by CCE clusters.
Scheduling Type | Description | Scheduler |
---|---|---|
Priority-based scheduling | The scheduler preferentially guarantees the running of high-priority pods, but will not evict low-priority pods that are running. Priority-based scheduling is enabled by default and cannot be disabled. | kube-scheduler or Volcano scheduler |
Configuring Priority-based Scheduling Policies¶
Log in to the CCE console.
Click the cluster name to access the cluster console. Choose Settings in the navigation pane. In the right pane, click the Scheduling tab.
In the Business priority scheduling area, configure priority-based scheduling.
Scheduling based on priority: The scheduler preferentially guarantees the running of high-priority pods, but will not evict low-priority pods that are running. Priority-based scheduling is enabled by default and cannot be disabled.
After the configuration, you can use PriorityClasses to schedule the pods of workloads or Volcano jobs based priorities.
Create one or more PriorityClasses.
apiVersion: scheduling.k8s.io/v1 kind: PriorityClass metadata: name: high-priority value: 1000000 globalDefault: false description: ""
Create a workload or Volcano job and specify its PriorityClass name.
Workload
apiVersion: apps/v1 kind: Deployment metadata: name: high-test labels: app: high-test spec: replicas: 5 selector: matchLabels: app: test template: metadata: labels: app: test spec: priorityClassName: high-priority schedulerName: volcano containers: - name: test image: busybox imagePullPolicy: IfNotPresent command: ['sh', '-c', 'echo "Hello, Kubernetes!" && sleep 3600'] resources: requests: cpu: 500m limits: cpu: 500m
Volcano job
apiVersion: batch.volcano.sh/v1alpha1 kind: Job metadata: name: vcjob spec: schedulerName: volcano minAvailable: 4 priorityClassName: high-priority tasks: - replicas: 4 name: "test" template: spec: containers: - image: alpine command: ["/bin/sh", "-c", "sleep 1000"] imagePullPolicy: IfNotPresent name: running resources: requests: cpu: "1" restartPolicy: OnFailure
Example of Priority-based Scheduling¶
For example, there are two idle nodes and several workloads with three priorities (high-priority, medium-priority, and low-priority). Run the high-priority workload to exhaust all cluster resources, and issue the medium-priority and low-priority workloads. Then, the two types of workloads are pending due to insufficient resources. When the high-priority workload ends, the pods of the medium-priority workload will be scheduled ahead of the pods of the low-priority workload according to the priority-based scheduling setting.
Add three PriorityClasses (high-priority, med-priority, and low-priority) in priority.yaml.
Example configuration of priority.yaml:
apiVersion: scheduling.k8s.io/v1 kind: PriorityClass metadata: name: high-priority value: 100 globalDefault: false description: "This priority class should be used for volcano job only." --- apiVersion: scheduling.k8s.io/v1 kind: PriorityClass metadata: name: med-priority value: 50 globalDefault: false description: "This priority class should be used for volcano job only." --- apiVersion: scheduling.k8s.io/v1 kind: PriorityClass metadata: name: low-priority value: 10 globalDefault: false description: "This priority class should be used for volcano job only."
Create PriorityClasses.
kubectl apply -f priority.yaml
Check PriorityClasses.
kubectl get PriorityClass
Command output:
NAME VALUE GLOBAL-DEFAULT AGE high-priority 100 false 97s low-priority 10 false 97s med-priority 50 false 97s system-cluster-critical 2000000000 false 6d6h system-node-critical 2000001000 false 6d6h
Create a high-priority workload named high-priority-job to exhaust all cluster resources.
high-priority-job.yaml
apiVersion: batch.volcano.sh/v1alpha1 kind: Job metadata: name: priority-high spec: schedulerName: volcano minAvailable: 4 priorityClassName: high-priority tasks: - replicas: 4 name: "test" template: spec: containers: - image: alpine command: ["/bin/sh", "-c", "sleep 1000"] imagePullPolicy: IfNotPresent name: running resources: requests: cpu: "1" restartPolicy: OnFailure
Run the following command to issue the job:
kubectl apply -f high_priority_job.yaml
Run the kubectl get pod command to check pod statuses:
NAME READY STATUS RESTARTS AGE priority-high-test-0 1/1 Running 0 3s priority-high-test-1 1/1 Running 0 3s priority-high-test-2 1/1 Running 0 3s priority-high-test-3 1/1 Running 0 3s
The command output shows that all cluster resources have been used up.
Create a medium-priority workload med-priority-job and a low-priority workload low-priority-job.
med-priority-job.yaml
apiVersion: batch.volcano.sh/v1alpha1 kind: Job metadata: name: priority-medium spec: schedulerName: volcano minAvailable: 4 priorityClassName: med-priority tasks: - replicas: 4 name: "test" template: spec: containers: - image: alpine command: ["/bin/sh", "-c", "sleep 1000"] imagePullPolicy: IfNotPresent name: running resources: requests: cpu: "1" restartPolicy: OnFailure
low-priority-job.yaml
apiVersion: batch.volcano.sh/v1alpha1 kind: Job metadata: name: priority-low spec: schedulerName: volcano minAvailable: 4 priorityClassName: low-priority tasks: - replicas: 4 name: "test" template: spec: containers: - image: alpine command: ["/bin/sh", "-c", "sleep 1000"] imagePullPolicy: IfNotPresent name: running resources: requests: cpu: "1" restartPolicy: OnFailure
Run the following commands to issue the jobs:
kubectl apply -f med_priority_job.yaml kubectl apply -f low_priority_job.yaml
Run the kubectl get pod command to check the statuses of the pods for the newly created workloads. The command output shows that the pods are pending due to insufficient resources:
NAME READY STATUS RESTARTS AGE priority-high-test-0 1/1 Running 0 3m29s priority-high-test-1 1/1 Running 0 3m29s priority-high-test-2 1/1 Running 0 3m29s priority-high-test-3 1/1 Running 0 3m29s priority-low-test-0 0/1 Pending 0 2m26s priority-low-test-1 0/1 Pending 0 2m26s priority-low-test-2 0/1 Pending 0 2m26s priority-low-test-3 0/1 Pending 0 2m26s priority-medium-test-0 0/1 Pending 0 2m36s priority-medium-test-1 0/1 Pending 0 2m36s priority-medium-test-2 0/1 Pending 0 2m36s priority-medium-test-3 0/1 Pending 0 2m36s
Delete the high_priority_job workload to release resources and check whether the pods of the med-priority-job workload will be preferentially scheduled.
Run the kubectl delete -f high_priority_job.yaml command to release cluster resources and check pod scheduling.
NAME READY STATUS RESTARTS AGE priority-low-test-0 0/1 Pending 0 5m18s priority-low-test-1 0/1 Pending 0 5m18s priority-low-test-2 0/1 Pending 0 5m18s priority-low-test-3 0/1 Pending 0 5m18s priority-medium-test-0 1/1 Running 0 5m28s priority-medium-test-1 1/1 Running 0 5m28s priority-medium-test-2 1/1 Running 0 5m28s priority-medium-test-3 1/1 Running 0 5m28s