Using DLI to Submit a Spark Jar Job¶
Scenario¶
DLI allows you to submit Spark jobs compiled as JAR files, which contain the code and dependency information needed to run the job. Such jobs are typically used for data processing tasks such as data query, analysis, and machine learning. Before a Spark Jar job can run, you must upload the package to OBS and submit it together with the data and job parameters.
This example introduces the basic process of submitting a Spark Jar job package through the DLI console. Because service requirements differ, the contents of the Jar package will vary. You are advised to refer to the sample code provided by DLI and adapt it to your actual business scenario.
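Because the Jar contents depend on your scenario, DLI does not prescribe a single program structure. As a point of reference, below is a minimal sketch in Scala modeled on Spark's classic SparkPi example; the package and object names (com.example.dli.SparkPi) are illustrative, not values fixed by DLI. Build it against the Spark version you will select when creating the job, with the Spark dependency marked as provided, and package the result into a Jar such as spark-examples.jar using Maven or sbt.

```scala
package com.example.dli

import scala.math.random
import org.apache.spark.sql.SparkSession

object SparkPi {
  // Entry point that DLI invokes; this fully qualified object name is what you
  // would enter as the job's main class when creating the Spark job.
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("Spark Pi").getOrCreate()
    // The number of partitions (slices) can be passed as a program argument.
    val slices = if (args.length > 0) args(0).toInt else 2
    val n = 100000L * slices // total number of random samples
    val count = spark.sparkContext
      .parallelize(1L to n, slices)
      .map { _ =>
        // Sample a point in the unit square and test whether it lies in the circle.
        val x = random * 2 - 1
        val y = random * 2 - 1
        if (x * x + y * y <= 1) 1 else 0
      }
      .reduce(_ + _)
    println(s"Pi is roughly ${4.0 * count / n}") // written to the driver log
    spark.stop()
  }
}
```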
Procedure¶
Table 1 describes the procedure for submitting a Spark Jar job using DLI.
Table 1 Procedure for submitting a Spark Jar job¶

Step | Description |
---|---|
Step 1: Upload Data to OBS | Prepare a Spark Jar job package and upload it to OBS. |
Step 2: Create an Elastic Resource Pool and Add Queues to the Pool | Create compute resources required for submitting the Spark Jar job. |
Step 3: Submit a Spark Job | Create a Spark Jar job to analyze data. |
Step 1: Upload Data to OBS¶
Develop a Spark Jar job program, compile it, and package it as spark-examples.jar. Before submitting the job, perform the following steps to upload the package to OBS:
Log in to the DLI console.
In the service list, click Object Storage Service under Storage.
Create a bucket. In this example, name it dli-test-obs01.
On the displayed Buckets page, click Create Bucket in the upper right corner.
On the displayed Create Bucket page, enter the Bucket Name. Retain the default values for other parameters or set them as required.
Note
Select a region that matches the region where you use DLI.
Click Create Now.
In the bucket list, click the name of the dli-test-obs01 bucket you just created to access its Objects tab.
Click Upload Object. In the dialog box displayed, drag or add files or folders, for example, spark-examples.jar, to the upload area. Then, click Upload.
In this example, the path after upload is obs://dli-test-obs01/spark-examples.jar.
For more operations on the OBS console, see the Object Storage Service User Guide.
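The console upload above is sufficient for this example. If you would rather upload the package programmatically, the following is a sketch using the OBS SDK for Java (invoked here from Scala); the endpoint placeholder, environment-variable names, and local file path are assumptions, not values fixed by this tutorial.

```scala
import java.io.File
import com.obs.services.ObsClient

object UploadPackage {
  def main(args: Array[String]): Unit = {
    // Endpoint of the region where the bucket was created (placeholder value).
    val endpoint = "https://obs.<region>.myhuaweicloud.com"
    // Credentials are read from the environment rather than hard-coded
    // (assumed variable names).
    val ak = sys.env("ACCESS_KEY_ID")
    val sk = sys.env("SECRET_ACCESS_KEY")
    val obsClient = new ObsClient(ak, sk, endpoint)
    try {
      // Upload the package to the bucket created above; the object key becomes the
      // path used later when creating the job: obs://dli-test-obs01/spark-examples.jar
      obsClient.putObject("dli-test-obs01", "spark-examples.jar",
        new File("spark-examples.jar"))
    } finally {
      obsClient.close()
    }
  }
}
```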
Step 2: Create an Elastic Resource Pool and Add Queues to the Pool¶
In this example, the elastic resource pool dli_resource_pool and queue dli_queue_01 are created.
Log in to the DLI management console.
In the navigation pane on the left, choose Resources > Resource Pool.
On the displayed page, click Buy Resource Pool in the upper right corner.
On the displayed page, set the parameters.
Table 2 describes the parameters.
Table 2 Parameters¶

Parameter | Description | Example Value |
---|---|---|
Region | Select a region where you want to buy the elastic resource pool. | _ |
Project | Project uniquely preset by the system for each region | Default |
Name | Name of the elastic resource pool | dli_resource_pool |
Specifications | Specifications of the elastic resource pool | Standard |
CU Range | The maximum and minimum CUs allowed for the elastic resource pool | 64-64 |
CIDR Block | CIDR block the elastic resource pool belongs to. If you use an enhanced datasource connection, this CIDR block cannot overlap that of the data source. Once set, this CIDR block cannot be changed. | 172.16.0.0/19 |
Enterprise Project | Select an enterprise project for the elastic resource pool. | default |
Click Buy.
Click Submit.
In the elastic resource pool list, locate the pool you just created and click Add Queue in the Operation column.
Set the basic parameters listed below.
Table 3 Basic parameters for adding a queue¶

Parameter | Description | Example Value |
---|---|---|
Name | Name of the queue to add | dli_queue_01 |
Type | Type of the queue. To execute SQL jobs, select For SQL. To execute Flink or Spark jobs, select For general purpose. | _ |
Enterprise Project | Select an enterprise project. | default |
Click Next and configure scaling policies for the queue.
Click Create to add scaling policies with different priorities, periods, minimum CUs, and maximum CUs.
Table 4 Scaling policy parameters¶

Parameter | Description | Example Value |
---|---|---|
Priority | Priority of the scaling policy in the current elastic resource pool. A larger value indicates a higher priority. In this example, only one scaling policy is configured, so its priority is set to 1 by default. | 1 |
Period | The first scaling policy is the default policy, and its Period parameter configuration cannot be deleted or modified. The period for the scaling policy is from 00 to 24. | 00-24 |
Min CU | Minimum number of CUs allowed by the scaling policy | 16 |
Max CU | Maximum number of CUs allowed by the scaling policy | 64 |
Click OK.
Step 3: Submit a Spark Job¶
On the DLI management console, choose Job Management > Spark Jobs in the navigation pane on the left. On the displayed page, click Create Job in the upper right corner.
Set the following Spark job parameters:
Queue: Select the queue created in Step 2: Create an Elastic Resource Pool and Add Queues to the Pool.
Spark Version: Select a Spark engine version.
Application: Select the package created in Step 1: Upload Data to OBS.
For other parameters, refer to the description about the Spark job editing page in "Creating a Spark Job" in the Data Lake Insight User Guide.
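For example, if you packaged the SparkPi sketch from the Scenario section, the main class to specify would be its fully qualified object name, com.example.dli.SparkPi (an illustrative name from the sketch, not a value fixed by DLI), and the number of slices could be passed as a program argument, for example 10.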
Click Execute in the upper right corner of the Spark job editing window, read and agree to the privacy agreement, and click OK to submit the job. A message is displayed, indicating that the job has been submitted.
(Optional) Switch to the Job Management > Spark Jobs page to view the status and logs of the submitted Spark job.