Data Lake Insight

Submitting a SQL Job (Recommended)

Function

This API is used to submit jobs to a queue using SQL statements.

The following job types are supported: DDL, DCL, IMPORT, QUERY, and INSERT. The IMPORT function is the same as that described in Importing Data (Deprecated); only the implementation method differs.

Additionally, you can use other APIs to query and manage jobs. For details, see the following sections:

  • Querying Job Status

  • Querying Job Details

  • Exporting Query Results

  • Querying All Jobs

  • Canceling a Job (Recommended)

Note

This API is synchronous if job_type in the response message is DCL.

URI

  • URI format

    POST /v1.0/{project_id}/jobs/submit-job

  • Parameter description

    Table 1 URI parameter

    | Parameter  | Mandatory | Type   | Description |
    |------------|-----------|--------|-------------|
    | project_id | Yes       | String | Project ID, which is used for resource isolation. For details about how to obtain its value, see Obtaining a Project ID. |

Request Parameters

Table 2 Request parameters

| Parameter   | Mandatory | Type             | Description |
|-------------|-----------|------------------|-------------|
| sql         | Yes       | String           | SQL statement that you want to execute. |
| currentdb   | No        | String           | Database where the SQL statement is executed. This parameter does not need to be configured during database creation. |
| queue_name  | No        | String           | Name of the queue to which the job is submitted. The name can contain only digits, letters, and underscores (_), but cannot contain only digits or start with an underscore (_). |
| conf        | No        | Array of strings | Configuration parameters for the SQL job, each in key=value form. For details about the supported configuration items, see Table 3. |
| tags        | No        | Array of objects | Tags of a job. For details, see Table 4. |
| engine_type | No        | String           | Type of the engine that executes the job. |

Table 3 Configuration parameters description

  • spark.sql.files.maxRecordsPerFile
    Default value: 0
    Maximum number of records to be written into a single file. If the value is zero or negative, there is no limit.

  • spark.sql.autoBroadcastJoinThreshold
    Default value: 209715200
    Maximum size, in bytes, of a table that is broadcast to all worker nodes when a join is executed. Set this parameter to -1 to disable broadcasting.
    Note: Currently, statistics are supported only for Hive metastore tables on which the ANALYZE TABLE COMPUTE STATISTICS noscan command has been run, and for file-based data source tables whose statistics are computed directly from the data files.

  • spark.sql.shuffle.partitions
    Default value: 200
    Default number of partitions used when shuffling data for joins or aggregations.

  • spark.sql.dynamicPartitionOverwrite.enabled
    Default value: false
    Whether DLI overwrites only the partitions into which data is written at runtime. If this parameter is set to false, all partitions that meet the specified condition are deleted before the overwrite starts. For example, if you set false and use INSERT OVERWRITE to write partition 2021-02 to a partitioned table that already has a 2021-01 partition, the existing 2021-01 partition is also deleted.
    If this parameter is set to true, DLI does not delete existing partitions before the overwrite starts; only the partitions that receive data are overwritten.

  • spark.sql.files.maxPartitionBytes
    Default value: 134217728
    Maximum number of bytes to be packed into a single partition when files are read.

  • spark.sql.badRecordsPath
    Default value: -
    Path of bad records.

  • dli.sql.sqlasync.enabled
    Default value: true
    Whether DDL and DCL statements are executed asynchronously. The value true indicates that asynchronous execution is enabled.

  • dli.sql.job.timeout
    Default value: -
    Job running timeout interval, in seconds. If the timeout interval expires, the job is canceled.

  • spark.sql.legacy.correlated.scalar.query.enabled
    Default value: false
    • If set to true:
      • When there is no duplicate data in a subquery, executing a correlated subquery does not require deduplication of the subquery's result.
      • If there is duplicate data in a subquery, executing a correlated subquery results in an error. To resolve this, deduplicate the subquery's result using functions such as max() or min().
    • If set to false:
      Regardless of whether there is duplicate data in a subquery, executing a correlated subquery requires deduplicating the subquery's result using functions such as max() or min(). Otherwise, an error occurs.

  • spark.sql.optimizer.dynamicPartitionPruning.enabled
    Default value: true
    Controls whether dynamic partition pruning is enabled. Dynamic partition pruning can reduce the amount of data that needs to be scanned and improve query performance when SQL queries are executed.
    • When set to true, dynamic partition pruning is enabled. During the query, partitions that do not meet the WHERE clause conditions are automatically skipped. This is useful for tables that have a large number of partitions.
    • If SQL queries contain a large number of nested left join operations and the table has a large number of dynamic partitions, a large amount of memory may be consumed during data parsing. As a result, the memory of the driver node becomes insufficient and frequent full GCs occur. To avoid such issues, you can disable dynamic partition pruning by setting this parameter to false. However, disabling this optimization may reduce query performance, because Spark no longer automatically prunes the partitions that do not meet the conditions.
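The parameters in Table 3 are passed through the conf field of the request body (see Table 2), one "key = value" string per entry, in the same form as the example request later in this section. The following is a minimal sketch, with illustrative values only, of how such a configuration list might be assembled:

# Illustrative only: build the "conf" array for a submit-job request body.
# Each entry is a plain "key = value" string; the values below are examples,
# not recommendations.
conf = [
    "spark.sql.shuffle.partitions = 200",
    "dli.sql.job.timeout = 300",                                # cancel the job after 300 seconds
    "spark.sql.optimizer.dynamicPartitionPruning.enabled = false",
]

request_body = {
    "sql": "SELECT * FROM table1 WHERE dt = '2021-02'",        # example statement
    "currentdb": "db1",
    "queue_name": "default",
    "conf": conf,
}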

Table 4 tags parameters

| Parameter | Mandatory | Type   | Description |
|-----------|-----------|--------|-------------|
| key       | Yes       | String | Tag key. A tag key can contain a maximum of 128 characters. Only letters, numbers, spaces, and special characters (_.:+-@) are allowed, but the key cannot start or end with a space or start with _sys_. |
| value     | Yes       | String | Tag value. A tag value can contain a maximum of 255 characters. Only letters, numbers, spaces, and special characters (_.:+-@) are allowed. |

Response Parameters

Table 5 Response parameters

| Parameter  | Mandatory | Type             | Description |
|------------|-----------|------------------|-------------|
| is_success | Yes       | Boolean          | Whether the request is successfully sent. The value true indicates that the request is successfully sent. |
| message    | Yes       | String           | System message. If the execution succeeds, this parameter may be left blank. |
| job_id     | Yes       | String           | ID of the job returned after the job is generated and submitted by using SQL statements. The job ID can be used to query the job status and results. |
| job_type   | Yes       | String           | Job type. The options are DDL, DCL, IMPORT, EXPORT, QUERY, and INSERT. |
| schema     | No        | Array of Map     | If the statement type is DDL, the column names and types of the DDL result are displayed. |
| rows       | No        | Array of objects | When the statement type is DDL and dli.sql.sqlasync.enabled is set to false, the execution results are returned directly, but only a maximum of 1,000 rows can be returned. If there are more than 1,000 rows, obtain the results asynchronously: set dli.sql.sqlasync.enabled to true when submitting the job, and then obtain the results from the job bucket configured for DLI. The path of the results in the job bucket is returned in result_path by the ShowSqlJobStatus API; the full result data is automatically exported to the job bucket. |
| job_mode   | No        | String           | Job execution mode. The options are async (asynchronous) and sync (synchronous). |
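When job_mode is async, the returned job_id can be used with the Querying Job Status API to wait for the job to finish. The following is a minimal polling sketch in Python, assuming the requests library and the GET /v1.0/{project_id}/jobs/{job_id}/status endpoint and status values described under Querying Job Status; verify the exact path and fields in that section before relying on them.

import time

import requests

def wait_for_job(endpoint, project_id, job_id, token, interval=5, timeout=600):
    """Poll the DLI job status until the job finishes or the timeout expires.

    The status URI and the 'status' field are assumptions taken from the
    Querying Job Status section; check that section for the exact contract.
    """
    url = f"{endpoint}/v1.0/{project_id}/jobs/{job_id}/status"
    headers = {"X-Auth-Token": token}
    deadline = time.time() + timeout
    while time.time() < deadline:
        resp = requests.get(url, headers=headers)
        resp.raise_for_status()
        status = resp.json().get("status")
        if status in ("FINISHED", "FAILED", "CANCELLED"):   # assumed terminal states
            return status
        time.sleep(interval)
    raise TimeoutError(f"Job {job_id} did not finish within {timeout} seconds")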

Example Request

Submit a SQL job. The job execution database and queue are db1 and default, respectively. Then, add the tags workspace=space1 and jobName=name1 for the job.

{
    "currentdb": "db1",
    "sql": "desc table1",
    "queue_name": "default",
    "conf": [
        "dli.sql.shuffle.partitions = 200"
    ],
    "tags": [
        {
            "key": "workspace",
            "value": "space1"
        },
        {
            "key": "jobName",
            "value": "name1"
        }
    ]
}
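
The request body above can be sent with any HTTP client. The following is a minimal sketch in Python using the requests library; the endpoint URL, project ID, and token are placeholders, and token authentication through the X-Auth-Token header is assumed (see the API Usage Guidelines for authentication details).

import requests

# Placeholders: replace with your DLI endpoint, project ID, and IAM token.
ENDPOINT = "https://dli.example.com"        # hypothetical endpoint
PROJECT_ID = "<project_id>"                 # hypothetical project ID
TOKEN = "<IAM token>"

body = {
    "currentdb": "db1",
    "sql": "desc table1",
    "queue_name": "default",
    "conf": ["dli.sql.shuffle.partitions = 200"],
    "tags": [
        {"key": "workspace", "value": "space1"},
        {"key": "jobName", "value": "name1"},
    ],
}

resp = requests.post(
    f"{ENDPOINT}/v1.0/{PROJECT_ID}/jobs/submit-job",
    json=body,
    headers={"X-Auth-Token": TOKEN},        # token authentication is assumed here
)
resp.raise_for_status()
print(resp.json()["job_id"])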

Example Response

{
  "is_success": true,
  "message": "",
  "job_id": "8ecb0777-9c70-4529-9935-29ea0946039c",
  "job_type": "DDL",
  "job_mode":"sync",
  "schema": [
    {
      "col_name": "string"
    },
    {
      "data_type": "string"
    },
    {
      "comment": "string"
    }
  ],
  "rows": [
    [
      "c1",
      "int",
      null
    ],
    [
      "c2",
      "string",
      null
    ]
  ]
}
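
In this example, each entry of schema is a single-key map giving one column name and its type, and each entry of rows lists the values in the same column order. The following sketch shows one way such a response could be turned into a list of records, assuming exactly this shape.

# Pair the schema column names with each row of a DDL result.
# Assumes the response shape shown in the example above.
response = {
    "schema": [{"col_name": "string"}, {"data_type": "string"}, {"comment": "string"}],
    "rows": [["c1", "int", None], ["c2", "string", None]],
}

columns = [name for entry in response["schema"] for name in entry]   # ["col_name", "data_type", "comment"]
records = [dict(zip(columns, row)) for row in response["rows"]]
# records == [{"col_name": "c1", "data_type": "int", "comment": None},
#             {"col_name": "c2", "data_type": "string", "comment": None}]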

Status Codes

Table 6 describes the status codes.

Table 6 Status codes

| Status Code | Description |
|-------------|-------------|
| 200         | Submitted successfully. |
| 400         | Request error. |
| 500         | Internal service error. |

Error Codes

If an error occurs when this API is invoked, the system does not return a result similar to the preceding example, but returns an error code and error information instead. For details, see Error Codes.
