Common Configuration Items of Batch SQL Jobs
This section describes the common configuration items of the SQL syntax for DLI batch jobs. A usage sketch follows the table.
Item | Default Value | Description |
---|---|---|
spark.sql.files.maxRecordsPerFile | 0 | Maximum number of records to be written into a single file. If the value is zero or negative, there is no limit. |
spark.sql.autoBroadcastJoinThreshold | 209715200 | Maximum size, in bytes, of a table that is broadcast to all worker nodes when a join is performed. Set this parameter to -1 to disable broadcasting. Note: Currently, statistics are supported only for Hive Metastore tables on which the ANALYZE TABLE COMPUTE STATISTICS noscan command has been run, and for file-based data source tables whose statistics are computed directly from the data files. |
spark.sql.shuffle.partitions | 200 | Default number of partitions used when shuffling data for joins or aggregations. |
spark.sql.dynamicPartitionOverwrite.enabled | false | Whether DLI overwrites only the partitions into which data is written at runtime. If this parameter is set to false, all partitions that meet the specified condition are deleted before the overwrite starts. For example, if a partitioned table contains the 2021-01 partition and you use INSERT OVERWRITE to write data to the 2021-02 partition, the existing 2021-01 partition is also deleted. If this parameter is set to true, DLI does not delete partitions before the overwrite starts; only the partitions into which data is actually written are replaced (see the sketch after this table). |
spark.sql.files.maxPartitionBytes | 134217728 | Maximum number of bytes to be packed into a single partition when a file is read. |
spark.sql.badRecordsPath | | Path used to store bad records. |
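
The items above are ordinary Spark SQL parameters, so where a DLI batch SQL job allows per-job parameters they can be supplied as key-value pairs. The following is a minimal sketch using Spark SQL SET statements; all values are illustrative, and whether inline SET statements are accepted depends on how the job is submitted:

```sql
-- Illustrative values only; the defaults are listed in the table above.
SET spark.sql.files.maxRecordsPerFile = 1000000;   -- cap records per output file (0 = no limit)
SET spark.sql.autoBroadcastJoinThreshold = -1;     -- disable broadcast joins
SET spark.sql.shuffle.partitions = 400;            -- shuffle partitions for joins and aggregations
SET spark.sql.files.maxPartitionBytes = 268435456; -- 256 MB per partition when reading files
```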
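The sketch below illustrates the spark.sql.dynamicPartitionOverwrite.enabled behavior described in the table. The table name `sales` and partition column `month` are hypothetical, introduced only for this example:

```sql
-- Hypothetical partitioned table with one existing partition, month='2021-01'.
CREATE TABLE sales (id INT) PARTITIONED BY (month STRING);
INSERT INTO sales PARTITION (month = '2021-01') VALUES (1);

-- With dynamic partition overwrite enabled, only partitions that receive
-- data at runtime are replaced; the existing month='2021-01' partition is kept.
SET spark.sql.dynamicPartitionOverwrite.enabled = true;
INSERT OVERWRITE TABLE sales PARTITION (month)
SELECT 2 AS id, '2021-02' AS month;
```

With the parameter left at its default of false, the same INSERT OVERWRITE statement would first delete all partitions that meet the condition, including month='2021-01'.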