To HDFS
If the destination link of a job is one of those listed in Link to HDFS, configure the destination job parameters based on Table 1.
Parameter | Description | Example Value |
---|---|---|
Write Directory | HDFS directory to which data will be written. This parameter supports date and time macro variables, and a path name can contain multiple macro variables. When date and time macro variables are used with a scheduled job, incremental data can be synchronized periodically. | /user/output |
File Format | Format in which data is written. The options are as follows:
If data is migrated between file-related data sources, such as FTP, SFTP, HDFS, and OBS, the value of File Format must be the same as the source file format. | CSV |
Duplicate File Processing Method | Files with the same name and size are identified as duplicate files. If there are duplicate files during data writing, the following methods are available:
| Stop job |
Compression Format | File compression format after data writing. The following compression formats are supported:
| Snappy |
Line Separator | Line feed character in a file. By default, the system automatically identifies \n, \r, and \r\n. This parameter is not used when File Format is set to Binary. | |
Field Delimiter | Field delimiter in the file. This parameter is not used when File Format is set to Binary. | , |
Use Quote Character | This parameter is displayed only when File Format is CSV. It is used when database tables are migrated to file systems. If you set this parameter to Yes and a field in the source data table contains a field delimiter or line separator, CDM uses double quotation marks (") as the quote character to quote the field content as a whole to prevent a field delimiter from dividing a field into two fields, or a line separator from dividing a field into different lines. For example, if the hello,world field in the database is quoted, it will be exported to the CSV file as a whole. | No |
Use First Row as Header | When a table is migrated to a CSV file, CDM does not migrate the heading line of the table by default. If you set this parameter to Yes, CDM writes the heading line of the table to the file. | No |
Write to Temporary File | Whether to write the binary file to a .tmp file first. After the migration is successful, run the rename or move command at the migration destination to restore the file. | No |
Job Success Marker File | Whether to generate a marker file with a custom name in the destination directory after a job is executed successfully. If no file name is specified, this function is disabled. | finish.txt |
Customize Hierarchical Directory | Users can customize the directory hierarchy of files. Example: [Table name]/[Year]/[Month]/[Day]/[Data file name].csv | |
Hierarchical Directory | Directory hierarchy under which the file is stored. Time macros are supported (the time format is yyyy/MM/dd). If this parameter is left blank, the directory does not have a hierarchical structure. Example: ${dateformat(yyyy/MM/dd, -1, DAY)} | |
Encryption | This parameter is displayed only when File Format is set to Binary. Whether to encrypt the uploaded data. The options are as follows:
| AES-256-GCM |
DEK | This parameter is displayed only when Encryption is set to AES-256-GCM. The key consists of 64 hexadecimal digits. Record the key configured here because the decryption key must be identical to it. If the encryption and decryption keys are inconsistent, the system does not report an exception, but the decrypted data is incorrect. | DD0AE00DFECD78BF051BCFDA25BD4E320DB0A7AC75A1F3FC3D3C56A457DCDC1B |
IV | This parameter is displayed only when Encryption is set to AES-256-GCM. The initialization vector consists of 32 hexadecimal digits. Record the initialization vector configured here because the one used for decryption must be identical to it. If the initialization vectors are inconsistent, the system does not report an exception, but the decrypted data is incorrect. | 5C91687BA886EDCD12ACBC3FF19A3C3F |
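Because a mismatched DEK or IV produces incorrect data without any exception, it can help to check the values before saving the job. The following is a minimal sketch that checks only the lengths stated in the table (64 hexadecimal digits for the DEK, 32 for the IV). The function names `is_hex_of_length` and `check_encryption_params` are hypothetical helpers for illustration and are not part of CDM.

```python
import re

# Matches a string made up only of hexadecimal digits.
HEX_RE = re.compile(r"[0-9A-Fa-f]+")


def is_hex_of_length(value, length):
    """Return True if value is exactly `length` hexadecimal digits."""
    return len(value) == length and HEX_RE.fullmatch(value) is not None


def check_encryption_params(dek, iv):
    """Return a list of problems found; an empty list means both values pass."""
    problems = []
    if not is_hex_of_length(dek, 64):
        problems.append("DEK must be 64 hexadecimal digits")
    if not is_hex_of_length(iv, 32):
        problems.append("IV must be 32 hexadecimal digits")
    return problems


# Example values taken from the table above.
dek = "DD0AE00DFECD78BF051BCFDA25BD4E320DB0A7AC75A1F3FC3D3C56A457DCDC1B"
iv = "5C91687BA886EDCD12ACBC3FF19A3C3F"
print(check_encryption_params(dek, iv))
```

A check like this catches only malformed values. It cannot detect a well-formed key that differs from the one used for encryption, so the DEK and IV still need to be recorded as described in the table.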
Note
HDFS supports the UTF-8 encoding only. Retain the default value UTF-8.
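The effect described for Use Quote Character can be illustrated outside CDM with Python's standard `csv` module. This is a sketch of the general CSV quoting behavior, not CDM's implementation. The sample rows and the helper names `write_unquoted` and `write_quoted` are assumptions made for the example.

```python
import csv
import io

# Sample table rows; the second and third contain a delimiter and a line feed.
rows = [
    ["id", "greeting"],
    ["1", "hello,world"],
    ["2", "line one\nline two"],
]


def write_unquoted(rows):
    """Join fields with commas and no quoting (comparable to setting No)."""
    return "".join(",".join(row) + "\n" for row in rows)


def write_quoted(rows):
    """Quote fields containing a delimiter or line separator (comparable to Yes)."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=",", quotechar='"',
                        quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


unquoted = write_unquoted(rows)
quoted = write_quoted(rows)

# Without quoting, "hello,world" is read back as two separate fields.
print(list(csv.reader(io.StringIO(unquoted)))[1])

# With quoting, every row is read back with its original two fields.
print(list(csv.reader(io.StringIO(quoted))) == rows)
```

Reading the unquoted output back yields extra fields and extra lines, which is the splitting that the quote character prevents.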