From HTTP

When the source link of a job is an HTTP link, configure the source job parameters according to Table 1. Currently, HTTP URLs can be used only as a source: data can be exported from them but not imported to them.

Table 1 Parameter description

Parameter

Description

Example Value

File URL

Use the GET method to obtain data from the HTTP/HTTPS URL.

These connectors read files that have an HTTP/HTTPS URL, such as public files on third-party object storage systems and web disks.

Pull List File

If this parameter is set to Yes, the system pulls the files at the URLs recorded in the uploaded text file (the list file) and stores them on OBS. The text file records the file paths on HDFS.

Yes

OBS Link of List File

Select an existing OBS link.

obs_link

OBS Bucket of entries files

Name of the OBS bucket that stores the text file.

obs-cdm

Path/Directory of entries files

Custom OBS directory that stores the text file. Use slashes (/) to separate directory levels.

test1

File Format

CDM supports only Binary. Files are transferred directly without being parsed, even if they are not in binary format.

Binary

Compression Format

Compression format of the source files. The options are as follows:

  • NONE: Files in all formats can be transferred.

  • GZIP: Only files in gzip format can be transferred.

  • ZIP: Only files in ZIP format can be transferred.

  • TAR.GZ: Only files in TAR.GZ format can be transferred.

NONE

Compressed File Suffix

This parameter is displayed when Compression Format is not NONE.

This parameter specifies the file name extension of the files to be decompressed. Within a batch of files, only the files with this extension are decompressed. All other files are transferred in their original format. If you enter * or leave the parameter blank, all files are decompressed.

*

File Separator

Separator between files. When multiple files are transferred, CDM uses this separator to tell the files apart. The default value is |. This parameter is not displayed if Pull List File is set to Yes.

|

Query Parameter

  • If you set this parameter to Yes, the names of the objects uploaded to OBS do not include the query parameters.

  • If you set this parameter to No, the names of the objects uploaded to OBS include the query parameters.

No

Encryption

If the source data is encrypted, CDM can decrypt the data before exporting it. Select whether to decrypt the source data and select a decryption algorithm. The options are as follows:

  • NONE: Export data without decrypting it.

  • AES-256-GCM: Decrypt the data using the AES 256-bit algorithm before exporting it. Currently, only AES-256-GCM (NoPadding) is supported. This algorithm is used for encryption at the migration destination and for decryption at the migration source.

AES-256-GCM

Disregard Non-existent Path or File

If this parameter is set to Yes, the job is executed successfully even if the source path does not exist.

No

DEK

This parameter is displayed only when Encryption is set to AES-256-GCM. The data encryption key (DEK) consists of 64 hexadecimal characters and must be the same as the DEK configured during encryption. If the decryption and encryption keys do not match, the system does not report an exception, but the decrypted data is incorrect.

DD0AE00DFECD78BF051BCFDA25BD4E320DB0A7AC75A1F3FC3D3C56A457DCDC1B

IV

This parameter is displayed only when Encryption is set to AES-256-GCM. The initialization vector (IV) consists of 32 hexadecimal characters and must be the same as the IV configured during encryption. If the initialization vectors do not match, the system does not report an exception, but the decrypted data is incorrect.

5C91687BA886EDCD12ACBC3FF19A3C3F

MD5 File Extension

File name extension of the MD5 files. These files are used to check whether the files extracted by CDM are consistent with the source files.

.md5
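The effect of File Separator and Query Parameter can be illustrated with a short Python sketch. It is not CDM code: the function names and the example.com URLs are hypothetical, and the exact way CDM builds object names is not specified in Table 1, so the naming rule below (last path segment, optionally followed by the query string) is an assumption.

```python
import posixpath
from typing import List
from urllib.parse import urlsplit


def split_file_urls(file_url: str, separator: str = "|") -> List[str]:
    """Split a File URL value into individual URLs using the file separator."""
    return [part.strip() for part in file_url.split(separator) if part.strip()]


def object_name(url: str, drop_query: bool) -> str:
    """Derive an illustrative object name from a URL (assumed naming rule).

    drop_query=True mirrors Query Parameter = Yes (query string excluded).
    drop_query=False mirrors Query Parameter = No (query string kept).
    """
    parts = urlsplit(url)
    name = posixpath.basename(parts.path)  # last segment of the URL path
    if not drop_query and parts.query:
        name = f"{name}?{parts.query}"
    return name


urls = split_file_urls(
    "https://example.com/data/a.csv?version=1|https://example.com/data/b.csv"
)
for url in urls:
    print(object_name(url, drop_query=True), object_name(url, drop_query=False))
```

With the default separator, the File URL value above is treated as two files. Only the first URL carries a query string, so only its object name differs between the two Query Parameter settings.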
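Because a mismatched DEK or IV does not raise an exception, it can help to at least check the format of both values before configuring the job. The following is a minimal sketch, assuming only the length rules stated in Table 1 (64 and 32 hexadecimal characters). The helper name parse_hex_field is hypothetical, and the check cannot detect a value that is well-formed but differs from the one used during encryption.

```python
import re

HEX_RE = re.compile(r"[0-9A-Fa-f]+")


def parse_hex_field(value: str, expected_hex_chars: int, label: str) -> bytes:
    """Validate a hexadecimal parameter value and return its raw bytes."""
    if len(value) != expected_hex_chars or not HEX_RE.fullmatch(value):
        raise ValueError(
            f"{label} must be exactly {expected_hex_chars} hexadecimal characters"
        )
    return bytes.fromhex(value)


# Example values from Table 1.
dek = parse_hex_field(
    "DD0AE00DFECD78BF051BCFDA25BD4E320DB0A7AC75A1F3FC3D3C56A457DCDC1B", 64, "DEK"
)
iv = parse_hex_field("5C91687BA886EDCD12ACBC3FF19A3C3F", 32, "IV")

# 64 hex characters are 32 bytes (256 bits); 32 hex characters are 16 bytes.
print(len(dek), len(iv))
```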
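The consistency check behind MD5 File Extension can be sketched in the same way. Table 1 does not describe the layout of the MD5 files, so this example assumes that each .md5 file holds the hexadecimal MD5 digest of its data file as the first whitespace-separated token (the layout produced by the md5sum tool). The function names are hypothetical.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the data as a lowercase hexadecimal string."""
    return hashlib.md5(data).hexdigest()


def matches_md5_file(data: bytes, md5_file_text: str) -> bool:
    """Compare the data with the digest stored in an MD5 file.

    Assumption: the digest is the first whitespace-separated token,
    optionally followed by a file name.
    """
    tokens = md5_file_text.split()
    return bool(tokens) and tokens[0].lower() == md5_hex(data)


payload = b"hello"
print(matches_md5_file(payload, "5d41402abc4b2a76b9719d911017c592  hello.txt"))
```

A file that was altered or truncated in transit produces a different digest, so the comparison returns False.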