Source Link Configurations of Loader Jobs

Overview

When a Loader job obtains data from a data source, you must select a link that matches the data source type and configure the link properties.

obs-connector

Table 1 Data source link properties of obs-connector

Parameter

Description

Bucket Name

OBS bucket for storing source data

Input directory or file

Storage form of the source data. It can be either all data files in a directory or a single data file in the bucket.

File format

Loader supports the following file formats of data stored in OBS:

  • CSV_FILE: Text file. When the destination link is a database link, only text files are supported.
  • BINARY_FILE: Binary file, that is, any file other than a text file.

Line Separator

Character or string that marks the end of each line in the source data

Field Separator

Character or string that separates fields in the source data

Encode type

Text encoding type of source data. It takes effect for text files only.

File split type

The following types are supported:
  • File: Files are split among map tasks by count. The number of files assigned to each map task is Total number of files/Extractors.
  • Size: Files are split among map tasks by size. The amount of data assigned to each map task is Total file size/Extractors.
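
The two formulas above can be illustrated with a short Python sketch. It only evaluates the quotient given in the table. The function name is hypothetical, Extractors is assumed here to be the number of map tasks configured for the job, and the way Loader rounds or distributes any remainder is not stated in this document, so no rounding is applied.

```python
def share_per_map_task(total, extractors):
    """Return the nominal share of one map task: total / extractors.

    `total` is a file count (File mode) or a byte count (Size mode).
    Rounding is deliberately left out because the source does not specify it.
    """
    if extractors <= 0:
        raise ValueError("extractors must be a positive integer")
    return total / extractors


# File mode: 10 source files shared among 4 extractors.
files_per_task = share_per_map_task(10, 4)

# Size mode: 1 GiB of source data shared among 4 extractors.
bytes_per_task = share_per_map_task(1024 ** 3, 4)

print(files_per_task)   # 2.5
print(bytes_per_task)   # 268435456.0
```

A fractional result such as 2.5 files per task shows why File mode can leave tasks unevenly loaded when file sizes differ, whereas Size mode balances by data volume.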

generic-jdbc-connector

Table 2 Data source link properties of generic-jdbc-connector

Parameter

Description

Schema name

Name of the database storing source data. You can query and select it on the interface.

Table name

Data table storing the source data. You can query and select it on the interface.

Partition column

If multiple columns need to be read, use this column to split the result and obtain data.

Where clause

Query condition (WHERE clause) used to filter the data read from the database
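
As a rough illustration of how a partition column and a where clause can work together, the Python sketch below divides an integer column range into one predicate per extractor and combines each predicate with the user-supplied condition. This document does not describe the connector's actual splitting algorithm, so the function name, the integer-range strategy, and the example column and values are all assumptions made for illustration.

```python
def build_split_conditions(column, lo, hi, extractors, where=None):
    """Split the inclusive integer range [lo, hi] of `column` into predicates.

    Hypothetical helper: returns at most `extractors` SQL conditions that
    together cover every value from lo to hi exactly once.
    """
    if extractors <= 0:
        raise ValueError("extractors must be a positive integer")
    if hi < lo:
        raise ValueError("hi must not be smaller than lo")

    total = hi - lo + 1
    parts = min(extractors, total)        # never create empty ranges
    base, extra = divmod(total, parts)    # first `extra` ranges get one more value

    conditions = []
    start = lo
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        cond = f"{column} >= {start} AND {column} <= {end}"
        if where:
            # Keep the user's filter and narrow it to this split.
            cond = f"({where}) AND {cond}"
        conditions.append(cond)
        start = end + 1
    return conditions


for cond in build_split_conditions("id", 1, 10, 3, where="status = 'A'"):
    print(cond)
```

Each printed condition could be appended to a query on the configured table, so that separate tasks read disjoint slices of the same result.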

ftp-connector or sftp-connector

Table 3 Data source link properties of ftp-connector or sftp-connector

Parameter

Description

Input directory or file

Storage form of the source data. It can be either all data files in a directory or a single data file on the file server.

File format

Loader supports the following file formats of data stored in the file server:

  • CSV_FILE: Text file. When the destination link is a database link, only text files are supported.
  • BINARY_FILE: Binary file, that is, any file other than a text file.

Line Separator

Character or string that marks the end of each line in the source data

NOTE:

When FTP or SFTP serves as the source link and File format is set to BINARY_FILE, the Line Separator value in the advanced properties does not take effect.

Field Separator

Character or string that separates fields in the source data

NOTE:

When FTP or SFTP serves as the source link and File format is set to BINARY_FILE, the Field Separator value in the advanced properties does not take effect.

Encode type

Text encoding type of source data. It takes effect for text files only.

File split type

The following types are supported:
  • File: Files are split among map tasks by count. The number of files assigned to each map task is Total number of files/Extractors.
  • Size: Files are split among map tasks by size. The amount of data assigned to each map task is Total file size/Extractors.
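
The relationship between File format, the two separators, and Encode type in the table above can be pictured with the following Python sketch. It is not Loader code: the function names and the default values (UTF-8, newline, comma) are assumptions chosen for the example. It shows that text settings apply to CSV_FILE content only, while BINARY_FILE content passes through unchanged.

```python
def parse_text_source(raw, encoding="UTF-8", line_sep="\n", field_sep=","):
    """Decode raw bytes, then split them into records and fields."""
    text = raw.decode(encoding)
    # Drop the empty string left behind by a trailing line separator.
    records = [line for line in text.split(line_sep) if line]
    return [record.split(field_sep) for record in records]


def read_source(raw, file_format, **text_options):
    """Apply the text options only when the format is CSV_FILE."""
    if file_format == "BINARY_FILE":
        # Separators and encoding have no effect on binary content.
        return raw
    if file_format == "CSV_FILE":
        return parse_text_source(raw, **text_options)
    raise ValueError(f"unsupported file format: {file_format}")


data = "1|Ann\n2|Bob\n".encode("utf-8")
print(read_source(data, "CSV_FILE", field_sep="|"))
print(read_source(b"\x00\x01", "BINARY_FILE", field_sep="|"))
```

The second call passes a field separator as well, but the binary payload comes back byte for byte, which mirrors the NOTE entries in the table.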

hbase-connector

Table 4 Data source link properties of hbase-connector

Parameter

Description

Table name

HBase table storing source data

hdfs-connector

Table 5 Data source link properties of hdfs-connector

Parameter

Description

Input directory or file

Storage form of the source data. It can be either all data files in a directory or a single data file in HDFS.

File format

Loader supports the following file formats of data stored in HDFS:

  • CSV_FILE: Text file. When the destination link is a database link, only text files are supported.
  • BINARY_FILE: Binary file, that is, any file other than a text file.

Line Separator

Character or string that marks the end of each line in the source data

NOTE:

When HDFS serves as the source link and File format is set to BINARY_FILE, the Line Separator value in the advanced properties does not take effect.

Field Separator

Character or string that separates fields in the source data

NOTE:

When HDFS serves as the source link and File format is set to BINARY_FILE, the Field Separator value in the advanced properties does not take effect.

File split type

The following types are supported:
  • File: Files are split among map tasks by count. The number of files assigned to each map task is Total number of files/Extractors.
  • Size: Files are split among map tasks by size. The amount of data assigned to each map task is Total file size/Extractors.

hive-connector

Table 6 Data source link properties of hive-connector

Parameter

Description

Database

Name of the Hive database storing the source data. You can query and select it on the interface.

Table

Name of the Hive table storing the source data. You can query and select it on the interface.

voltdb-connector

Table 7 Data source link properties of voltdb-connector

Parameter

Description

Partition column

If multiple columns need to be read, use this column to split the result and obtain data.

Table

Name of the in-memory database table storing the source data. You can query and select it on the interface.