Spark Input
Overview
The Spark Input operator converts specified columns of a SparkSQL table into the same number of input fields.
Input and Output
Input: SparkSQL table columns
Output: fields
Parameters
Parameter | Description | Type | Mandatory | Default Value |
---|---|---|---|---|
Spark database | Name of the SparkSQL database. | String | No | default |
Spark table name | Name of the SparkSQL table. Only one SparkSQL table is supported. | String | Yes | None |
Partition filter | Filter that restricts the export to specific partitions. The parameter is null by default, in which case data of the whole table is exported. For example, to export data from the partitions whose partition field locale has the value CN or US, enter: locale = "CN" or locale = "US" | String | No | |
Input fields of Spark | Configures the input information of SparkSQL. | map | Yes | |
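The selection behavior of the partition filter can be sketched in plain Python. This is only an illustration of the semantics described in the table, not how the operator is implemented; the `partitions` data and the `select_partitions` helper are hypothetical names introduced for this sketch.

```python
# Hypothetical table partitioned by `locale`: partition value -> rows (id, name).
partitions = {
    "CN": [(1, "a"), (2, "b")],
    "US": [(3, "c")],
    "DE": [(4, "d")],
}

def select_partitions(partitions, wanted=None):
    """Return rows from the wanted partitions.

    wanted=None models the default (null) filter: the whole table is exported.
    """
    if wanted is None:
        return [row for rows in partitions.values() for row in rows]
    return [row for key, rows in partitions.items() if key in wanted for row in rows]

# Same intent as the filter: locale = "CN" or locale = "US"
filtered = select_partitions(partitions, wanted={"CN", "US"})

# No filter configured: every partition is read.
everything = select_partitions(partitions)
```

With the filter set, only the rows of the CN and US partitions are selected; without it, all rows are selected.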
Data Processing Rule
If the specified SparkSQL table does not exist, the job fails to be submitted.
If the configured column names are inconsistent with the column names of the SparkSQL table, no data can be read and the number of imported data records is 0.
If a field value does not match the configured type, the whole row is treated as dirty data.
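The three rules above can be sketched as a small simulation. This is a minimal illustration under assumptions, not the operator's actual code: `read_table`, the in-memory `catalog`, and the use of Python types to stand in for field types are all hypothetical.

```python
def read_table(catalog, table, fields):
    """Simulate the processing rules.

    catalog: table name -> (column names, rows)   (hypothetical stand-in for SparkSQL)
    fields:  configured column name -> expected Python type
    Returns (good_records, dirty_records).
    """
    if table not in catalog:
        # Rule 1: the table does not exist, so the job cannot be submitted.
        raise ValueError(f"table {table!r} does not exist; job submission fails")

    columns, rows = catalog[table]
    if not set(fields) <= set(columns):
        # Rule 2: configured column names do not match, so nothing is read.
        return [], []

    good, dirty = [], []
    for row in rows:
        record = dict(zip(columns, row))
        picked = {name: record[name] for name in fields}
        if all(isinstance(picked[name], typ) for name, typ in fields.items()):
            good.append(picked)
        else:
            # Rule 3: one value of the wrong type makes the whole row dirty.
            dirty.append(picked)
    return good, dirty

catalog = {"t1": (["id", "name"], [(1, "x"), ("oops", "y")])}
good, dirty = read_table(catalog, "t1", {"id": int, "name": str})
mismatch = read_table(catalog, "t1", {"ident": int})
```

Here the second row ends up in `dirty` because its `id` is not an integer, and the call with the misspelled column `ident` reads no records at all.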
Example
This example exports data from Spark to SQL Server 2014.
In SQL Server 2014, run the following statement to create an empty table test_1 for storing the SparkSQL data:
create table test_1 (id int, name text, value text);
Configure the Spark Input operator to generate fields A, B, and C.
After the data connector is set, click Automatic Identification. The system will automatically read fields in the database and select required fields for adding. You only need to optimize or modify the fields manually based on service scenarios.
Note
Performing this operation will overwrite existing data in the table.
Use the Table Out operator to export A, B, and C to the test_1 table. After the job has run, query test_1 to check the exported data:
select * from test_1;