# Common Sqoop Commands and Parameters

## Common Sqoop commands
| Command | Description |
|---|---|
| import | Imports data to a cluster. |
| export | Exports data from a cluster. |
| codegen | Obtains data from a table in the database to generate a Java file and compresses the file. |
| create-hive-table | Creates a Hive table. |
| eval | Executes a SQL statement and displays the result. |
| import-all-tables | Imports all tables in a database to HDFS. |
| job | Generates a Sqoop job. |
| list-databases | Lists database names. |
| list-tables | Lists table names. |
| merge | Merges data in different HDFS directories and saves the data to a specified directory. |
| metastore | Starts the metadata database to record the metadata of Sqoop jobs. |
| help | Prints help information. |
| version | Prints version information. |
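The `import` and `export` commands above are the most frequently used. A minimal sketch of a round trip follows; the JDBC URL, credentials, table names, and HDFS paths are placeholders, not values from this document:

```shell
# Hypothetical example: import a MySQL table into HDFS, then export it back
# to a different table. Adjust the connection details for your environment.
sqoop import \
  --connect jdbc:mysql://db.example.com:3306/sales \
  --username sqoop_user \
  --password-file /user/sqoop/.pw \
  --table orders \
  --target-dir /user/sqoop/orders \
  -m 4

sqoop export \
  --connect jdbc:mysql://db.example.com:3306/sales \
  --username sqoop_user \
  --password-file /user/sqoop/.pw \
  --table orders_backup \
  --export-dir /user/sqoop/orders \
  --input-fields-terminated-by ','
```

Note that `--export-dir` must point at data whose delimiters match the `--input-fields-terminated-by` setting, and the destination table (`orders_backup` here) must already exist in the database.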
## Common Parameters
| Category | Parameter | Description |
|---|---|---|
| Parameters for database connection | --connect | Specifies the JDBC URL for connecting to a relational database. |
| | --connection-manager | Specifies the connection manager class. |
| | --driver | Specifies the JDBC driver class for the database connection. |
| | --help | Prints help information. |
| | --password | Specifies the password for connecting to the database. |
| | --username | Specifies the username for connecting to the database. |
| | --verbose | Prints detailed information on the console. |
| import parameters | --fields-terminated-by | Specifies the field delimiter, which must be the same as that in the Hive table or HDFS file. |
| | --lines-terminated-by | Specifies the line delimiter, which must be the same as that in the Hive table or HDFS file. |
| | --mysql-delimiters | Uses the default delimiter settings of MySQL. |
| export parameters | --input-fields-terminated-by | Specifies the field delimiter. |
| | --input-lines-terminated-by | Specifies the line delimiter. |
| Hive parameters | --hive-delims-replacement | Replaces characters such as \r and \n in data with user-defined characters. |
| | --hive-drop-import-delims | Removes characters such as \r and \n when data is imported to Hive. |
| | --map-column-hive | Specifies the data type of fields during the generation of a Hive table. |
| | --hive-partition-key | Creates a partition. |
| | --hive-partition-value | Imports data to a specified partition. |
| | --hive-home | Specifies the installation directory of Hive. |
| | --hive-import | Specifies that data is imported from a relational database to Hive. |
| | --hive-overwrite | Overwrites existing Hive data. |
| | --create-hive-table | Creates the destination Hive table if it does not exist. The default value is false. |
| | --hive-table | Specifies the Hive table to which data is to be imported. |
| | --table | Specifies the relational database table. |
| | --columns | Specifies the fields of the relational database table to be imported. |
| | --query | Specifies the query statement whose result is to be imported. |
| HCatalog parameters | --hcatalog-database | Specifies a Hive database and imports data to it using HCatalog. |
| | --hcatalog-table | Specifies a Hive table and imports data to it using HCatalog. |
| Others | -m or --num-mappers | Specifies the number of map tasks used by a Sqoop job. |
| | --split-by | Specifies the column based on which Sqoop splits work units. This parameter is used together with -m. |
| | --target-dir | Specifies the temporary directory in HDFS. |
| | --null-string | Specifies the string to be written for a NULL value in string columns. |
| | --null-non-string | Specifies the string to be written for a NULL value in non-string columns. |
| | --check-column | Specifies the column for determining incremental data import. |
| | --incremental append or lastmodified | Incrementally imports data. append: appends records, for example, those whose check-column value is greater than the value specified by --last-value. lastmodified: imports data modified after the date specified by --last-value. |
| | --last-value | Specifies the maximum value of the check column from the previous import. |
| | --input-null-string | Specifies the string to be interpreted as NULL for string columns. |
| | --input-null-non-string | Specifies the string to be interpreted as NULL for non-string columns. If this parameter is not specified, NULL will be used. |
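The Hive and incremental-import parameters above are typically combined as sketched below. This is a hypothetical example; the connection URL, credentials, table names, check column, and `--last-value` are placeholders:

```shell
# Hypothetical example 1: import a MySQL table into Hive, stripping embedded
# delimiters and writing \N for NULL values so Hive reads them correctly.
sqoop import \
  --connect jdbc:mysql://db.example.com:3306/sales \
  --username sqoop_user \
  --password-file /user/sqoop/.pw \
  --table orders \
  --hive-import \
  --hive-table default.orders \
  --hive-drop-import-delims \
  --null-string '\\N' \
  --null-non-string '\\N' \
  -m 4

# Hypothetical example 2: incrementally append only the rows whose order_id
# is greater than the value saved from the previous run (--last-value).
sqoop import \
  --connect jdbc:mysql://db.example.com:3306/sales \
  --username sqoop_user \
  --password-file /user/sqoop/.pw \
  --table orders \
  --target-dir /user/sqoop/orders \
  --check-column order_id \
  --incremental append \
  --last-value 10000
```

After an incremental run, Sqoop prints the new maximum value of the check column; pass it as `--last-value` in the next run, or use the `job` command so Sqoop's metastore tracks it automatically.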