Supported Data Sources¶
CDM provides the following migration modes which support different data sources:
Table/File migration in the import of data into a data lake or migration of data to the cloud. For details, see Data Sources Supported by Table/File Migration.
Entire DB migration in the import of data into a data lake or migration of data to the cloud. For details, see Supported Data Sources in Entire DB Migration.
In addition, this section provides the data types supported in database migration. For details, see Data Types Supported in Open-Source MySQL Database Migration, Data Types Supported in Oracle Database Migration, and Data Types Supported in SQL Server Database Migration.
Data Sources Supported by Table/File Migration¶
Table/File migration can migrate data in tables or files.
Table 1 describes the supported data sources.
Category | Data Source | Read | Write | Data Source That Can Be Written During Data Read | Description |
---|---|---|---|---|---|
Data warehouse | GaussDB(DWS) | Supported | Supported |
| The DWS physical machine management mode is not supported. |
Data Lake Insight (DLI) | Supported | Supported |
| ||
Hadoop | MRS HDFS | Supported | Supported |
|
|
MRS HBase | Supported | Supported | |||
MRS Hive | Supported | Supported | |||
FusionInsight HDFS | Supported | Not supported |
|
| |
FusionInsight HBase | Supported | Not supported | |||
FusionInsight Hive | Supported | Not supported | |||
Apache HBase | Supported | Not supported |
|
| |
Apache Hive | Supported | Not supported | |||
Apache HDFS | Supported | Not supported | |||
Object storage | Object Storage Service (OBS) | Supported | Supported |
| Object Storage Migration Service (OMS) is recommended for migration between object storage services. |
File system | FTP | Supported | Not supported |
| obsutil is recommended for migrating data from file systems to OBS. For details, see . |
SFTP | Supported | Not supported | |||
HTTP | Supported | Not supported | Hadoop: MRS HDFS |
| |
Relational database | RDS for MySQL | Supported | Supported |
|
|
RDS for PostgreSQL | Supported | Supported | |||
RDS for SQL Server | Supported | Supported | |||
MySQL | Supported | Supported |
| ||
PostgreSQL | Supported | Supported | |||
Microsoft SQL Server | Supported | Supported | |||
Oracle | Supported | Supported | |||
SAP HANA | Supported | Not supported |
| SAP HANA data sources have the following restrictions:
| |
Database sharding | Supported | Not supported |
|
| |
NoSQL | Distributed Cache Service (DCS) | Supported | Not supported | Hadoop: MRS HDFS, MRS HBase, and MRS Hive | |
Redis | Supported | Not supported | |||
Document Database Service (DDS) | Supported | Not supported |
| ||
MongoDB | Supported | Not supported |
| ||
CloudTable | Supported | Supported |
|
| |
Cassandra | Supported | Not supported |
|
| |
Message system | Apache Kafka | Supported | Not supported | Search: Cloud Search Service (CSS) |
|
DMS Kafka | Supported | Not supported | |||
MRS Kafka | Supported | Not supported |
|
| |
Search | Elasticsearch | Supported | Supported |
| Only the non-security mode is supported. |
Cloud Search Service (CSS) | Supported | Supported | You are advised to use Logstash to import data to CSS. |
Note
In the preceding table, the non-cloud data sources, such as MySQL, include on-premises MySQL, MySQL built on ECSs, or MySQL on the third-party cloud.
Supported Data Sources in Entire DB Migration¶
Entire DB migration is used when an on-premises data center or a database created on an ECS needs to be synchronized to a database service or big data service on the cloud. It is suitable for offline database migration but not online real-time migration.
Table 2 lists the data sources supporting entire DB migration using CDM.
Category | Data Source | Read | Write | Description |
---|---|---|---|---|
Data warehouse | Data Warehouse Service (DWS) | Supported | Supported |
|
FusionInsight LibrA | Supported | Not supported |
| |
Hadoop (available only for local storage, and not for storage-compute decoupling, Ranger, or ZooKeeper for which SSL is enabled) | MRS HBase | Supported | Supported | Entire DB migration only to MRS HBase Recommended versions:
|
MRS Hive | Supported | Supported | Entire DB migration only to a relational database 2.x versions are not supported. The following versions are recommended:
| |
FusionInsight HBase | Supported | Not supported | Recommended versions:
| |
FusionInsight Hive | Supported | Not supported | Entire DB migration only to a relational database 2.x versions are not supported. The following versions are recommended:
| |
Apache HBase | Supported | Not supported | Recommended versions:
| |
Apache Hive | Supported | Not supported | Entire DB migration only to a relational database 2.x versions are not supported. The following versions are recommended:
| |
Relational database | RDS for MySQL | Supported | Supported | Migration from OLTP to OLTP is not supported. In this scenario, you are advised to use the Data Replication Service (DRS). |
RDS for PostgreSQL | Supported | Supported | ||
RDS for SQL Server | Supported | Supported | ||
MySQL | Supported | Not supported | ||
PostgreSQL | Supported | Not supported | ||
Microsoft SQL Server | Supported | Not supported | ||
Oracle | Supported | Not supported | ||
IBM Db2 | Supported | Not supported | ||
SAP HANA | Supported | Not supported |
| |
MyCAT | Supported | Not supported |
| |
Dameng database | Supported | Not supported | Only to DWS and Hive | |
NoSQL | Distributed Cache Service (DCS) | Not supported | Supported | Only migration from MRS to DCS is supported. |
Document Database Service (DDS) | Supported | Supported | Only migration between DDS and MRS is supported. | |
CloudTable Service (CloudTable) | Supported | Supported |
|
Data Types Supported in Open-Source MySQL Database Migration¶
When the source end is an open-source MySQL database and the destination end is a Hive or DWS database, the following data types are supported:
Category | Type | Description | Storage Format Example | Hive | DWS |
---|---|---|---|---|---|
Character string | CHAR(M) | A fixed-length string of 1 to 255 characters, for example, CHAR(5). The length limit is not mandatory. It is set to 1 by default. | 'a' or 'aaaaa' | CHAR | CHAR |
VARCHAR(M) | A variable-length string consists of 1 to 255 characters (more than 255 characters for MySQL of a later version). Example: VARCHAR(25). When creating a field of the VARCHAR type, you must define the length. | 'a' or 'aaaaa' | VARCHAR | VARCHAR | |
Value | DECIMAL(M,D) | Uncompressed floating-point numbers cannot be unsigned. In unpacking decimals, each decimal corresponds to a byte. Defining the number of display lengths (M) and decimals (D) is required. NUMERIC is the synonym of DECIMAL. | 52.36 | DECIMAL | When D is 0, it corresponds to BIGINT. When D is not 0, it corresponds to NUMBERIC. |
NUMBERIC | Same as DECIMAL |
| DECIMAL | NUMBERIC | |
INTEGER | An integer of normal size that can be signed. If the value is signed, it ranges from -2147483648 to 2147483647. If the value is unsigned, the value ranges from 0 to 4294967295. Up to 11-bit width can be specified. | 5236 | INT | INTEGER | |
INTEGER UNSIGNED | Unsigned form of INTEGER |
| BIGINT | INTEGER | |
INT | Same as INTEGER | 5236 | INT | INTEGER | |
INT UNSIGNED | Same as INTEGER UNSIGNED |
| BIGINT | INTEGER | |
BIGINT | A large integer that can be signed. If the value is signed, it ranges from -9223372036854775808 to 9223372036854775807. If the value is unsigned, the value ranges from 0 to 18446744073709551615. Up to 20-bit width can be specified. | 5236 | BIGINT | BIGINT | |
BIGINT UNSIGNED | Unsigned form of BIGINT |
| BIGINT | BIGINT | |
MEDIUMINT | A medium-sized integer that can be signed. If the value is signed, it ranges from -8388608 to 8388607. If the value is unsigned, it ranges from 0 to 16777215, and you can specify a maximum of 9-bit width. | -128, 127 | INT | INTEGER | |
MEDIUMINT UNSIGNED | Unsigned form of MEDIUMINT |
| BIGINT | INTEGER | |
TINYINT | A very small integer that can be signed. If signed, the value ranges from -128 to 127. If unsigned, the value ranges from 0 to 255, and you can specify a maximum of 4-bit width. | 100 | TINYINT | SMALLINT | |
TINYINT UNSIGNED | Unsigned form of TINYINT |
| TINYINT | SMALLINT | |
BOOL | The bool of MySQL is tinyint(1). | -128, 127 | SMALLINT | BYTEA | |
SMALLINT | A small integer that can be signed. If the value is signed, it ranges from -32768 to 32767. If unsigned, the value ranges from 0 to 65535, and you can specify a maximum of 5-bit width. | 9999 | SMALLINT | SMALLINT | |
SMALLINT UNSIGNED | Unsigned form of SMALLINT |
| INT | SMALLINT | |
REAL | Same as DOUBLE |
| DOUBLE |
| |
FLOAT(M,D) | Unsigned floating-point numbers cannot be used. The display length (M) and number of decimal places (D) can be specified. This is not mandatory, and the default value is 10,2. In the preceding information, 2 indicates the number of decimal places and 10 indicates the total number of digits (including decimal places). The decimal precision can reach 24 floating points. | 52.36 | FLOAT | FLOAT4 | |
DOUBLE(M,D) | Unsigned double-precision floating-point numbers cannot be used. The display length (M) and number of decimal places (D) can be specified. This is not mandatory. The default value is 16,4, where 4 is the number of decimal places. The decimal precision can reach 53-digit. REAL is a synonym of DOUBLE. | 52.36 | DOUBLE | FLOAT8 | |
DOUBLE PRECISION | Similar to DOUBLE | 52.3 | DOUBLE | FLOAT8 | |
Bit | BIT(M) | Stored bit type value. BIT(M) can store up to M bits of values, and M ranges from 1 to 64. | B'1111100' B'1100' | TINYINT | BYTEA |
Time and date | DATE | The value is in the YYYY-MM-DD format and ranges from 1000-01-01 to 9999-12-31. For example, December 30, 1973 will be stored as 1973-12-30. | 1999-10-01 | DATE | TIMESTAMP |
TIME | Stores information about the hour, minute, and second. | '09:10:21' or '9:10:21' | Not supported (string) | TIME | |
DATETIME | The date and time are in the YYYY-MM-DD HH:MM:SS format and range from 1000-01-01 00:00:00 to 9999-12-31 23:59:59. For example, 3:30 p.m. on December 30, 1973 will be stored as 1973-12-30 15:30:00. | '1973-12-30 15:30:00' | TIMESTAMP | TIMESTAMP | |
TIMESTAMP | Timestamp type. Timestamp between midnight on January 1, 1970 and a time point in 2037. Similar to the DATETIME format (YYYYMMDDHHMMSS), except that no hyphen is required. For example, 3:30 p.m. December 30, 1973 will be stored as 19731230153000. | 19731230153000 | TIMESTAMP | TIMESTAMP | |
YEAR(M) | The year is stored in 2-digit or 4-digit number format. If the length is specified as 2 (for example, YEAR(2)), the year ranges from 1970 to 2069 (70 to 69). If the length is specified as 4, the year ranges from 1901 to 2155. The default length is 4. | 2000 | Not supported (string) | Not supported | |
Multimedia (binary) | BINARY(M) | The number of bytes is M. The length of a variable-length binary string ranges from 0 to M. M is the value length plus 1. | 0x2A3B4058 (binary data) | Not supported | BYTEA |
VARBINARY(M) | The number of bytes is M. A fixed binary string with a length of 0 to M. | 0x2A3B4059 (binary data) | Not supported | BYTEA | |
TEXT | The maximum length of the field is 65535 characters. TEXT is a "binary large object" and is used to store large binary data, such as images or other types of files. | 0x5236 (binary data) | Not supported | Not supported | |
TINYTEXT | A binary string of 0 to 255 bytes in short text |
|
| Not supported | |
MEDIUMTEXT | A binary string of 0 to 167772154 bytes in medium-length text |
|
| Not supported | |
LONGTEXT | A binary string of 0 to 4294967295 bytes in large-length text |
|
| Not supported | |
BLOB | The maximum length of the field is 65535 characters. BLOB is a "binary large object" and is used to store large binary data, such as images or other types of files. BLOB is case-sensitive. | 0x5236 (binary data) | Not supported | BYTEA | |
TINYBLOB | A binary string of 0 to 255 bytes in short text |
|
| BYTEA | |
MEDIUMBLOB | A binary string of 0 to 167772154 bytes in medium-length text |
|
| BYTEA | |
LONGBLOB | A binary string of 0 to 4294967295 bytes in large-length text | 0x5236 (binary data) | Not supported | BYTEA | |
Special type | SET | SET is a string object that can have no or multiple values. The values come from the allowed column of values specified when the table is created. When specifying the SET column values that contain multiple SET members, separate the members with commas (,). The SET member value cannot contain commas (,). |
|
| Not supported |
JSON |
|
| Not supported | Not supported (TEXT) | |
ENUM | When an ENUM is defined, a list of its values is created, which are the items that must be used for selection (or NULL). For example, if you want a field to contain "A", "B", or "C", you can define an ENUM ("A", "B", or "C"). Only these values (or NULL) can be used to fill in the field. |
| Not supported | Not supported |
Data Types Supported in Oracle Database Migration¶
When the source end is an Oracle database and the destination end is a Hive or DWS database, the following data sources are supported:
Category | Type | Description | Hive | DWS |
---|---|---|---|---|
Character string | char | Fixed-length character string, which is padded with spaces to reach the maximum length. | CHAR | CHAR |
nchar | Fixed-length character string contains data in Unicode format. | CHAR | CHAR | |
varchar2 | Synonym of VARCHAR. It is a variable-length string, unlike the CHAR type, which does not pad the field or variable to reach its maximum length with spaces. | VARCHAR | VARCHAR | |
nvarchar2 | Variable-length character string contains data in Unicode format. | VARCHAR | VARCHAR | |
Value | number | Stores numbers with a precision of up to 38 digits. | DECIMAL | NUMERIC |
binary_float | 2-bit single-precision floating point number | FLOAT | FLOAT8 | |
binary_double | 64-bit double-precision floating point number | DOUBLE | FLOAT8 | |
long | A maximum of 2 GB character data can be stored. | Not supported | Not supported | |
Time and date | date | 7-byte date/time data type, including seven attributes: century, year in the century, month, day in the month, hour, minute, and second. | DATE | TIMESTAMP |
timestamp | 7-byte or 11-byte fixed-width date/time data type that contains decimals (seconds) | TIMESTAMP | TIMESTAMP | |
timestamp with time zone | 3-byte timestamp, which supports the time zone. | TIMESTAMP | TIME WITH TIME ZONE | |
timestamp with local time zone | 7-byte or 11-byte fixed-width date/time data type. Time zone conversion occurs when data is inserted or read. | TIMESTAMP | Not supported (TEXT) | |
interval year to month | 5-byte fixed-width data type, which is used to store a time segment. | Not supported | Not supported (TEXT) | |
interval day to second | 11-byte fixed-width data type, which is used to store a time segment. The time segment is stored in days/hours/minutes/seconds. The value can also contain nine decimal places (seconds). | Not supported | Not supported (TEXT) | |
Multimedia (binary) | raw | A variable-length binary data type. Character set conversion is not performed for data stored in this data type. | Not supported | Not supported |
long raw | Stores up to 2 GB binary information. | Not supported | Not supported | |
blob | A maximum of 4 GB data can be stored. | Not supported | Not supported | |
clob | In Oracle 10g and later versions, a maximum of (4 GB) x (database block size) bytes of data can be stored. CLOB contains the information for which character set conversion is to be performed. This data type is ideal for storing plain text information. | Not supported | Not supported | |
nclob | This type can store a maximum of 4 GB data. When the character set is converted, this type is affected. | Not supported | Not supported | |
bfile | An Oracle directory object and a file name can be stored in the database column, and the file can be read through the Oracle directory object and file name. | Not supported | Not supported | |
Others | rowid | In fact, it is the address of a row in the database table. It is 10 bytes long. | Not supported | Not supported |
urowid | It is a common row ID and does not have a fixed rowid table. | Not supported | Not supported |
Data Types Supported in SQL Server Database Migration¶
When the source end is a SQL Server database and the destination end is a Hive, Oracle or DWS database, the following data sources are supported:
Category | Type | Description | Hive | DWS | Oracle |
---|---|---|---|---|---|
String data type | char | Fixed-length character string, which is padded with spaces to reach the maximum length. | CHAR | CHAR | CHAR |
nchar | Fixed-length character string contains data in Unicode format. | CHAR | CHAR | CHAR | |
varchar | A variable-length string consists of 1 to 255 characters (more than 255 characters for MySQL of a later version). Example: VARCHAR(25). When creating a field of the VARCHAR type, you must define the length. | VARCHAR | VARCHAR | VARCHAR | |
nvarchar | Stores variable-length Unicode character data, similar to varchar. | VARCHAR | VARCHAR | VARCHAR | |
Numeric data type | int | int is stored in four bytes, where one binary bit represents a sign bit, and the other 31 binary bits represent a length and a size, and may represent all integers ranging from -231 to 231 - 1. | INT | INTEGER | INT |
bigint | bigint is stored in eight bytes, where one binary bit represents a sign bit, and the other 63 binary bits represent a length and a size, and may represent all integers ranging from -263 to 263 - 1. | BIGINT | BIGINT | NUMBER | |
smallint | Data of the smallint type occupies two bytes of storage space. One binary bit indicates a positive or negative sign of an integer value, and the other 15 binary bits indicate a length and a size, and may represent all integers ranging from -215 to 215. | SMALLINT | SMALLINT | NUMBER | |
tinyint | Tinyint data occupies one byte of storage space and can represent all integers ranging from 0 to 255. | TINYINT | TINYINT | NUMBER | |
real | The value can be a positive or negative decimal number. | DOUBLE | FLOAT4 | NUMBER | |
float | The number of digits (in scientific notation) of the mantissa of a float value, which determines the precision and storage size | FLOAT | FLOAT8 | binary_float | |
decimal | Numeric data type with fixed precision and scale | DECIMAL | NUMERIC | NUMBER | |
numeric | Stores zero, positive, and negative fixed point numbers. | DECIMAL | NUMERIC | NUMBER | |
Date and time data type | date | Stores date data represented by strings. | DATE | TIMESTAMP | DATE |
time | Time of a day, which is recorded in the form of a character string. | Not supported (string) | TIME | Not supported | |
datetime | Stores time and date data. | TIMESTAMP | TIMESTAMP | Not supported | |
datetime2 | Extended type of datetime, which has a larger data range. By default, the minimum precision is the highest, and the user-defined precision is optional. | TIMESTAMP | TIMESTAMP | Not supported | |
smalldatetime | The smalldatetime type is similar to the datetime type. The difference is that the smalldatetime type stores data from January 1, 1900 to June 6, 2079. When the date and time precision is low, the smalldatetime type can be used. Data of this type occupies 4-byte storage space. | TIMESTAMP | TIMESTAMP | Not supported | |
timestamp | Timestamp data type | TIMESTAMP | TIMESTAMP | TIMESTAMP | |
datetimeoffset | A time that uses the 24-hour clock and combined with date and the time zone. | Not supported (string) | TIMESTAMP | Not supported | |
Multimedia data types (binary) | text | Stores text data. | Not supported (string) | Not supported (string) | Not supported |
netxt | The function of this type is the same as that of the text type. It is non-Unicode data with variable length. | Not supported (string) | Not supported (string) | Not supported | |
image | Variable-length binary data used to store pictures, catalog pictures, or paintings. | Not supported (string) | Not supported (string) | Not supported | |
binary | Binary data with a fixed length of n bytes, where n ranges from 1 to 8,000. | Not supported (string) | Not supported (string) | Not supported | |
varbinary | Variable-length binary data | Not supported (string) | Not supported (string) | Not supported | |
Currency data type | money | Stores currency values. | Not supported (string) | Not supported (string) | Not supported |
smallmoney | Similar to the money type, a currency symbol is prefixed to the input data. For example, the currency symbol of CNY is ¥. | Not supported (string) | Not supported (string) | Not supported | |
Data type | bit | Bit data type. The value is 0 or 1. The length is 1 byte. A bit value is often used as a logical value to determine whether it is true(1) or false(0). If a non-zero value is entered, the system replaces it with 1. | Not supported | Not supported | Not supported |
Other data types | rowversion | Each piece of data has a counter. The value of the counter increases when an insert or update operation is performed on a table that contains the rowversion column in the database. | Not supported | Not supported | Not supported |
uniqueidentifier | A 16-byte globally unique identifier (GUID) is a unique number generated by the SQL Server based on the network adapter address and host CPU clock. Each GUID is a hexadecimal number ranging from 0 to 9 or a to f. | Not supported | Not supported | Not supported | |
cursor | Cursor data type | Not supported | Not supported | Not supported | |
sql_variant | Stores any valid SQL Server data except the text, image, and timestamp data, which facilitates the development of the SQL Server. | Not supported | Not supported | Not supported | |
table | Stores the result set after a table or view is processed. | Not supported | Not supported | Not supported | |
xml | Data type of the XML data. XML instances can be stored in columns or variables of the XML type. The stored XML instance size cannot exceed 2 GB. | Not supported | Not supported | Not supported |