(Optional) Creating a DataArts Studio Incremental Package
DataArts Studio provides basic and incremental packages. If the basic package cannot meet your requirements, you can create an incremental package. Before you create an incremental package, ensure that you have a DataArts Studio instance.
DataArts Studio provides the following types of incremental packages:
DataArts Migration incremental package
A CDM cluster migrates data to the cloud and integrates data into the data lake. It provides wizard-based configuration and management, and it can integrate data from a single table or an entire database, either incrementally or periodically. A DataArts Studio instance does not include a CDM cluster. To use CDM functions, you must create a DataArts Migration incremental package.
Background
If you create a DataArts Migration incremental package, the system automatically creates a CDM cluster based on the specifications you select.
If you create a DataArts DataService incremental package, the system automatically creates a DataArts DataService cluster based on the specifications you select.
You can choose More > View Packages to view incremental packages.
Creating a CDM Cluster
Locate an enabled instance and click Create.
On the displayed page, set parameters based on Table 1.
Table 1 Parameters for the CDM incremental package
Parameter
Description
Package
Select DataArts Migration.
AZ
When you create a DataArts Studio instance or incremental package for the first time, there is no requirement on the AZ.
When you create an additional DataArts Studio instance or incremental package, decide whether to use the same AZ as the existing one based on your disaster recovery (DR) and network latency requirements.
If your application requires good DR capability, deploy resources in different AZs in the same region.
If your application requires a low network latency between instances, deploy resources in the same AZ.
Workspace
Select the workspace where the DataArts Migration incremental package will be used. The CDM cluster can only be used in this workspace.
Enterprise Project
If the CDM cluster is associated with multiple workspaces, you need to specify an enterprise project for the CDM cluster.
Cluster
Customize the cluster name.
Instance
The following CDM cluster flavors are available:
cdm.large: 8 vCPUs and 16 GB of memory. The maximum and assured bandwidths are 3 Gbit/s and 0.8 Gbit/s. Up to 16 jobs can be executed concurrently.
cdm.xlarge: 16 vCPUs and 32 GB of memory. The maximum and assured bandwidths are 10 Gbit/s and 4 Gbit/s. Up to 32 jobs can be executed concurrently. This flavor is suitable for migrating terabytes of data that requires a bandwidth of 10GE.
cdm.4xlarge: 64 vCPUs and 128 GB of memory. The maximum and assured bandwidths are 40 Gbit/s and 36 Gbit/s. Up to 128 jobs can be executed concurrently.
The free ECS with 4 vCPUs and 8 GB memory provided by DataArts Studio can run only one job.
VPC
VPC to which the CDM cluster in the DataArts Studio instance belongs. A VPC is a secure, isolated, and logical network environment.
If you want to connect the DataArts Studio instance or CDM cluster to a cloud service (such as DWS, MRS, and RDS), ensure that the CDM cluster can communicate with the cloud service. If the CDM cluster and the cloud service are in the same region, VPC, subnet, and security group, they can communicate with each other by default. If the CDM cluster and the cloud service are in the same VPC but in different subnets or security groups, you must configure routing rules and security group rules.
For details about the operations on VPCs, subnets, and security groups, see Virtual Private Cloud User Guide.
Note
After the CDM instance is created, the VPC cannot be changed.
Subnet
Subnet to which the CDM cluster in the DataArts Studio instance belongs. A subnet provides dedicated network resources that are isolated from other networks, improving network security.
If you want to connect the DataArts Studio instance or CDM cluster to a cloud service (such as DWS, MRS, and RDS), ensure that the CDM cluster can communicate with the cloud service. If the CDM cluster and the cloud service are in the same region, VPC, subnet, and security group, they can communicate with each other by default. If the CDM cluster and the cloud service are in the same VPC but in different subnets or security groups, you must configure routing rules and security group rules.
For details about the operations on VPCs, subnets, and security groups, see Virtual Private Cloud User Guide.
Note
After the CDM instance is created, the subnet cannot be changed.
Security Group
Security group to which the CDM cluster in the DataArts Studio instance belongs. A security group is a set of ECS access rules. It provides access policies for ECSs that have the same security protection requirements and are mutually trusted in a VPC.
If you want to connect the DataArts Studio instance or CDM cluster to a cloud service (such as DWS, MRS, and RDS), ensure that the CDM cluster can communicate with the cloud service. If the CDM cluster and the cloud service are in the same region, VPC, subnet, and security group, they can communicate with each other by default. If the CDM cluster and the cloud service are in the same VPC but in different subnets or security groups, you must configure routing rules and security group rules.
For details about the operations on VPCs, subnets, and security groups, see Virtual Private Cloud User Guide.
Note
After the CDM instance is created, the security group cannot be changed.
IPv6 Dual Stack
If you enable this function, both private IPv4 and IPv6 addresses can be used to access the cluster.
Note
If you enable this function, you can select only subnets that have an IPv6 CIDR block enabled. To use a subnet that does not have an IPv6 CIDR block, first enable one for that subnet in the VPC service.
IPv6 dual stack can be enabled for private IP addresses, but not for public IP addresses.
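The connectivity rule repeated in the VPC, Subnet, and Security Group rows can be summarized as a small decision function. The following Python sketch is illustrative only: the Placement class and the connectivity_action function are hypothetical names, not part of any DataArts Studio or CDM API, and the sketch covers only the two cases that Table 1 describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Network placement of a CDM cluster or a cloud service (hypothetical model)."""
    region: str
    vpc: str
    subnet: str
    security_group: str


def connectivity_action(cdm: Placement, service: Placement) -> str:
    """Return what Table 1 says is needed for the two resources to communicate."""
    same_vpc = cdm.region == service.region and cdm.vpc == service.vpc
    if not same_vpc:
        # Table 1 does not describe this case, so the sketch does not guess.
        return "not covered by Table 1"
    if cdm.subnet == service.subnet and cdm.security_group == service.security_group:
        # Same region, VPC, subnet, and security group.
        return "communicates by default"
    # Same VPC, but a different subnet or security group.
    return "configure routing rules and security group rules"


cdm = Placement("region-a", "vpc-1", "subnet-1", "sg-1")
dws = Placement("region-a", "vpc-1", "subnet-2", "sg-1")
print(connectivity_action(cdm, dws))
```

Because the VPC and security group cannot be changed after creation, it can help to run this kind of check against the target cloud services before you create the package.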
Important
You cannot modify the specifications of an existing cluster. If you need higher specifications, create another cluster.
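Because the specifications are fixed after creation, it is worth sizing the cluster up front. The following Python sketch picks the smallest flavor from the Instance row of Table 1 that satisfies a required number of concurrent jobs and an assured bandwidth. The figures are copied from Table 1; the Flavor type and the smallest_flavor function are hypothetical helpers for illustration, not a DataArts Studio API.

```python
from typing import NamedTuple, Optional


class Flavor(NamedTuple):
    name: str
    vcpus: int
    memory_gb: int
    max_gbps: float        # maximum bandwidth, Gbit/s
    assured_gbps: float    # assured bandwidth, Gbit/s
    max_concurrent_jobs: int


# Values taken from the Instance row of Table 1, ordered from smallest to largest.
FLAVORS = [
    Flavor("cdm.large", 8, 16, 3, 0.8, 16),
    Flavor("cdm.xlarge", 16, 32, 10, 4, 32),
    Flavor("cdm.4xlarge", 64, 128, 40, 36, 128),
]


def smallest_flavor(jobs: int, assured_gbps: float = 0.0) -> Optional[Flavor]:
    """Return the smallest listed flavor that meets both requirements, or None."""
    for flavor in FLAVORS:
        if flavor.max_concurrent_jobs >= jobs and flavor.assured_gbps >= assured_gbps:
            return flavor
    return None  # No single listed flavor is large enough.


choice = smallest_flavor(jobs=20, assured_gbps=2)
print(choice.name if choice else "no single flavor fits")
```

If the function returns None, no single flavor in Table 1 meets the requirement, and the workload would have to be split across more than one cluster.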
Click Create Now, confirm the specifications, and click Next.
View the CDM cluster in the corresponding workspace.