You can use DWS through the DWS management console, the DWS client, or REST APIs. This section describes the main functions of DWS.
A data warehouse cluster contains nodes with the same flavor in the same subnet. These nodes jointly provide services. DWS provides a professional, efficient, and centralized management console, allowing you to quickly apply for clusters, easily manage data warehouses, and focus on data and services.
Main functions of cluster management are described as follows:
To create a cluster quickly, specify the node type and node quantity based on service requirements. You can also purchase prepaid nodes and then create a cluster.
A snapshot is a complete backup that records the configuration data and service data of a data warehouse cluster at a specific point in time. You can use a snapshot to restore a cluster to its state at that point in time. You can create snapshots manually or enable automatic snapshot creation, which takes snapshots periodically. Automatic snapshots have a limited retention period. To retain one for longer, copy it to generate a manual snapshot.
When you restore a cluster from a snapshot, the system creates a new cluster with the same flavor and node quantity as the original one, and imports the snapshot data.
You can delete snapshots that are no longer needed to release the storage space.
As the service volume increases, the current scale of a cluster may not meet service requirements. In this case, you can scale out the cluster by adding compute nodes to it. Services are not interrupted during the scale-out.
Restarting a cluster may cause data loss in running services. If you have to restart a cluster, ensure that there is no running service and all data has been saved.
You can delete a cluster when you do not need it. Deleting a cluster is risky and may cause data loss. Therefore, exercise caution when performing this operation.
DWS allows you to manage clusters and snapshots in either of the following ways:
Use the management console to access data warehouse clusters. After you have registered a public cloud account, log in to the management console and choose Data Warehouse Service.
For more information about cluster management, see section Managing Clusters.
Use REST APIs provided by DWS to manage clusters. If you need to integrate DWS into a third-party system for secondary development, use APIs to access the service.
For details, see the Data Warehouse Service API Reference.
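As an illustration of API-based integration, the following is a minimal Python sketch that prepares and sends an authenticated HTTPS request using only the standard library. The endpoint, the resource path, and the `X-Auth-Token` header name are assumptions made for this example and are not taken from this section. Take the real values from the Data Warehouse Service API Reference.

```python
import json
import urllib.request


def build_api_request(endpoint: str, path: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request to a management API.

    endpoint, path, and the X-Auth-Token header are illustrative
    assumptions; the API reference defines the real ones.
    """
    url = f"https://{endpoint}/{path.lstrip('/')}"
    return urllib.request.Request(
        url,
        method="GET",
        headers={"X-Auth-Token": token, "Accept": "application/json"},
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending lets a third-party system log or unit-test the request before any network call is made.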
After a data warehouse cluster is created, you can use the SQL client to connect to the cluster and perform operations such as creating a database, managing the database, importing and exporting data, and querying data.
DWS provides petabyte-level (PB-level) high-performance databases with the following features:
DWS has comprehensive SQL capabilities:
For details about SQL syntax and database operations, download the Data Warehouse Service Database Developer Guide.
DWS supports efficient data import from multiple data sources. The following lists typical data import modes. For details, see section "Data Import" in the Data Warehouse Service Database Developer Guide.
In addition, DWS supports data import using mainstream third-party ETL tools.
You can access databases in clusters through standard interfaces such as Java Database Connectivity (JDBC) and Open Database Connectivity (ODBC), or from Python through a third-party driver such as psycopg2.
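For the Python route, a minimal sketch of connecting with psycopg2 and running one query might look like the following. The helper names `connection_params` and `run_query` are hypothetical, and the host, port, database name, and credentials are placeholders that you must replace with your own cluster's values. psycopg2 has to be installed separately.

```python
from typing import Any, Dict, List, Tuple


def connection_params(host: str, port: int, dbname: str,
                      user: str, password: str) -> Dict[str, Any]:
    """Collect the keyword arguments accepted by psycopg2.connect()."""
    return {
        "host": host,
        "port": port,
        "dbname": dbname,
        "user": user,
        "password": password,
    }


def run_query(params: Dict[str, Any], sql: str) -> List[Tuple[Any, ...]]:
    """Open a connection, run one query, and return all result rows."""
    # Imported here so that connection_params() works even when the
    # third-party driver is not installed.
    import psycopg2

    conn = psycopg2.connect(**params)
    try:
        with conn.cursor() as cur:
            cur.execute(sql)
            return cur.fetchall()
    finally:
        # Always release the connection, even if the query fails.
        conn.close()
```

For example, `run_query(connection_params("dws.example.com", 8000, "gaussdb", "dbadmin", "your-password"), "SELECT 1")` would run a trivial query against a cluster reachable at that placeholder address.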
DWS integrates with Cloud Eye, allowing you to monitor compute nodes and databases in the cluster in real time. For details, see section Monitoring a Cluster.
DWS provides the following self-developed tools, which you can download from the DWS management console. For details about how to use the tools, see the Data Warehouse Service Tool Guide.
gsql is a command line SQL client tool running on the Linux OS. It is used to connect to the database in a data warehouse cluster and operate and maintain the database.
Data Studio is a graphical user interface (GUI) SQL client tool running on the Windows OS. You can use it to connect to the database in a data warehouse cluster, manage the database and its objects, edit, run, and debug SQL scripts, and view execution plans.
GDS is a data service tool provided by DWS. It works with the foreign table mechanism to implement high-speed data import and export.
The GDS tool package needs to be installed on the server where the data source file is located. This server is called the data server or the GDS server.