• Data Warehouse Service

dws
  1. Help Center
  2. Data Warehouse Service
  3. Developer Guide
  4. Tutorial: Tuning Table Design
  5. Step 3: Selecting Storage and Compression Modes

Step 3: Selecting Storage and Compression Modes

Selecting a Storage Mode

DWS supports hybrid row and column storage. You can create row- or column-store tables as needed in your business scenarios.

Generally, if a table contains many columns (called a wide table) and its query involves only a few columns, column storage is recommended. If a table contains only a few columns and a query includes most of them, row storage is recommended.

Storage Mode

Scenario

Row storage

Point query returned with a few records. This is a simple index-based query.

In this scenario, data is frequently added, modified, and deleted.

Column storage

Statistics analysis queries where tables are frequently grouped and joined.

Ad hoc queries where query conditions are uncertain and no index can be determined for row-store tables.

Sample tables used in this tutorial are typical multi-column TPC-DS tables where many statistical analysis queries are performed. Therefore, the column storage mode is recommended.

WITH (ORIENTATION = column)

Selecting a Compression Level

In scenarios where I/O is large (much data is read and written) and CPU is sufficient (little data is computed), select a high compression ratio. In scenarios where I/O is small and CPU is insufficient, select a low compression ratio. Based on this principle, you are advised to select different compression ratios and test and compare the results to select the optimal compression ratio as required. Specify a compressions ratio using the COMPRESSION parameter. The supported values are as follows:

  • The valid value of column-store tables is YES, NO, LOW, MIDDLE, or HIGH, and the default value is LOW.
  • The valid values of row-store tables are YES and NO, and the default is NO.

The service scenarios applicable to each compression level are described in the following table.

Compression Level

Application Scenario

LOW

The system CPU usage is high and the disk storage space is sufficient.

MIDDLE

The system CPU usage is moderate and the disk storage space is insufficient.

HIGH

The system CPU usage is low and the disk storage space is insufficient.

No compression ratio is specified in Step 1: Creating an Initial Table and Loading Sample Data, and the low compression ratio is selected by DWS by default. Specify COMPRESSION to MIDDLE, and compare the result to that when COMPRESSION is set to LOW.

The following is an example of selecting a storage mode and the MIDDLE compression ratio for a table.

CREATE TABLE store_sales
(
    ss_sold_date_sk           integer                       ,
    ss_sold_time_sk           integer                       ,
    ss_item_sk                integer               not null,
    ss_customer_sk            integer                       ,
    ss_cdemo_sk               integer                       ,
    ss_hdemo_sk               integer                       ,
    ss_addr_sk                integer                       ,
    ss_store_sk               integer                       ,
    ss_promo_sk               integer                       ,
    ss_ticket_number          bigint               not null,
    ss_quantity               integer                       ,
    ss_wholesale_cost         decimal(7,2)                  ,
    ss_list_price             decimal(7,2)                  ,
    ss_sales_price            decimal(7,2)                  ,
    ss_ext_discount_amt       decimal(7,2)                  ,
    ss_ext_sales_price        decimal(7,2)                  ,
    ss_ext_wholesale_cost     decimal(7,2)                  ,
    ss_ext_list_price         decimal(7,2)                  ,
    ss_ext_tax                decimal(7,2)                  ,
    ss_coupon_amt             decimal(7,2)                  ,
    ss_net_paid               decimal(7,2)                  ,
    ss_net_paid_inc_tax       decimal(7,2)                  ,
    ss_net_profit             decimal(7,2)                  
) 
WITH (ORIENTATION = column,COMPRESSION=middle);