Creating and Managing Sequences

Context

A sequence is a database object that generates unique integers according to a certain rule and is usually used to generate primary key values.

You can create a sequence for a column using either of the following methods:
  • Set the data type of the column to a sequence integer type (for example, serial). The database automatically creates a sequence for this column.
  • Run the CREATE SEQUENCE statement to create a sequence, and then set nextval('sequence_name') as the default value of the column.

Procedure

Method 1: Set the data type of a column to a sequence integer. For example:
CREATE TABLE T1
(
    id    serial,
    name  text
);

If the following information is displayed, the table has been created:

CREATE TABLE
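
To illustrate how the serial column behaves, the following minimal sketch inserts rows into T1 without supplying id. The row values are made up for illustration; the id values you get depend on the sequence state and are therefore not shown.

```sql
-- Insert rows without specifying id; the serial column takes its value
-- from the sequence that was created automatically with the table.
INSERT INTO T1 (name) VALUES ('Joe');
INSERT INTO T1 (name) VALUES ('Ann');

-- Check the generated id values.
SELECT id, name FROM T1 ORDER BY id;
```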

Method 2: Create a sequence and set nextval('sequence_name') as the default value of a column. You can cache a specific number of sequence values to reduce the number of requests sent to the GTM, which improves performance.

  1. Create a sequence.
    CREATE SEQUENCE seq1 cache 100;

    If the following information is displayed, the sequence has been created:

    CREATE SEQUENCE
  2. Set nextval('sequence_name') as the default value of a column.
    CREATE TABLE T2 
    ( 
        id   int not null default nextval('seq1'),
        name text
    );

    If the following information is displayed, the table has been created with the sequence as the column default:

    CREATE TABLE
  3. Associate the sequence with a column.

    In this way, the sequence will be deleted when you delete the column or the table where the column resides.

    ALTER SEQUENCE seq1 OWNED BY T2.id;

    If the following information is displayed, the owner has been set:

    ALTER SEQUENCE
NOTE:

After the cache is specified, the sequence may have gaps (for example, the sequence numbers are 1, 4, and 5) and the order of its values cannot be preserved. After a sequence is deleted, its sub-sequences will be deleted automatically. The database does not forbid sharing a sequence among multiple columns, but you are not advised to do so.

Currently, the preceding two methods cannot be used for existing tables.
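
As a rough illustration of Method 2 in use, the following sketch inserts rows into T2 and then drops the table. The row values are made up for illustration, and the DROP TABLE statement is included only to show the effect of OWNED BY; skip it if you want to keep T2.

```sql
-- id is omitted, so each row takes its value from nextval('seq1').
INSERT INTO T2 (name) VALUES ('first');
INSERT INTO T2 (name) VALUES ('second');

-- The ids are unique, but because seq1 caches values they are not
-- guaranteed to be consecutive.
SELECT id, name FROM T2 ORDER BY id;

-- seq1 is owned by T2.id, so dropping the table also drops the sequence.
DROP TABLE T2;
```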

Precautions

Sequence values are generated by the GTM. By default, each request for a sequence value is sent to the GTM, which adds the step to the current value and returns the result. Because the GTM is the only node that can generate sequence values, it can become a performance bottleneck. Therefore, you are not advised to use sequences when sequence values need to be generated frequently (for example, when using BulkLoad to import data). For example, the INSERT INTO ... SELECT statement performs poorly in the following scenario:

CREATE SEQUENCE newSeq1;
CREATE TABLE newT1
(
    id   int not null default nextval('newSeq1'),
    name text
);
INSERT INTO newT1(name) SELECT name from T1;

To improve the performance, run the following statements (assume that data of 10,000 rows will be imported from T1 to newT1):

INSERT INTO newT1(id, name) SELECT id,name from T1;
SELECT SETVAL('newSeq1',10000);
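
The value 10000 above relies on the assumption that the imported id values end at 10000. If that is not known in advance, one possible approach is to look up the largest imported id first and use it in SETVAL. The following is a sketch under that assumption, not a prescribed procedure; the inserted row value is made up for illustration.

```sql
-- Find the largest id that was copied into newT1.
SELECT max(id) FROM newT1;

-- Use the value returned above (10000 in the scenario described) so that
-- later nextval('newSeq1') calls continue after the imported ids.
SELECT SETVAL('newSeq1', 10000);

-- A row inserted without an id now takes its value from the sequence.
INSERT INTO newT1 (name) VALUES ('after import');
```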
NOTE:

The sequence functions nextval() and setval() do not support rollback. A value set by setval always takes effect immediately for nextval in the current session. In other sessions, it takes effect immediately only if no cache is specified; if a cache is specified, it takes effect only after all the cached values have been used. To avoid duplicate values, use setval only when necessary, and do not set it to an existing sequence value or a cached sequence value.

If BulkLoad is used, set a sufficient cache for newSeq1 and do not set Maxvalue or Minvalue. To improve performance, the database may push down the invocation of nextval('sequence_name') to DNs. Currently, the number of concurrent connection requests that the GTM can process is limited. If there are too many DNs, a large number of concurrent connection requests will be sent to the GTM. In this case, limit the number of concurrent BulkLoad connections to save GTM connection resources. If the table is in REPLICATION mode, the invocation cannot be pushed down and the database may break down. In addition, the database space may be exhausted, so run VACUUM FULL after the import is complete. For these reasons, you are not advised to use sequences when BulkLoad is used.
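
If a sequence must be used with BulkLoad anyway, the cache advice above might look like the following sketch. The names bulk_seq and bulk_t are hypothetical, and the cache size of 10000 is only an illustrative value; choose one that suits the volume of data being imported.

```sql
-- bulk_seq is a hypothetical name; CACHE 10000 is an illustrative size.
-- No MAXVALUE or MINVALUE is specified, as advised above.
CREATE SEQUENCE bulk_seq CACHE 10000;

CREATE TABLE bulk_t
(
    id   int not null default nextval('bulk_seq'),
    name text
);

-- After the import is complete, reclaim space as recommended above.
VACUUM FULL bulk_t;
```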

After a sequence is created, a single-row table is maintained on each node to store the sequence definition and value, which is obtained from the last interaction with the GTM rather than updated in real time. The single-row table on a node does not update when other nodes request a new value from the GTM or when the sequence is modified using setval.