Configuring Time Models

Flink provides two time models: processing time and event time.

DLI lets you specify the time model when you create a source stream or a temporary stream.

Configuring Processing Time

Processing time is the system time at which the data is processed. It is independent of any timestamp carried in the data.

Syntax

CREATE SOURCE STREAM stream_name(...) WITH (...)
TIMESTAMP BY proctime.proctime;
CREATE TEMP STREAM stream_name(...)
TIMESTAMP BY proctime.proctime;

Description

To use processing time, add proctime.proctime after TIMESTAMP BY. You can then reference the proctime field directly in later statements.
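As a rough illustration of why processing time is independent of the data timestamp, the following Python sketch stamps each record with a clock reading taken at the moment the record is handled. The helper name attach_proctime, the record layout, and the sample values are hypothetical and are not part of DLI. DLI's internal behavior may differ.

```python
import time


def attach_proctime(records, clock=time.time):
    """Attach a processing time to each record (hypothetical helper, not a DLI API).

    The stamp comes from `clock` at the moment the record is handled,
    so any timestamp stored in the record itself is ignored.
    """
    stamped = []
    for record in records:
        stamped.append({**record, "proctime": clock()})
    return stamped


# Made-up records that carry their own timestamps, deliberately out of order.
records = [
    {"student_name": "Alice", "score": 95, "time2": 300},
    {"student_name": "Bob", "score": 88, "time2": 100},
]

# A fake clock keeps the example deterministic: it returns 1000, then 1001.
ticks = iter(range(1000, 1002))
stamped = attach_proctime(records, clock=lambda: next(ticks))
```

In this sketch the proctime values follow the order in which records are handled, even though the time2 values inside the records go backwards.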

Precautions

None

Example

CREATE SOURCE STREAM student_scores (
  student_number STRING, /* Student ID */
  student_name STRING, /* Name */
  subject STRING, /* Subject */
  score INT /* Score */
)
WITH (
  type = "dis",
  region = "",
  channel = "dliinput",
  partition_count = "1",
  encode = "csv",
  field_delimiter=","
)
TIMESTAMP BY proctime.proctime;

INSERT INTO score_greate_90
SELECT student_name, sum(score) over (order by proctime RANGE UNBOUNDED PRECEDING)
FROM student_scores;

Configuring Event Time

Event time is the time at which an event is generated, that is, the timestamp recorded when the data is produced.

Syntax

CREATE SOURCE STREAM stream_name(...) WITH (...)
TIMESTAMP BY {attr_name}.rowtime
SET WATERMARK (RANGE {time_interval} | ROWS {literal}, {time_interval});

Description

To use event time, select an attribute in the stream as the timestamp and set a watermark policy.

Events may arrive out of order or late, for example because of network faults. A watermark makes a window wait for a certain period of time before it is triggered for calculation. Watermarks are mainly used to handle data that became out of order between the time the events were generated and the time they reached DLI for stream processing.
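The effect of a watermark on out-of-order data can be sketched in Python as follows. This is a minimal model, assuming the bounded-delay rule commonly used in Flink: the watermark equals the largest event time seen so far minus the maximum delay, and an event whose timestamp is already behind the watermark counts as late. The function name classify_events and the sample timestamps are hypothetical. The source does not state that DLI computes watermarks exactly this way.

```python
def classify_events(event_times, max_delay):
    """Label each event 'on_time' or 'late' under an assumed watermark rule.

    Assumption: watermark = (largest event time seen so far) - max_delay.
    An event is late when its timestamp is already behind the watermark.
    """
    max_seen = None
    labels = []
    for ts in event_times:
        # The watermark is derived only from events that arrived earlier.
        watermark = None if max_seen is None else max_seen - max_delay
        if watermark is not None and ts < watermark:
            labels.append((ts, "late"))
        else:
            labels.append((ts, "on_time"))
        max_seen = ts if max_seen is None else max(max_seen, ts)
    return labels


# Made-up event times in seconds, listed in arrival order; maximum delay is 20 seconds.
labels = classify_events([100, 105, 130, 112, 95, 140], max_delay=20)
```

In this run the event with timestamp 112 arrives out of order but is still within the 20-second delay, while the event with timestamp 95 falls behind the watermark (130 - 20 = 110) and is labeled late.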

The following two watermark policies are available:

  • By time interval

    SET WATERMARK(range interval {time_unit}, interval {time_unit})
    
  • By event quantity

    SET WATERMARK(rows literal, interval {time_unit})
    

Note

Separate the two parameters with a comma (,). The first parameter sets how often a watermark is sent (a time interval or a number of rows), and the second sets the maximum event delay.
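The difference between the two policies can be sketched in Python. This is only a model under stated assumptions: the watermark value is taken to be the largest event time seen minus the maximum delay, and the time-based interval is measured against arrival time. The function names watermarks_by_time and watermarks_by_rows and the sample data are hypothetical and are not DLI APIs.

```python
def watermarks_by_time(events, send_interval, max_delay):
    """Emit a watermark each time `send_interval` seconds of arrival time pass.

    `events` is a list of (arrival_time, event_time) pairs in arrival order.
    Assumption: watermark value = largest event time seen - max_delay.
    """
    emitted = []
    max_seen = None
    next_emit = None
    for arrival, event_time in events:
        max_seen = event_time if max_seen is None else max(max_seen, event_time)
        if next_emit is None:
            # The first event starts the sending timer.
            next_emit = arrival + send_interval
        if arrival >= next_emit:
            emitted.append(max_seen - max_delay)
            next_emit = arrival + send_interval
    return emitted


def watermarks_by_rows(events, rows, max_delay):
    """Emit a watermark each time `rows` events have been received."""
    emitted = []
    max_seen = None
    for count, (_, event_time) in enumerate(events, start=1):
        max_seen = event_time if max_seen is None else max(max_seen, event_time)
        if count % rows == 0:
            emitted.append(max_seen - max_delay)
    return emitted


# Made-up (arrival_time, event_time) pairs in seconds.
events = [(0, 100), (4, 103), (10, 108), (13, 105), (21, 118), (25, 121)]

by_time = watermarks_by_time(events, send_interval=10, max_delay=20)
by_rows = watermarks_by_rows(events, rows=2, max_delay=20)
```

Here the time-based call loosely mirrors RANGE interval 10 second, interval 20 second, and the row-based call mirrors ROWS 2, interval 20 second. In both cases the second parameter only shifts the watermark value back by the maximum delay, while the first parameter decides when a watermark is sent.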

Precautions

None

Example

  • Use time2 as the event time and send a watermark every 10s. The maximum event latency is 20s.

    CREATE SOURCE STREAM student_scores (
      student_number STRING, /* Student ID */
      student_name STRING, /* Name */
      subject STRING, /* Subject */
      score INT, /* Score */
      time2 TIMESTAMP
    )
    WITH (
      type = "dis",
      region = "",
      channel = "dliinput",
      partition_count = "1",
      encode = "csv",
      field_delimiter=","
    )
    TIMESTAMP BY time2.rowtime
    SET WATERMARK (RANGE interval 10 second, interval 20 second);
    
    INSERT INTO score_greate_90
    SELECT student_name, sum(score) over (order by time2 RANGE UNBOUNDED PRECEDING)
    FROM student_scores;
    
  • Use time2 as the event time and send a watermark each time 10 records are received. The maximum event latency is 20s.

    CREATE SOURCE STREAM student_scores (
      student_number STRING, /* Student ID */
      student_name STRING, /* Name */
      subject STRING, /* Subject */
      score INT, /* Score */
      time2 TIMESTAMP
    )
    WITH (
      type = "dis",
      region = "",
      channel = "dliinput",
      partition_count = "1",
      encode = "csv",
      field_delimiter=","
    )
    TIMESTAMP BY time2.rowtime
    SET WATERMARK (ROWS 10, interval 20 second);
    
    INSERT INTO score_greate_90
    SELECT student_name, sum(score) over (order by time2 RANGE UNBOUNDED PRECEDING)
    FROM student_scores;