SDRS Basic Concepts

Table 1 SDRS basic concepts

Concept

Description

Production site

A production site is the data center that independently carries services under normal conditions. For SDRS, it is the AZ where your servers are located. You specify the production site when you create a protection group.

DR site

A DR site is the data center that does not carry services while the production site works properly. It is used to back up data in real time. When the production site goes down, whether planned or unexpected, the DR site can take over the services after a failover. It can reside in the same city as the service management center or in another city.

In this version, the production and DR sites must be in different AZs in the same region.

Protection group

Manages a group of servers to be replicated. A protection group contains servers from one VPC only. If you have multiple VPCs, you need to create multiple protection groups.
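As a rough illustration of the one-VPC-per-group rule, the following Python sketch groups a server inventory by VPC to show how many protection groups would be needed. The server names, VPC IDs, and the plan_protection_groups helper are hypothetical and exist only for this example; nothing here calls the SDRS API.

```python
from collections import defaultdict


def plan_protection_groups(servers):
    """Group server names by VPC ID; each VPC needs its own protection group."""
    groups = defaultdict(list)
    for name, vpc_id in servers:
        groups[vpc_id].append(name)
    return dict(groups)


# Hypothetical inventory: (server name, VPC ID)
servers = [("web-1", "vpc-a"), ("web-2", "vpc-a"), ("db-1", "vpc-b")]
plan = plan_protection_groups(servers)

for vpc_id, names in plan.items():
    print(f"Protection group for {vpc_id}: {names}")
```

In this inventory, two VPCs are in use, so two protection groups would be required.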

Protected instance

Indicates a server and its replication server. A protected instance belongs to one protection group. Therefore, the production site and DR site of the protected instance are the same as those of its protection group.

Replication pair

Indicates a disk and its replication disk. A replication pair belongs to one protection group and can be attached to a protected instance in this protection group.
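The relationships among a protection group, its protected instances, and its replication pairs can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the class names, field names, and sample values are hypothetical and do not reflect the SDRS API or its resource schema. The two checks encode rules from the descriptions above: the production and DR sites must be in different AZs, and a replication pair can be attached only to a protected instance in the same protection group.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProtectionGroup:
    name: str
    vpc_id: str
    production_az: str
    dr_az: str

    def __post_init__(self) -> None:
        # The production and DR sites must be in different AZs.
        if self.production_az == self.dr_az:
            raise ValueError("production and DR sites must be in different AZs")


@dataclass
class ProtectedInstance:
    server: str              # server at the production site
    replication_server: str  # its counterpart at the DR site
    group: ProtectionGroup


@dataclass
class ReplicationPair:
    disk: str              # disk at the production site
    replication_disk: str  # its counterpart at the DR site
    group: ProtectionGroup
    attached_to: Optional[ProtectedInstance] = None

    def attach(self, instance: ProtectedInstance) -> None:
        # A pair can only be attached within its own protection group.
        if instance.group is not self.group:
            raise ValueError("instance and pair are in different protection groups")
        self.attached_to = instance


# Hypothetical sample values
group = ProtectionGroup("pg-demo", "vpc-a", "az-1", "az-2")
instance = ProtectedInstance("ecs-prod", "ecs-dr", group)
pair = ReplicationPair("disk-prod", "disk-dr", group)
pair.attach(instance)
```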

Planned failover

You can temporarily stop servers at the production site and then perform a planned failover to fail over services from the production site to the DR site. After the planned failover, data synchronization continues, but the DR direction is changed (from the DR site to the production site). Servers and disks at the DR site are ready to start.

Failover

The system forcibly stops the servers and disks at the production site and sets the servers and disks at the DR site to ready-to-start state. This action affects all the protected instances in the protection group. After the failover, you need to start the servers at the DR site. The protection group status changes to Failover complete, and data synchronization of the protection group stops. You need to enable protection again to restore data synchronization.

Enabling protection

This operation can be performed after a protection group is created or data synchronization stops. Once the protection is enabled, the data synchronization starts, and the synchronization progress is displayed on the web page. This operation affects all protected instances and replication pairs in the protection group.

When you click Enable Protection, the status of the protection group changes to Synchronizing, and Disable Protection is not available.

Enabling protection again

This operation can be performed after a failover. Once the protection is enabled again, the data synchronization starts, and the synchronization progress is displayed on the web page. This operation affects all protected instances and replication pairs in the protection group.

When you click Reprotect after a failover, the status of the protection group changes to Re-protecting, and Disable Protection becomes unavailable.

Disabling protection

Can be performed after the data synchronization is complete. Once the protection is disabled, the data synchronization stops, and the protection status of the protection group changes to Available.
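The status changes described for failover, enabling protection, enabling protection again, and disabling protection can be summarized in a short sketch. The status strings are taken from the descriptions above; the operation names and the apply_operation function are hypothetical, and the model is deliberately simplified. It covers only the statuses named in this table, not the full list in the API reference.

```python
# Resulting protection group status for each operation, as described above.
RESULT_STATUS = {
    "enable_protection": "Synchronizing",
    "failover": "Failover complete",
    "reprotect": "Re-protecting",
    "disable_protection": "Available",
}


def apply_operation(status, operation):
    """Return the protection group status after an operation (simplified)."""
    if operation not in RESULT_STATUS:
        raise ValueError(f"unknown operation: {operation}")
    # Reprotect can be performed only after a failover.
    if operation == "reprotect" and status != "Failover complete":
        raise ValueError("reprotect requires the Failover complete status")
    # Disable Protection is unavailable while synchronization is in progress.
    if operation == "disable_protection" and status in ("Synchronizing", "Re-protecting"):
        raise ValueError("cannot disable protection while synchronizing")
    return RESULT_STATUS[operation]


status = "Available"  # assumed starting status for this sketch
for op in ("enable_protection", "failover", "reprotect"):
    status = apply_operation(status, op)
    print(f"{op} -> {status}")
```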

Attaching a replication pair to a protected instance

Attaches the two disks in a replication pair to the two servers in a protected instance.

Detaching a replication pair from a protected instance

Detaches the two disks in a replication pair from the two servers in a protected instance.

DR direction

Indicates the data replication direction. The data replication is from the production site to the DR site when users create a protection group.

After you perform a planned failover, services at the production site are failed over to the DR site, and the data replication direction is reversed (from the DR site to the production site).
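The direction change that follows a planned failover (see the Planned failover entry) can be pictured as swapping the source and target of replication. The tuple representation and the reverse_direction helper below are hypothetical and serve only as an illustration.

```python
def reverse_direction(direction):
    """Swap the replication source and target."""
    source, target = direction
    return (target, source)


# Direction when the protection group is created: production site -> DR site
initial = ("production site", "DR site")

# Direction after a planned failover: DR site -> production site
after_planned_failover = reverse_direction(initial)
print(" -> ".join(after_planned_failover))
```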

Protection group status

Indicates the status of a protection group when users perform an operation on the protection group, such as creating or deleting a protection group, enabling or disabling protection, or performing a planned failover or failover.

For details, see the protection group status section in the Appendixes of Storage Disaster Recovery Service API Reference.

Synchronization status

Indicates the status of the data replication between the production and DR sites.

VPC

Indicates the VPC of the protection group. A VPC facilitates internal network management and configuration, allowing secure and quick modifications to networks. The servers in the same VPC can communicate with each other, but those in different VPCs cannot communicate with each other by default.

VBD

Virtual Block Device (VBD) is the default disk type. Disks of the VBD type support only simple Small Computer System Interface (SCSI) read and write commands. This disk type applies to enterprise office applications as well as development and testing scenarios.

SCSI

SCSI is another disk type. Disks of this type support transparent SCSI command transmission, which allows the server OS to directly access the underlying storage media. In addition to simple SCSI read and write commands, SCSI disks support advanced SCSI commands, for example, the persistent lock reservation command. This makes them suitable for cluster applications that use the lock mechanism to ensure data security.

RPO

Indicates recovery point objective. It is a service switchover policy that minimizes data loss during a DR switchover. The data recovery point is used as the objective, so that the data used for the DR switchover is the latest backup data.

RTO

Indicates recovery time objective. It is the target time for recovering interrupted key services to an acceptable level, and is set to minimize the impact of an interruption on services. For SDRS, RTO refers to the period from the time when users perform a planned failover or failover at the production site to the time when the servers at the DR site start to run. This period is within 30 minutes. It does not include the time spent on DNS configuration, security group configuration, or customer script execution.
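As a worked example of the RTO definition for SDRS, the following sketch computes the period between the start of a failover and the moment the DR site servers are running, and compares it with the 30-minute figure stated above. The timestamps and the measured_rto helper are hypothetical. In line with the definition, time spent on DNS configuration, security group configuration, or customer scripts is not part of the measured period.

```python
from datetime import datetime, timedelta

RTO_LIMIT = timedelta(minutes=30)  # figure stated in the RTO description


def measured_rto(failover_started, dr_servers_running):
    """Period from the start of the failover to the DR servers running."""
    if dr_servers_running < failover_started:
        raise ValueError("DR servers cannot start before the failover begins")
    return dr_servers_running - failover_started


# Hypothetical timestamps from a failover
failover_started = datetime(2024, 1, 1, 10, 0, 0)
dr_servers_running = datetime(2024, 1, 1, 10, 12, 30)

rto = measured_rto(failover_started, dr_servers_running)
within_limit = rto <= RTO_LIMIT
print(f"Measured RTO: {rto}, within 30 minutes: {within_limit}")
```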

DR drill

Verifies that a DR site server can take over services from a production site server once a failover is performed.

In DR drills, you can simulate fault recovery scenarios and formulate emergency recovery plans. When a real fault occurs, the plans can be used to quickly recover services, improving service continuity.