Regional Disaster Recovery (RDR) Architecture and Deployment in ocs-ci

Table of Contents

Overview
RDR Architecture
Key Components
Multicluster Access Patterns
OCS-CI Deployment Flow
Important Design Pieces
Configuration and Constants

Overview

Regional Disaster Recovery (RDR) is a disaster recovery solution for OpenShift Data Foundation (ODF) that enables asynchronous replication of persistent volumes across geographically distributed OpenShift clusters. RDR provides application failover and relocate capabilities between a primary and secondary cluster.

Key Characteristics:

Mode: regional-dr (async replication)
Replication Policy: Asynchronous (async)
Cluster Roles: ActiveACM (Hub), PrimaryODF, SecondaryODF
Storage Types: RBD (Ceph Block) and CephFS (Ceph Filesystem)
Deployment Modes: Greenfield and Brownfield

RDR Architecture

High-Level Architecture

        graph TB
    subgraph ACM["ACM Hub Cluster"]
        ACM_COMP["ACM Components"]
        ACM_LIST["• Advanced Cluster Management<br/>• Multicluster Engine<br/>• ODF Multicluster Orchestrator<br/>• Ramen DR Hub Operator<br/>• DRPolicy Management<br/>• DRPC Orchestration"]
        ACM_COMP -.-> ACM_LIST
    end

    subgraph PRIMARY["Primary ODF Cluster"]
        P_STORAGE["ODF Storage"]
        P_STORAGE_LIST["• RBD Mirror<br/>• Ceph Cluster<br/>• VolumeReplication"]
        P_DR["DR Components"]
        P_DR_LIST["• Ramen DR Cluster<br/>• VRG Primary<br/>• VolSync"]
        P_WORKLOAD["Active Workloads"]
        P_WORKLOAD_LIST["• Subscriptions<br/>• ApplicationSets"]

        P_STORAGE -.-> P_STORAGE_LIST
        P_DR -.-> P_DR_LIST
        P_WORKLOAD -.-> P_WORKLOAD_LIST
    end

    subgraph SECONDARY["Secondary ODF Cluster"]
        S_STORAGE["ODF Storage"]
        S_STORAGE_LIST["• RBD Mirror<br/>• Ceph Cluster<br/>• VolumeReplication"]
        S_DR["DR Components"]
        S_DR_LIST["• Ramen DR Cluster<br/>• VRG Secondary<br/>• VolSync"]
        S_WORKLOAD["Standby Workloads"]

        S_STORAGE -.-> S_STORAGE_LIST
        S_DR -.-> S_DR_LIST
    end

    ACM -->|Manages| PRIMARY
    ACM -->|Manages| SECONDARY
    PRIMARY <-->|Async Replication| SECONDARY

    style ACM fill:#e1f5ff
    style PRIMARY fill:#c8e6c9
    style SECONDARY fill:#fff9c4

Network Connectivity

RDR requires network connectivity between clusters:

Submariner : Provides secure Layer 3 connectivity
Globalnet: Enables overlapping CIDR ranges
S3 Storage: For metadata and backup storage
Latency Requirement: < 10ms RTT for hub-spoke communication

Key Components

1. ACM Hub Cluster Components

Advanced Cluster Management (ACM)

Purpose: Central management and orchestration
Version: 2.12+
Key Functions:
- Cluster lifecycle management
- Application deployment via GitOps
- Policy enforcement
- Observability

Multicluster Engine (MCE)

Purpose: Cluster provisioning and management
Deployment: Installed on ACM hub
Functions: Cluster import, managed cluster lifecycle

ODF Multicluster Orchestrator

Deployment: odf-multicluster-orchestrator-controller-manager
Namespace: openshift-operators
Purpose: Coordinates storage operations across clusters
Key Resources:
- MirrorPeer: Defines replication relationships
- StorageClusterPeer: Manages peer connections

Ramen DR Hub Operator

Purpose: DR orchestration and policy management
Key CRDs:
- DRPolicy: Defines DR policies and scheduling intervals
- DRPlacementControl (DRPC): Controls application placement
- DRCluster: Represents managed clusters in DR topology

2. Managed Cluster (Primary/Secondary) Components

ODF Storage Cluster

Components:
- Ceph cluster (Mon, OSD, MGR)
- RBD provisioner
- CephFS provisioner
- Storage classes

RBD Mirroring

Purpose: Asynchronous block storage replication
Components:
- rbd-mirror pods
- Volume Replication CRDs
- Replication secrets
Deployment Modes:
- Greenfield: New deployments with bluestore-rdr annotation
- Brownfield: Existing deployments

Ramen DR Cluster Operator

Label: app=ramen-dr-cluster
Purpose: Local DR operations on managed clusters
Key CRDs:
- VolumeReplicationGroup (VRG): Groups PVCs for replication
- VolumeReplication: Per-PVC replication control

VolSync (ODF 4.19+)

Purpose: CephFS replication using Restic/Rclone
Storage Class: ocs-storagecluster-cephfs-vrg
Components:
- ReplicationSource
- ReplicationDestination

Token Exchange Agent

Label: app=token-exchange-agent
Purpose: Secure credential exchange between clusters
Namespace: openshift-storage

3. Workload Types

Subscription-based Applications (Soon to be Deprecated)

Namespace: Application-specific
DRPC Location: Application namespace
GitOps: ACM ApplicationSet or Subscription

ApplicationSet-based Applications

Namespace: openshift-gitops
DRPC Location: openshift-gitops
GitOps: ArgoCD ApplicationSet

Discovered Applications

Purpose: Protect existing applications without GitOps
DRPC Location: openshift-dr-ops
Features:
- KubeObject protection
- Recipe-based backup
- Multi-namespace support

Multicluster Access Patterns

Context Switching in ocs-ci

The framework uses context switching to manage multiple clusters:

# Switch to ACM hub cluster
config.switch_acm_ctx()

# Switch to primary cluster
primary_config = get_primary_cluster_config()
config.switch_ctx(primary_config.MULTICLUSTER["multicluster_index"])

# Switch by cluster name
config.switch_to_cluster_by_name("cluster-name")

Cluster Roles and Indexes

# RDR Roles
RDR_ROLES = ["ActiveACM", "PrimaryODF", "SecondaryODF"]

# Optional: PassiveACM for dual-hub scenarios
if get_passive_acm_index():
    RDR_ROLES.append("PassiveACM")

# Cluster ranking
ACM_RANK = 1
MANAGED_CLUSTER_RANK = 2

DRPC Access Patterns

# Get current primary cluster
primary_cluster_name = dr_helpers.get_current_primary_cluster_name(
    namespace=workload_namespace,
    workload_type=constants.SUBSCRIPTION
)

# Get current secondary cluster
secondary_cluster_name = dr_helpers.get_current_secondary_cluster_name(
    namespace=workload_namespace,
    workload_type=constants.SUBSCRIPTION
)

# Access DRPC object
drpc_obj = DRPC(namespace=workload_namespace)
drpc_data = drpc_obj.get()

# Check DRPC action
if drpc_data["spec"]["action"] == constants.ACTION_FAILOVER:
    current_cluster = drpc_data["spec"]["failoverCluster"]
else:
    current_cluster = drpc_data["spec"]["preferredCluster"]

Replication Resource Access

# Check VolumeReplicationGroup state
vrg_obj = OCP(
    kind=constants.VOLUME_REPLICATION_GROUP,
    namespace=workload_namespace
)

# Check mirroring status on primary
config.switch_to_cluster_by_name(primary_cluster_name)
dr_helpers.wait_for_mirroring_status_ok(
    replaying_images=pvc_count
)

# Verify replication destinations on secondary
config.switch_to_cluster_by_name(secondary_cluster_name)
dr_helpers.wait_for_replication_destinations_creation(
    pvc_count, workload_namespace
)

OCS-CI Deployment Flow

Phase 1: Infrastructure Setup

1. ACM Hub Cluster Deployment
   ├── Deploy OpenShift cluster
   ├── Install ACM operator
   ├── Install MCE operator
   └── Configure observability

2. Managed Clusters Deployment (Primary & Secondary)
   ├── Deploy OpenShift clusters via ACM
   │   ├── Create/import cluster prerequisites
   │   ├── Create/import cluster via ACM UI/CLI
   │   └── Wait for cluster ready
   ├── Install ODF operator
   ├── Create StorageCluster
   └── Verify ODF deployment

Phase 2: DR Infrastructure Setup

3. DR Operators Deployment
   ├── On ACM Hub:
   │   ├── Deploy ODF Multicluster Orchestrator
   │   │   └── Verify deployment available
   │   ├── Enable MCO console plugin
   │   └── Create ServiceExporter (4.19+)
   │
   └── On Managed Clusters:
       ├── Enable RBD mirroring on StorageCluster
       ├── Deploy Ramen DR Cluster Operator
       └── Configure S3 secrets for DR

4. Network Configuration (if Submariner enabled)
   ├── Download subctl CLI
   ├── Deploy broker on primary cluster
   ├── Join clusters to broker
   └── Verify connectivity

Phase 3: DR Configuration

5. MirrorPeer Creation
   ├── Load MirrorPeer template (MIRROR_PEER_RDR)
   ├── Update cluster names in spec
   ├── Apply MirrorPeer on ACM hub
   └── Validate MirrorPeer status
       ├── Check phase: "ExchangedSecret"
       ├── Verify token-exchange-agent pods
       └── Verify rbd-mirror pods

6. DRPolicy Creation
   ├── Load DRPolicy template
   ├── Configure:
   │   ├── drClusters: [primary, secondary]
   │   ├── schedulingInterval: "5m" (default)
   │   └── replicationClassSelector (for RBD)
   ├── Apply DRPolicy on ACM hub
   └── Validate DRPolicy status: "Validated"

7. StorageClusterPeer Validation (4.19+)
   ├── Verify peer state on both clusters
   └── Verify VolSync deployment

Phase 4: Workload Deployment

8. Application Deployment with DR Protection
   ├── Deploy application (Subscription/ApplicationSet)
   ├── Create DRPC resource
   │   ├── Specify drPolicyRef
   │   ├── Set preferredCluster (primary)
   │   └── Set placementRef
   ├── Wait for VRG creation
   ├── Wait for VolumeReplication resources
   └── Verify initial replication

9. Verify DR Readiness
   ├── Check DRPC conditions:
   │   ├── PeerReady: True
   │   └── ClusterDataProtected: True
   ├── Verify mirroring status
   └── Verify replication destinations

Deployment Class Hierarchy

Deployment (base class)
    ├── do_deploy_rdr()
    │   └── Calls get_multicluster_dr_deployment()
    │
    └── get_rdr_conf()
        └── Returns DR configuration dict

MultiClusterDROperatorsDeploy (base DR class)
    ├── deploy_dr_multicluster_orchestrator()
    ├── configure_mirror_peer()
    ├── deploy_dr_policy()
    └── enable_acm_observability()

RDRMultiClusterDROperatorsDeploy (RDR-specific)
    └── deploy()
        ├── Deploy orchestrator on all ACM hubs
        ├── Enable MCO console plugin
        ├── Create ServiceExporter (4.19+)
        ├── Configure MirrorPeer
        ├── Deploy RBD DR operations
        ├── Enable ACM observability
        ├── Deploy DRPolicy
        ├── Validate StorageClusterPeer (4.19+)
        └── Configure backup (if needed)

Key Deployment Methods

`Deployment.do_deploy_rdr()`

Location: ocs_ci/deployment/deployment.py:739

def do_deploy_rdr(self):
    """Call Regional DR deploy"""
    if config.ENV_DATA.get("skip_dr_deployment", False):
        return
    if config.multicluster:
        dr_conf = self.get_rdr_conf()
        deploy_dr = get_multicluster_dr_deployment()(dr_conf)
        deploy_dr.deploy()

`RDRMultiClusterDROperatorsDeploy.deploy()`

Location: ocs_ci/deployment/deployment.py:4115

Main deployment orchestration for RDR setup.

`MultiClusterDROperatorsDeploy.configure_mirror_peer()`

Location: ocs_ci/deployment/deployment.py:3401

Creates and validates MirrorPeer resource.

`MultiClusterDROperatorsDeploy.deploy_dr_policy()`

Location: ocs_ci/deployment/deployment.py:3583

Creates DRPolicy with cluster relationships.

Important Design Pieces

1. Asynchronous Replication

Scheduling Interval: Defines RPO (Recovery Point Objective)

Default: 5 minutes
IBM Cloud Managed: 10 minutes
Configurable via DRPolicy

Replication Flow:

        sequenceDiagram
    participant App as Application
    participant PVC as Primary PVC
    participant RBD as RBD Mirror Daemon
    participant Snap as Snapshot
    participant SecPVC as Secondary PVC

    App->>PVC: Write data
    PVC->>RBD: Capture changes
    Note over RBD: Continuous monitoring

    loop Every Scheduling Interval
        RBD->>Snap: Create snapshot
        Snap->>SecPVC: Replicate snapshot
        SecPVC->>SecPVC: Apply changes
    end

    Note over PVC,SecPVC: Async Replication (5-10 min RPO)

2. Failover vs Relocate

Failover (Disaster Scenario)

Trigger: Primary cluster unavailable
Action: spec.action: Failover
Target: spec.failoverCluster
Process:
1. Detect primary cluster failure
2. Update DRPC with failover action
3. Promote secondary VRG to primary
4. Start application on secondary
5. Delete resources from primary (when available)

Relocate (Planned Migration)

Trigger: Planned move to another cluster
Action: spec.action: Relocate
Target: spec.preferredCluster
Process:
1. Ensure both clusters healthy
2. Update DRPC with relocate action
3. Quiesce application on current primary
4. Ensure final sync complete
5. Promote new primary VRG
6. Start application on new primary
7. Demote old primary VRG to secondary

3. VolumeReplicationGroup (VRG)

Purpose: Groups PVCs for coordinated replication

States:

Primary: Active cluster with read/write access
Secondary: Standby cluster receiving replicated data

Key Responsibilities:

Manage VolumeReplication resources
Coordinate snapshots
Handle promotion/demotion
Manage PVC protection

4. Consistency Groups (4.21+)

Purpose: Ensure crash-consistent snapshots across multiple PVCs

Configuration:

# Enabled by default in RDR mode for 4.21+
cg_enabled = config.ENV_DATA.get("cg_enabled", True)

Benefits:

Application-consistent backups
Coordinated snapshot timing
Reduced RPO for multi-PVC applications

5. OSD Deployment Modes

Greenfield (4.14-4.17)

metadata:
  annotations:
    ocs.openshift.io/clusterIsDisasterRecoveryTarget: "true"

OSDs deployed with bluestore-rdr store type
Optimized for DR workloads
Automatic configuration

Brownfield

Existing OSD deployments
Standard bluestore
Manual DR configuration

6. Hub Recovery and Backup

Backup Components:

ACM resources
DR policies
Cluster configurations

Backup Schedule:

Resource: schedule-acm
Namespace: ACM namespace
Policy: backup-restore-enabled

Recovery Process:

configure_rdr_hub_recovery()
    ├── Create backup schedule
    ├── Validate DPA (Data Protection Application)
    └── Verify policy compliance

Configuration and Constants

Key Constants

Mode and Policy

RDR_MODE = "regional-dr"
RDR_REPLICATION_POLICY = "async"
RDR_DR_POLICY_IBM_CLOUD_MANAGED = "odr-policy-10m"

OSD Deployment

RDR_OSD_MODE_GREENFIELD = "greenfield"
RDR_OSD_MODE_BROWNFIELD = "brownfield"

Storage Classes

RDR_VOLSYNC_CEPHFILESYSTEM_SC = "ocs-storagecluster-cephfs-vrg"
RDR_CUSTOM_RBD_POOL = "rdr-test-storage-pool"
RDR_CUSTOM_RBD_STORAGECLASS = "rbd-cnv-custom-sc"

Namespaces

DR_DEFAULT_NAMESPACE = "openshift-dr-system"
DR_OPS_NAMESPACE = "openshift-dr-ops"  # For discovered apps

Labels

TOKEN_EXCHANGE_AGENT_LABEL = "app=token-exchange-agent"
RBD_MIRROR_APP_LABEL = "app=rook-ceph-rbd-mirror"
RAMEN_DR_CLUSTER_OPERATOR_APP_LABEL = "app=ramen-dr-cluster"
RDR_VM_PROTECTION_LABEL = "ramendr.openshift.io/k8s-resource-selector"

Templates

MIRROR_PEER_RDR = "ocs_ci/templates/multicluster/mirror_peer_rdr.yaml"
DR_POLICY_YAML = "ocs_ci/templates/multicluster/dr_policy_hub.yaml"

Cluster Roles

RDR_ROLES = ["ActiveACM", "PrimaryODF", "SecondaryODF"]

# Optional for dual-hub scenarios
# RDR_ROLES.append("PassiveACM")

Upgrade Order

RDR has a specific upgrade sequence:

UPGRADE_TEST_ORDER = {
    ORDER_OCP_UPGRADE: 1,      # OCP upgrade
    ORDER_OCS_UPGRADE: 2,      # ODF upgrade
    ORDER_MCO_UPGRADE: 3,      # Multicluster Orchestrator
    ORDER_DR_HUB_UPGRADE: 4,   # DR Hub operator
    ORDER_ACM_UPGRADE: 5,      # ACM upgrade
}

Upgrade Sequence:

ACM Hub OCP upgrade
Primary managed cluster OCP upgrade
Secondary managed cluster OCP upgrade
Primary ODF upgrade
Secondary ODF upgrade
ACM MCO operator upgrade
ACM DR Hub operator upgrade
Primary/Secondary DR cluster operator upgrade (automatic)
ACM upgrade (if selected)

Configuration Parameters

# DR configuration dictionary
dr_conf = {
    "rbd_dr_scenario": True/False,      # Enable RBD DR
    "cephfs_dr_scenario": True/False,   # Enable CephFS DR
}

# Environment variables
ENV_DATA = {
    "skip_dr_deployment": False,
    "rdr_osd_deployment_mode": "greenfield",
    "cg_enabled": True,
    "submariner_source": "upstream",
    "configure_acm_to_import_mce": False,
}

# Multicluster configuration
MULTICLUSTER = {
    "multicluster_mode": "regional-dr",
    "dr_cluster_relations": [
        ["primary-cluster", "secondary-cluster"]
    ],
}

Testing and Validation

Running RDR Deployment and Tests

Deployment Command

To deploy RDR infrastructure across three clusters (ACM Hub, Primary ODF, Secondary ODF), use the following run-ci command:

run-ci \
  multicluster 3 tests/ \
  -m deployment \
  --deploy \
  --ocsci-conf conf/ocsci/multicluster_mode_rdr.yaml \
  --color=yes \
  --squad-analysis \
  --cluster1 \
    --cluster-name acm-hub-cluster \
    --cluster-path /home/user/clusters/acm-hub-cluster/openshift-cluster-dir \
    --ocp-version 4.17 \
    --ocs-version 4.17 \
    --osd-size 512 \
    --ocsci-conf conf/deployment/aws/ipi_3az_rhcos_compactmode_3m_0w.yaml \
    --ocsci-conf conf/ocsci/multicluster_active_acm_cluster.yaml \
    --ocsci-conf conf/ocsci/submariner_downstream.yaml \
    --ocsci-conf conf/ocsci/multicluster_dr_rbd.yaml \
  --cluster2 \
    --cluster-name primary-odf-cluster \
    --cluster-path /home/user/clusters/primary-odf-cluster/openshift-cluster-dir \
    --ocp-version 4.17 \
    --ocs-version 4.17 \
    --osd-size 512 \
    --ocsci-conf conf/deployment/aws/ipi_3az_rhcos_3m_3w.yaml \
    --ocsci-conf conf/ocsci/multicluster_primary_cluster.yaml \
    --ocsci-conf conf/ocsci/multicluster_dr_rbd.yaml \
    --ocsci-conf conf/ocsci/submariner_downstream.yaml \
  --cluster3 \
    --cluster-name secondary-odf-cluster \
    --cluster-path /home/user/clusters/secondary-odf-cluster/openshift-cluster-dir \
    --ocp-version 4.17 \
    --ocs-version 4.17 \
    --osd-size 512 \
    --ocsci-conf conf/deployment/aws/ipi_3az_rhcos_3m_3w.yaml \
    --ocsci-conf conf/ocsci/multicluster_dr_rbd.yaml \
    --ocsci-conf conf/ocsci/submariner_downstream.yaml

Command Breakdown:

multicluster 3: Deploy 3 clusters in multicluster mode
-m deployment --deploy: Run deployment marker and execute deployment
--ocsci-conf conf/ocsci/multicluster_mode_rdr.yaml: Enable RDR mode
--cluster1: ACM Hub cluster configuration (compact mode, 3 masters, 0 workers)
--cluster2: Primary ODF cluster configuration (3 masters, 3 workers)
--cluster3: Secondary ODF cluster configuration (3 masters, 3 workers)
--ocsci-conf conf/ocsci/multicluster_dr_rbd.yaml: Enable RBD DR scenario
--ocsci-conf conf/ocsci/submariner_downstream.yaml: Enable Submariner networking

Running RDR Tests

After deployment, run RDR tests with tier1 and rdr markers:

run-ci \
  multicluster 3 \
  -m "tier1 and rdr" \
  --ocsci-conf conf/ocsci/multicluster_mode_rdr.yaml \
  --color=yes \
  --cluster1 \
    --cluster-name acm-hub-cluster \
    --cluster-path /home/user/clusters/acm-hub-cluster/openshift-cluster-dir \
    --ocsci-conf conf/ocsci/multicluster_active_acm_cluster.yaml \
  --cluster2 \
    --cluster-name primary-odf-cluster \
    --cluster-path /home/user/clusters/primary-odf-cluster/openshift-cluster-dir \
    --ocsci-conf conf/ocsci/multicluster_primary_cluster.yaml \
  --cluster3 \
    --cluster-name secondary-odf-cluster \
    --cluster-path /home/user/clusters/secondary-odf-cluster/openshift-cluster-dir \

Test Command Options:

-m "tier1 and rdr": Run tests marked with both tier1 and rdr markers
Test path: tests/functional/disaster-recovery/regional-dr/ for all RDR tests
Specific test: Add test file and method name for targeted testing

Test Categories

Failover Tests - test_failover.py
- Primary cluster down scenarios
- Primary cluster up scenarios
- RBD and CephFS interfaces
Relocate Tests - test_relocate.py
- Planned migration
- Application continuity
Failover and Relocate - test_failover_and_relocate.py
- Combined scenarios
- CLI and UI testing
Discovered Apps - test_failover_and_relocate_discovered_apps.py
- Non-GitOps applications
- KubeObject protection
- Recipe-based backup
Hub Recovery - test_neutral_hub_failure_and_recovery.py
- Hub cluster failure
- Backup and restore
Node Operations - test_node_operations_during_failover_relocate.py
- Node failures during DR operations
- Resilience testing

Test Markers

@rdr  # Marks test as RDR-specific
@turquoise_squad  # Squad ownership
@tier1  # Test tier
@acceptance  # Acceptance test

Validation Helpers

Key validation functions in ocs_ci/helpers/dr_helpers.py:

get_current_primary_cluster_name(): Identify active cluster
get_current_secondary_cluster_name(): Identify standby cluster
wait_for_mirroring_status_ok(): Verify replication health
wait_for_all_resources_creation(): Verify workload deployment
wait_for_all_resources_deletion(): Verify cleanup
wait_for_replication_destinations_creation(): Verify secondary resources
verify_last_kubeobject_protection_time(): Validate backup timing

Troubleshooting

Common Issues

MirrorPeer not reaching ExchangedSecret
- Check token-exchange-agent pods
- Verify network connectivity
- Check S3 secret configuration
DRPolicy not Validated
- Verify both clusters are healthy
- Check MirrorPeer status
- Verify StorageCluster configuration
Replication not working
- Check rbd-mirror pods
- Verify VolumeReplication resources
- Check mirroring status in Ceph
Failover stuck
- Check DRPC conditions
- Verify VRG state
- Check for resource conflicts

Debug Commands

# Check DRPC status
oc get drpc -n <namespace> -o yaml

# Check VRG status
oc get vrg -n openshift-dr-ops -o yaml

# Check MirrorPeer
oc get mirrorpeer -o yaml

# Check DRPolicy
oc get drpolicy -o yaml

# Check replication status
oc get volumereplication -n <namespace>

# Check Ceph mirroring
ceph rbd mirror pool status <pool-name>

References

Key Files

Deployment: ocs_ci/deployment/deployment.py
Multicluster Deployment: ocs_ci/deployment/multicluster_deployment.py
DR Helpers: ocs_ci/helpers/dr_helpers.py
Constants: ocs_ci/ocs/constants.py
DRPC Resource: ocs_ci/ocs/resources/drpc.py
ACM Integration: ocs_ci/ocs/acm/acm.py
Submariner: ocs_ci/deployment/acm.py

Documentation

Red Hat Advanced Cluster Management for Kubernetes
OpenShift Data Foundation Documentation
Ramen DR Operator Documentation
Submariner Documentation

Summary

RDR in ocs-ci provides a comprehensive framework for testing Regional Disaster Recovery scenarios in OpenShift Data Foundation. The architecture supports:

Asynchronous replication between geographically distributed clusters
Automated failover for disaster scenarios
Planned relocate for maintenance and optimization
Multiple workload types: Subscriptions, ApplicationSets, Discovered Apps
Storage flexibility: RBD and CephFS support
Consistency groups for multi-PVC applications
Hub recovery for ACM cluster failures

The deployment flow is fully automated through ocs-ci, enabling comprehensive testing of DR scenarios across different ODF versions, platforms, and configurations.