# Regional Disaster Recovery (RDR) Architecture and Deployment in ocs-ci ## Table of Contents 1. [Overview](#overview) 2. [RDR Architecture](#rdr-architecture) 3. [Key Components](#key-components) 4. [Multicluster Access Patterns](#multicluster-access-patterns) 5. [OCS-CI Deployment Flow](#ocs-ci-deployment-flow) 6. [Important Design Pieces](#important-design-pieces) 7. [Configuration and Constants](#configuration-and-constants) --- (overview)= ## Overview Regional Disaster Recovery (RDR) is a disaster recovery solution for OpenShift Data Foundation (ODF) that enables asynchronous replication of persistent volumes across geographically distributed OpenShift clusters. RDR provides application failover and relocate capabilities between a primary and secondary cluster. **Key Characteristics:** - **Mode**: `regional-dr` (async replication) - **Replication Policy**: Asynchronous (`async`) - **Cluster Roles**: ActiveACM (Hub), PrimaryODF, SecondaryODF - **Storage Types**: RBD (Ceph Block) and CephFS (Ceph Filesystem) - **Deployment Modes**: Greenfield and Brownfield --- (rdr-architecture)= ## RDR Architecture ### High-Level Architecture ```mermaid graph TB subgraph ACM["ACM Hub Cluster"] ACM_COMP["ACM Components"] ACM_LIST["• Advanced Cluster Management
• Multicluster Engine
• ODF Multicluster Orchestrator
• Ramen DR Hub Operator
• DRPolicy Management
• DRPC Orchestration"] ACM_COMP -.-> ACM_LIST end subgraph PRIMARY["Primary ODF Cluster"] P_STORAGE["ODF Storage"] P_STORAGE_LIST["• RBD Mirror
• Ceph Cluster
• VolumeReplication"] P_DR["DR Components"] P_DR_LIST["• Ramen DR Cluster
• VRG Primary
• VolSync"] P_WORKLOAD["Active Workloads"] P_WORKLOAD_LIST["• Subscriptions
• ApplicationSets"] P_STORAGE -.-> P_STORAGE_LIST P_DR -.-> P_DR_LIST P_WORKLOAD -.-> P_WORKLOAD_LIST end subgraph SECONDARY["Secondary ODF Cluster"] S_STORAGE["ODF Storage"] S_STORAGE_LIST["• RBD Mirror
• Ceph Cluster
• VolumeReplication"] S_DR["DR Components"] S_DR_LIST["• Ramen DR Cluster
• VRG Secondary
• VolSync"] S_WORKLOAD["Standby Workloads"] S_STORAGE -.-> S_STORAGE_LIST S_DR -.-> S_DR_LIST end ACM -->|Manages| PRIMARY ACM -->|Manages| SECONDARY PRIMARY <-->|Async Replication| SECONDARY style ACM fill:#e1f5ff style PRIMARY fill:#c8e6c9 style SECONDARY fill:#fff9c4 ``` ### Network Connectivity RDR requires network connectivity between clusters: - **Submariner** : Provides secure Layer 3 connectivity - **Globalnet**: Enables overlapping CIDR ranges - **S3 Storage**: For metadata and backup storage - **Latency Requirement**: < 10ms RTT for hub-spoke communication --- (key-components)= ## Key Components ### 1. ACM Hub Cluster Components #### Advanced Cluster Management (ACM) - **Purpose**: Central management and orchestration - **Version**: 2.12+ - **Key Functions**: - Cluster lifecycle management - Application deployment via GitOps - Policy enforcement - Observability #### Multicluster Engine (MCE) - **Purpose**: Cluster provisioning and management - **Deployment**: Installed on ACM hub - **Functions**: Cluster import, managed cluster lifecycle #### ODF Multicluster Orchestrator - **Deployment**: `odf-multicluster-orchestrator-controller-manager` - **Namespace**: `openshift-operators` - **Purpose**: Coordinates storage operations across clusters - **Key Resources**: - MirrorPeer: Defines replication relationships - StorageClusterPeer: Manages peer connections #### Ramen DR Hub Operator - **Purpose**: DR orchestration and policy management - **Key CRDs**: - `DRPolicy`: Defines DR policies and scheduling intervals - `DRPlacementControl (DRPC)`: Controls application placement - `DRCluster`: Represents managed clusters in DR topology ### 2. Managed Cluster (Primary/Secondary) Components #### ODF Storage Cluster - **Components**: - Ceph cluster (Mon, OSD, MGR) - RBD provisioner - CephFS provisioner - Storage classes #### RBD Mirroring - **Purpose**: Asynchronous block storage replication - **Components**: - `rbd-mirror` pods - Volume Replication CRDs - Replication secrets - **Deployment Modes**: - **Greenfield**: New deployments with `bluestore-rdr` annotation - **Brownfield**: Existing deployments #### Ramen DR Cluster Operator - **Label**: `app=ramen-dr-cluster` - **Purpose**: Local DR operations on managed clusters - **Key CRDs**: - `VolumeReplicationGroup (VRG)`: Groups PVCs for replication - `VolumeReplication`: Per-PVC replication control #### VolSync (ODF 4.19+) - **Purpose**: CephFS replication using Restic/Rclone - **Storage Class**: `ocs-storagecluster-cephfs-vrg` - **Components**: - ReplicationSource - ReplicationDestination #### Token Exchange Agent - **Label**: `app=token-exchange-agent` - **Purpose**: Secure credential exchange between clusters - **Namespace**: `openshift-storage` ### 3. Workload Types #### Subscription-based Applications (Soon to be Deprecated) - **Namespace**: Application-specific - **DRPC Location**: Application namespace - **GitOps**: ACM ApplicationSet or Subscription #### ApplicationSet-based Applications - **Namespace**: `openshift-gitops` - **DRPC Location**: `openshift-gitops` - **GitOps**: ArgoCD ApplicationSet #### Discovered Applications - **Purpose**: Protect existing applications without GitOps - **DRPC Location**: `openshift-dr-ops` - **Features**: - KubeObject protection - Recipe-based backup - Multi-namespace support --- (multicluster-access-patterns)= ## Multicluster Access Patterns ### Context Switching in ocs-ci The framework uses context switching to manage multiple clusters: ```python # Switch to ACM hub cluster config.switch_acm_ctx() # Switch to primary cluster primary_config = get_primary_cluster_config() config.switch_ctx(primary_config.MULTICLUSTER["multicluster_index"]) # Switch by cluster name config.switch_to_cluster_by_name("cluster-name") ``` ### Cluster Roles and Indexes ```python # RDR Roles RDR_ROLES = ["ActiveACM", "PrimaryODF", "SecondaryODF"] # Optional: PassiveACM for dual-hub scenarios if get_passive_acm_index(): RDR_ROLES.append("PassiveACM") # Cluster ranking ACM_RANK = 1 MANAGED_CLUSTER_RANK = 2 ``` ### DRPC Access Patterns ```python # Get current primary cluster primary_cluster_name = dr_helpers.get_current_primary_cluster_name( namespace=workload_namespace, workload_type=constants.SUBSCRIPTION ) # Get current secondary cluster secondary_cluster_name = dr_helpers.get_current_secondary_cluster_name( namespace=workload_namespace, workload_type=constants.SUBSCRIPTION ) # Access DRPC object drpc_obj = DRPC(namespace=workload_namespace) drpc_data = drpc_obj.get() # Check DRPC action if drpc_data["spec"]["action"] == constants.ACTION_FAILOVER: current_cluster = drpc_data["spec"]["failoverCluster"] else: current_cluster = drpc_data["spec"]["preferredCluster"] ``` ### Replication Resource Access ```python # Check VolumeReplicationGroup state vrg_obj = OCP( kind=constants.VOLUME_REPLICATION_GROUP, namespace=workload_namespace ) # Check mirroring status on primary config.switch_to_cluster_by_name(primary_cluster_name) dr_helpers.wait_for_mirroring_status_ok( replaying_images=pvc_count ) # Verify replication destinations on secondary config.switch_to_cluster_by_name(secondary_cluster_name) dr_helpers.wait_for_replication_destinations_creation( pvc_count, workload_namespace ) ``` --- (ocs-ci-deployment-flow)= ## OCS-CI Deployment Flow ### Phase 1: Infrastructure Setup ``` 1. ACM Hub Cluster Deployment ├── Deploy OpenShift cluster ├── Install ACM operator ├── Install MCE operator └── Configure observability 2. Managed Clusters Deployment (Primary & Secondary) ├── Deploy OpenShift clusters via ACM │ ├── Create/import cluster prerequisites │ ├── Create/import cluster via ACM UI/CLI │ └── Wait for cluster ready ├── Install ODF operator ├── Create StorageCluster └── Verify ODF deployment ``` ### Phase 2: DR Infrastructure Setup ``` 3. DR Operators Deployment ├── On ACM Hub: │ ├── Deploy ODF Multicluster Orchestrator │ │ └── Verify deployment available │ ├── Enable MCO console plugin │ └── Create ServiceExporter (4.19+) │ └── On Managed Clusters: ├── Enable RBD mirroring on StorageCluster ├── Deploy Ramen DR Cluster Operator └── Configure S3 secrets for DR 4. Network Configuration (if Submariner enabled) ├── Download subctl CLI ├── Deploy broker on primary cluster ├── Join clusters to broker └── Verify connectivity ``` ### Phase 3: DR Configuration ``` 5. MirrorPeer Creation ├── Load MirrorPeer template (MIRROR_PEER_RDR) ├── Update cluster names in spec ├── Apply MirrorPeer on ACM hub └── Validate MirrorPeer status ├── Check phase: "ExchangedSecret" ├── Verify token-exchange-agent pods └── Verify rbd-mirror pods 6. DRPolicy Creation ├── Load DRPolicy template ├── Configure: │ ├── drClusters: [primary, secondary] │ ├── schedulingInterval: "5m" (default) │ └── replicationClassSelector (for RBD) ├── Apply DRPolicy on ACM hub └── Validate DRPolicy status: "Validated" 7. StorageClusterPeer Validation (4.19+) ├── Verify peer state on both clusters └── Verify VolSync deployment ``` ### Phase 4: Workload Deployment ``` 8. Application Deployment with DR Protection ├── Deploy application (Subscription/ApplicationSet) ├── Create DRPC resource │ ├── Specify drPolicyRef │ ├── Set preferredCluster (primary) │ └── Set placementRef ├── Wait for VRG creation ├── Wait for VolumeReplication resources └── Verify initial replication 9. Verify DR Readiness ├── Check DRPC conditions: │ ├── PeerReady: True │ └── ClusterDataProtected: True ├── Verify mirroring status └── Verify replication destinations ``` ### Deployment Class Hierarchy ``` Deployment (base class) ├── do_deploy_rdr() │ └── Calls get_multicluster_dr_deployment() │ └── get_rdr_conf() └── Returns DR configuration dict MultiClusterDROperatorsDeploy (base DR class) ├── deploy_dr_multicluster_orchestrator() ├── configure_mirror_peer() ├── deploy_dr_policy() └── enable_acm_observability() RDRMultiClusterDROperatorsDeploy (RDR-specific) └── deploy() ├── Deploy orchestrator on all ACM hubs ├── Enable MCO console plugin ├── Create ServiceExporter (4.19+) ├── Configure MirrorPeer ├── Deploy RBD DR operations ├── Enable ACM observability ├── Deploy DRPolicy ├── Validate StorageClusterPeer (4.19+) └── Configure backup (if needed) ``` ### Key Deployment Methods #### `Deployment.do_deploy_rdr()` **Location:** `ocs_ci/deployment/deployment.py:739` ```python def do_deploy_rdr(self): """Call Regional DR deploy""" if config.ENV_DATA.get("skip_dr_deployment", False): return if config.multicluster: dr_conf = self.get_rdr_conf() deploy_dr = get_multicluster_dr_deployment()(dr_conf) deploy_dr.deploy() ``` #### `RDRMultiClusterDROperatorsDeploy.deploy()` **Location:** `ocs_ci/deployment/deployment.py:4115` Main deployment orchestration for RDR setup. #### `MultiClusterDROperatorsDeploy.configure_mirror_peer()` **Location:** `ocs_ci/deployment/deployment.py:3401` Creates and validates MirrorPeer resource. #### `MultiClusterDROperatorsDeploy.deploy_dr_policy()` **Location:** `ocs_ci/deployment/deployment.py:3583` Creates DRPolicy with cluster relationships. --- (important-design-pieces)= ## Important Design Pieces ### 1. Asynchronous Replication **Scheduling Interval**: Defines RPO (Recovery Point Objective) - Default: 5 minutes - IBM Cloud Managed: 10 minutes - Configurable via DRPolicy **Replication Flow**: ```mermaid sequenceDiagram participant App as Application participant PVC as Primary PVC participant RBD as RBD Mirror Daemon participant Snap as Snapshot participant SecPVC as Secondary PVC App->>PVC: Write data PVC->>RBD: Capture changes Note over RBD: Continuous monitoring loop Every Scheduling Interval RBD->>Snap: Create snapshot Snap->>SecPVC: Replicate snapshot SecPVC->>SecPVC: Apply changes end Note over PVC,SecPVC: Async Replication (5-10 min RPO) ``` ### 2. Failover vs Relocate #### Failover (Disaster Scenario) - **Trigger**: Primary cluster unavailable - **Action**: `spec.action: Failover` - **Target**: `spec.failoverCluster` - **Process**: 1. Detect primary cluster failure 2. Update DRPC with failover action 3. Promote secondary VRG to primary 4. Start application on secondary 5. Delete resources from primary (when available) #### Relocate (Planned Migration) - **Trigger**: Planned move to another cluster - **Action**: `spec.action: Relocate` - **Target**: `spec.preferredCluster` - **Process**: 1. Ensure both clusters healthy 2. Update DRPC with relocate action 3. Quiesce application on current primary 4. Ensure final sync complete 5. Promote new primary VRG 6. Start application on new primary 7. Demote old primary VRG to secondary ### 3. VolumeReplicationGroup (VRG) **Purpose**: Groups PVCs for coordinated replication **States**: - `Primary`: Active cluster with read/write access - `Secondary`: Standby cluster receiving replicated data **Key Responsibilities**: - Manage VolumeReplication resources - Coordinate snapshots - Handle promotion/demotion - Manage PVC protection ### 4. Consistency Groups (4.21+) **Purpose**: Ensure crash-consistent snapshots across multiple PVCs **Configuration**: ```python # Enabled by default in RDR mode for 4.21+ cg_enabled = config.ENV_DATA.get("cg_enabled", True) ``` **Benefits**: - Application-consistent backups - Coordinated snapshot timing - Reduced RPO for multi-PVC applications ### 5. OSD Deployment Modes #### Greenfield (4.14-4.17) ```yaml metadata: annotations: ocs.openshift.io/clusterIsDisasterRecoveryTarget: "true" ``` - OSDs deployed with `bluestore-rdr` store type - Optimized for DR workloads - Automatic configuration #### Brownfield - Existing OSD deployments - Standard bluestore - Manual DR configuration ### 6. Hub Recovery and Backup **Backup Components**: - ACM resources - DR policies - Cluster configurations **Backup Schedule**: - Resource: `schedule-acm` - Namespace: ACM namespace - Policy: `backup-restore-enabled` **Recovery Process**: ``` configure_rdr_hub_recovery() ├── Create backup schedule ├── Validate DPA (Data Protection Application) └── Verify policy compliance ``` --- (configuration-and-constants)= ## Configuration and Constants ### Key Constants #### Mode and Policy ```python RDR_MODE = "regional-dr" RDR_REPLICATION_POLICY = "async" RDR_DR_POLICY_IBM_CLOUD_MANAGED = "odr-policy-10m" ``` #### OSD Deployment ```python RDR_OSD_MODE_GREENFIELD = "greenfield" RDR_OSD_MODE_BROWNFIELD = "brownfield" ``` #### Storage Classes ```python RDR_VOLSYNC_CEPHFILESYSTEM_SC = "ocs-storagecluster-cephfs-vrg" RDR_CUSTOM_RBD_POOL = "rdr-test-storage-pool" RDR_CUSTOM_RBD_STORAGECLASS = "rbd-cnv-custom-sc" ``` #### Namespaces ```python DR_DEFAULT_NAMESPACE = "openshift-dr-system" DR_OPS_NAMESPACE = "openshift-dr-ops" # For discovered apps ``` #### Labels ```python TOKEN_EXCHANGE_AGENT_LABEL = "app=token-exchange-agent" RBD_MIRROR_APP_LABEL = "app=rook-ceph-rbd-mirror" RAMEN_DR_CLUSTER_OPERATOR_APP_LABEL = "app=ramen-dr-cluster" RDR_VM_PROTECTION_LABEL = "ramendr.openshift.io/k8s-resource-selector" ``` #### Templates ```python MIRROR_PEER_RDR = "ocs_ci/templates/multicluster/mirror_peer_rdr.yaml" DR_POLICY_YAML = "ocs_ci/templates/multicluster/dr_policy_hub.yaml" ``` ### Cluster Roles ```python RDR_ROLES = ["ActiveACM", "PrimaryODF", "SecondaryODF"] # Optional for dual-hub scenarios # RDR_ROLES.append("PassiveACM") ``` ### Upgrade Order RDR has a specific upgrade sequence: ```python UPGRADE_TEST_ORDER = { ORDER_OCP_UPGRADE: 1, # OCP upgrade ORDER_OCS_UPGRADE: 2, # ODF upgrade ORDER_MCO_UPGRADE: 3, # Multicluster Orchestrator ORDER_DR_HUB_UPGRADE: 4, # DR Hub operator ORDER_ACM_UPGRADE: 5, # ACM upgrade } ``` **Upgrade Sequence**: 1. ACM Hub OCP upgrade 2. Primary managed cluster OCP upgrade 3. Secondary managed cluster OCP upgrade 4. Primary ODF upgrade 5. Secondary ODF upgrade 6. ACM MCO operator upgrade 7. ACM DR Hub operator upgrade 8. Primary/Secondary DR cluster operator upgrade (automatic) 9. ACM upgrade (if selected) ### Configuration Parameters ```python # DR configuration dictionary dr_conf = { "rbd_dr_scenario": True/False, # Enable RBD DR "cephfs_dr_scenario": True/False, # Enable CephFS DR } # Environment variables ENV_DATA = { "skip_dr_deployment": False, "rdr_osd_deployment_mode": "greenfield", "cg_enabled": True, "submariner_source": "upstream", "configure_acm_to_import_mce": False, } # Multicluster configuration MULTICLUSTER = { "multicluster_mode": "regional-dr", "dr_cluster_relations": [ ["primary-cluster", "secondary-cluster"] ], } ``` --- ## Testing and Validation ### Running RDR Deployment and Tests #### Deployment Command To deploy RDR infrastructure across three clusters (ACM Hub, Primary ODF, Secondary ODF), use the following `run-ci` command: ```bash run-ci \ multicluster 3 tests/ \ -m deployment \ --deploy \ --ocsci-conf conf/ocsci/multicluster_mode_rdr.yaml \ --color=yes \ --squad-analysis \ --cluster1 \ --cluster-name acm-hub-cluster \ --cluster-path /home/user/clusters/acm-hub-cluster/openshift-cluster-dir \ --ocp-version 4.17 \ --ocs-version 4.17 \ --osd-size 512 \ --ocsci-conf conf/deployment/aws/ipi_3az_rhcos_compactmode_3m_0w.yaml \ --ocsci-conf conf/ocsci/multicluster_active_acm_cluster.yaml \ --ocsci-conf conf/ocsci/submariner_downstream.yaml \ --ocsci-conf conf/ocsci/multicluster_dr_rbd.yaml \ --cluster2 \ --cluster-name primary-odf-cluster \ --cluster-path /home/user/clusters/primary-odf-cluster/openshift-cluster-dir \ --ocp-version 4.17 \ --ocs-version 4.17 \ --osd-size 512 \ --ocsci-conf conf/deployment/aws/ipi_3az_rhcos_3m_3w.yaml \ --ocsci-conf conf/ocsci/multicluster_primary_cluster.yaml \ --ocsci-conf conf/ocsci/multicluster_dr_rbd.yaml \ --ocsci-conf conf/ocsci/submariner_downstream.yaml \ --cluster3 \ --cluster-name secondary-odf-cluster \ --cluster-path /home/user/clusters/secondary-odf-cluster/openshift-cluster-dir \ --ocp-version 4.17 \ --ocs-version 4.17 \ --osd-size 512 \ --ocsci-conf conf/deployment/aws/ipi_3az_rhcos_3m_3w.yaml \ --ocsci-conf conf/ocsci/multicluster_dr_rbd.yaml \ --ocsci-conf conf/ocsci/submariner_downstream.yaml ``` **Command Breakdown:** - `multicluster 3`: Deploy 3 clusters in multicluster mode - `-m deployment --deploy`: Run deployment marker and execute deployment - `--ocsci-conf conf/ocsci/multicluster_mode_rdr.yaml`: Enable RDR mode - `--cluster1`: ACM Hub cluster configuration (compact mode, 3 masters, 0 workers) - `--cluster2`: Primary ODF cluster configuration (3 masters, 3 workers) - `--cluster3`: Secondary ODF cluster configuration (3 masters, 3 workers) - `--ocsci-conf conf/ocsci/multicluster_dr_rbd.yaml`: Enable RBD DR scenario - `--ocsci-conf conf/ocsci/submariner_downstream.yaml`: Enable Submariner networking #### Running RDR Tests After deployment, run RDR tests with tier1 and rdr markers: ```bash run-ci \ multicluster 3 \ -m "tier1 and rdr" \ --ocsci-conf conf/ocsci/multicluster_mode_rdr.yaml \ --color=yes \ --cluster1 \ --cluster-name acm-hub-cluster \ --cluster-path /home/user/clusters/acm-hub-cluster/openshift-cluster-dir \ --ocsci-conf conf/ocsci/multicluster_active_acm_cluster.yaml \ --cluster2 \ --cluster-name primary-odf-cluster \ --cluster-path /home/user/clusters/primary-odf-cluster/openshift-cluster-dir \ --ocsci-conf conf/ocsci/multicluster_primary_cluster.yaml \ --cluster3 \ --cluster-name secondary-odf-cluster \ --cluster-path /home/user/clusters/secondary-odf-cluster/openshift-cluster-dir \ ``` **Test Command Options:** - `-m "tier1 and rdr"`: Run tests marked with both tier1 and rdr markers - Test path: `tests/functional/disaster-recovery/regional-dr/` for all RDR tests - Specific test: Add test file and method name for targeted testing ### Test Categories 1. **Failover Tests** - `test_failover.py` - Primary cluster down scenarios - Primary cluster up scenarios - RBD and CephFS interfaces 2. **Relocate Tests** - `test_relocate.py` - Planned migration - Application continuity 3. **Failover and Relocate** - `test_failover_and_relocate.py` - Combined scenarios - CLI and UI testing 4. **Discovered Apps** - `test_failover_and_relocate_discovered_apps.py` - Non-GitOps applications - KubeObject protection - Recipe-based backup 5. **Hub Recovery** - `test_neutral_hub_failure_and_recovery.py` - Hub cluster failure - Backup and restore 6. **Node Operations** - `test_node_operations_during_failover_relocate.py` - Node failures during DR operations - Resilience testing ### Test Markers ```python @rdr # Marks test as RDR-specific @turquoise_squad # Squad ownership @tier1 # Test tier @acceptance # Acceptance test ``` ### Validation Helpers Key validation functions in `ocs_ci/helpers/dr_helpers.py`: - `get_current_primary_cluster_name()`: Identify active cluster - `get_current_secondary_cluster_name()`: Identify standby cluster - `wait_for_mirroring_status_ok()`: Verify replication health - `wait_for_all_resources_creation()`: Verify workload deployment - `wait_for_all_resources_deletion()`: Verify cleanup - `wait_for_replication_destinations_creation()`: Verify secondary resources - `verify_last_kubeobject_protection_time()`: Validate backup timing --- ## Troubleshooting ### Common Issues 1. **MirrorPeer not reaching ExchangedSecret** - Check token-exchange-agent pods - Verify network connectivity - Check S3 secret configuration 2. **DRPolicy not Validated** - Verify both clusters are healthy - Check MirrorPeer status - Verify StorageCluster configuration 3. **Replication not working** - Check rbd-mirror pods - Verify VolumeReplication resources - Check mirroring status in Ceph 4. **Failover stuck** - Check DRPC conditions - Verify VRG state - Check for resource conflicts ### Debug Commands ```bash # Check DRPC status oc get drpc -n -o yaml # Check VRG status oc get vrg -n openshift-dr-ops -o yaml # Check MirrorPeer oc get mirrorpeer -o yaml # Check DRPolicy oc get drpolicy -o yaml # Check replication status oc get volumereplication -n # Check Ceph mirroring ceph rbd mirror pool status ``` --- ## References ### Key Files - **Deployment**: `ocs_ci/deployment/deployment.py` - **Multicluster Deployment**: `ocs_ci/deployment/multicluster_deployment.py` - **DR Helpers**: `ocs_ci/helpers/dr_helpers.py` - **Constants**: `ocs_ci/ocs/constants.py` - **DRPC Resource**: `ocs_ci/ocs/resources/drpc.py` - **ACM Integration**: `ocs_ci/ocs/acm/acm.py` - **Submariner**: `ocs_ci/deployment/acm.py` ### Documentation - Red Hat Advanced Cluster Management for Kubernetes - OpenShift Data Foundation Documentation - Ramen DR Operator Documentation - Submariner Documentation --- ## Summary RDR in ocs-ci provides a comprehensive framework for testing Regional Disaster Recovery scenarios in OpenShift Data Foundation. The architecture supports: - **Asynchronous replication** between geographically distributed clusters - **Automated failover** for disaster scenarios - **Planned relocate** for maintenance and optimization - **Multiple workload types**: Subscriptions, ApplicationSets, Discovered Apps - **Storage flexibility**: RBD and CephFS support - **Consistency groups** for multi-PVC applications - **Hub recovery** for ACM cluster failures The deployment flow is fully automated through ocs-ci, enabling comprehensive testing of DR scenarios across different ODF versions, platforms, and configurations.