Operational Resilience in Financial Institutions: Regulatory Requirements and Best Practices

By Dr. Robert Chen, Operational Risk Institute

Introduction

Operational resilience has become a central regulatory priority following high-profile service disruptions and cyber incidents. This research examines the main regulatory frameworks, the effectiveness of current approaches, and evolving best practices.

Regulatory Frameworks

UK Operational Resilience Requirements

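The most comprehensive framework, in force since March 2022, with firms required to operate within their impact tolerances by the end of the transition period in March 2025: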

Key Requirements:

  • Identify important business services (IBS)
  • Set impact tolerances for each IBS
  • Map resources and dependencies
  • Test resilience through scenario analysis
  • Document in operational resilience self-assessment

Scope: All PRA and FCA-regulated firms above £50 million assets

Implementation Experience:

  • 98% of in-scope firms completed mapping by deadline
  • Average cost: £12-45 million for large institutions
  • Firms identified, on average, 23% more critical dependencies than initially estimated

EU DORA (Digital Operational Resilience Act)

Implemented January 2025:

  • Focus on ICT risk management
  • Third-party risk oversight
  • Digital operational resilience testing
  • ICT incident reporting
  • Information sharing on cyber threats

Scope: All financial entities and critical ICT providers

US Regulatory Approach

More fragmented with sector-specific requirements:

Banking (OCC, Federal Reserve, FDIC):

  • Operational risk management expectations
  • Business continuity planning requirements
  • Cyber risk management standards
  • Recovery and resolution planning

Securities (SEC):

  • Regulation Systems Compliance and Integrity (Reg SCI)
  • Cybersecurity requirements for market infrastructure

Important Business Services Identification

Methodology

Institutions must identify the services whose disruption could cause intolerable harm to customers, market integrity, or the firm's own safety and soundness:

Common Criteria:

  • Customer impact (number affected, duration, magnitude)
  • Market impact (systemic importance, market function)
  • Regulatory impact (compliance obligations)
  • Financial impact (revenue, losses)

Typical IBS Categories:

  1. Payment Services: Domestic and international payments
  2. Lending: Credit origination and servicing
  3. Custody: Asset safekeeping and administration
  4. Trading: Market access and execution
  5. Data Services: Regulatory reporting, customer information

Industry Benchmarks

Survey of 85 major financial institutions reveals:

  • Average 15 identified IBS (range: 8-27)
  • Payment services identified as an IBS by all respondents
  • 78% include retail banking operations
  • 67% include institutional trading and clearing

Impact Tolerance Setting

Regulatory Expectations

Impact tolerances must reflect:

  • Maximum tolerable level of disruption
  • Realistic assessment of stakeholder tolerance
  • Alignment with risk appetite
  • Measurable metrics

Example Tolerances:

  • Retail Payments: Maximum 4-hour disruption affecting more than 100,000 customers
  • Corporate Lending: Maximum 24-hour disruption in loan approval process
  • Trading Execution: Maximum 30-minute disruption for institutional clients
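
The example tolerances above can be expressed as measurable checks. The following is a minimal sketch in Python, not a prescribed method: the class names, field names, and the `breaches` function are hypothetical, and it assumes one reading of the retail payments example, namely that a breach occurs only when more than 100,000 customers are disrupted for longer than 4 hours.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class ImpactTolerance:
    """Hypothetical representation of one impact tolerance."""
    service: str
    max_duration: timedelta
    # Duration limit applies only above this customer count;
    # None means any disruption to the service counts.
    customer_threshold: Optional[int] = None


@dataclass(frozen=True)
class Incident:
    service: str
    duration: timedelta
    customers_affected: int


def breaches(tol: ImpactTolerance, inc: Incident) -> bool:
    """Return True if the incident exceeds the tolerance (assumed semantics)."""
    if inc.service != tol.service:
        raise ValueError("incident and tolerance refer to different services")
    in_scope = (
        tol.customer_threshold is None
        or inc.customers_affected > tol.customer_threshold
    )
    return in_scope and inc.duration > tol.max_duration


# Tolerances taken from the examples in the text.
retail = ImpactTolerance("Retail Payments", timedelta(hours=4), customer_threshold=100_000)
trading = ImpactTolerance("Trading Execution", timedelta(minutes=30))

print(breaches(retail, Incident("Retail Payments", timedelta(hours=5), 250_000)))
print(breaches(trading, Incident("Trading Execution", timedelta(minutes=20), 12)))
```

Encoding the tolerance as data rather than prose also makes it easier to measure compliance during an actual incident, which the survey below cites as a common difficulty.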

Calibration Challenges

Institutions report difficulties:

  • Balancing ambition with realism (63% cite as major challenge)
  • Quantifying customer tolerance (71%)
  • Board approval for appropriately stringent tolerances (42%)
  • Measuring compliance with tolerance during actual incidents (58%)

Mapping and Dependency Analysis

Scope of Mapping

Comprehensive mapping includes:

  • People: Skills, locations, key personnel
  • Processes: Critical workflows, procedures
  • Technology: Systems, infrastructure, data
  • Facilities: Offices, data centers, third-party sites
  • Information: Data sources, flows, storage

Third-Party Dependencies

Particular regulatory focus on outsourcing and third parties:

Common Dependencies:

  • Cloud services (AWS, Azure, Google Cloud): 89% of institutions
  • Payment infrastructure: 100%
  • Data vendors: 95%
  • Cybersecurity services: 76%

Concentration Risks:

  • Top 3 cloud providers serve 94% of major financial institutions
  • Single points of failure identified in 67% of initial mappings
  • Average institution has 847 third-party relationships (412 deemed critical)
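
Single points of failure and provider concentration can both be read off a dependency map once it is held as structured data. The sketch below is illustrative only: the map layout (service, then capability, then list of providers), the provider names, and both function names are assumptions made for the example, not a format used by any particular mapping tool.

```python
from collections import defaultdict

# Hypothetical mapping: service -> capability -> providers able to deliver it.
dependency_map = {
    "Retail Payments": {
        "cloud hosting": ["Cloud A", "Cloud B"],
        "payment scheme gateway": ["Gateway X"],
    },
    "Trading Execution": {
        "cloud hosting": ["Cloud A"],
        "market data": ["Vendor M", "Vendor N"],
    },
}


def single_points_of_failure(dep_map):
    """Return sorted (service, capability, provider) triples with only one provider."""
    return sorted(
        (service, capability, providers[0])
        for service, capabilities in dep_map.items()
        for capability, providers in capabilities.items()
        if len(providers) == 1
    )


def provider_concentration(dep_map):
    """Count how many distinct services rely on each provider."""
    services_by_provider = defaultdict(set)
    for service, capabilities in dep_map.items():
        for providers in capabilities.values():
            for provider in providers:
                services_by_provider[provider].add(service)
    return {provider: len(services) for provider, services in services_by_provider.items()}


for triple in single_points_of_failure(dependency_map):
    print("Single point of failure:", triple)
print(provider_concentration(dependency_map))
```

A capability with a single provider is only the simplest form of single point of failure; shared infrastructure behind nominally separate providers would need a deeper map than this sketch holds.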

Mapping Tools and Technology

Approaches:

  1. Manual Workshops: Cross-functional teams document dependencies
  2. Automated Discovery: Network mapping, asset inventory tools
  3. Process Mining: Analyze actual system interactions
  4. Hybrid: Combine methods for comprehensive view

Technology Adoption:

  • 73% use dedicated dependency mapping software
  • Average implementation cost: £2-8 million
  • Popular tools: ServiceNow, Fusion Risk, Archer

Scenario Testing

Testing Requirements

Regulators require regular testing through severe but plausible scenarios:

  • UK Guidance: Test at least annually, with more frequent testing for high-risk IBS
  • EU DORA: Advanced testing every 3 years minimum (threat-led penetration testing)

Scenario Types

Cyber Attacks:

  • Ransomware affecting critical systems
  • Distributed denial of service (DDoS)
  • Data exfiltration
  • Supply chain compromise

Technology Failures:

  • Cloud provider outage
  • Data center loss
  • Network connectivity loss
  • Critical application failure

People/Facilities:

  • Loss of critical facility (fire, flood, etc.)
  • Pandemic preventing office access
  • Loss of key personnel/skills

Third-Party Failures:

  • Critical vendor service disruption
  • Payment infrastructure failure
  • Data provider outage

Testing Methodologies

  • Desktop Exercises: Discussion-based scenario walk-throughs (lowest cost, least disruptive)
  • Simulations: Realistic enactment without actual disruption (moderate cost and disruption)
  • Live Testing: Actual failover to backup systems (highest cost and risk, most realistic)

Industry Practice:

  • 100% conduct desktop exercises
  • 78% conduct simulations
  • 34% conduct live testing for critical systems

Incident Response and Recovery

Response Structures

Tiered Approach:

  1. Level 1: Technical teams respond to incidents within tolerance
  2. Level 2: Operational resilience team coordinates cross-functional response
  3. Level 3: Executive crisis management for major incidents
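
One way to make the tiered approach operational is to tie escalation to how much of the impact tolerance an incident has consumed. The sketch below is a hypothetical rule, not one stated in regulation or drawn from the survey: the `escalation_level` function, the 50% warning fraction, and the use of the number of affected services as a trigger are all assumptions chosen for illustration.

```python
from datetime import timedelta


def escalation_level(elapsed, tolerance, services_affected, *, warn_fraction=0.5):
    """Map an ongoing incident to response Level 1, 2, or 3 (assumed rule).

    Level 3: the impact tolerance has been reached or exceeded.
    Level 2: more than one service is affected, or at least
             warn_fraction of the tolerance has been consumed.
    Level 1: everything else (handled by technical teams).
    """
    if tolerance <= timedelta(0):
        raise ValueError("tolerance must be positive")
    if elapsed >= tolerance:
        return 3
    if services_affected > 1 or elapsed >= warn_fraction * tolerance:
        return 2
    return 1


four_hours = timedelta(hours=4)
print(escalation_level(timedelta(minutes=30), four_hours, services_affected=1))
print(escalation_level(timedelta(hours=2), four_hours, services_affected=1))
print(escalation_level(timedelta(hours=4), four_hours, services_affected=1))
```

In practice an institution would likely escalate on projected rather than elapsed duration, so that Level 3 is engaged before the tolerance is actually breached; the sketch uses elapsed time only to stay simple.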

Key Roles:

  • Incident Commander: Overall coordination
  • Communications: Internal and external messaging
  • Technical Lead: Restoration activities
  • Business Continuity: Workaround implementation
  • Legal/Compliance: Regulatory notifications

Recovery Time Objectives

Institutions must maintain recovery capabilities within impact tolerances:

Typical RTOs:

  • Tier 1 Systems (Payment infrastructure, trading): 1-4 hours
  • Tier 2 Systems (Customer service, lending): 4-24 hours
  • Tier 3 Systems (Reporting, analytics): 24-72 hours

Achievement Rates:

  • 92% of tested scenarios meet RTO for Tier 1
  • 87% for Tier 2
  • 76% for Tier 3
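
Achievement rates of this kind can be derived from test logs by comparing each recovery time with its tier's objective. The following is a minimal sketch under stated assumptions: it takes the upper bound of each typical RTO range above as the limit, and the `achievement_rates` function and the sample `results` are invented for illustration, not survey data.

```python
from collections import defaultdict

# Assumed limits: upper bound of each tier's typical RTO range, in hours.
RTO_LIMIT_HOURS = {1: 4, 2: 24, 3: 72}


def achievement_rates(test_results):
    """Return {tier: percentage of tests recovering within the tier's RTO limit}.

    test_results is an iterable of (tier, recovery_hours) pairs.
    """
    met = defaultdict(int)
    total = defaultdict(int)
    for tier, recovery_hours in test_results:
        total[tier] += 1
        if recovery_hours <= RTO_LIMIT_HOURS[tier]:
            met[tier] += 1
    return {tier: round(100 * met[tier] / total[tier], 1) for tier in sorted(total)}


# Illustrative test outcomes only.
results = [(1, 2.5), (1, 3.9), (1, 5.0), (1, 1.0), (2, 20.0), (2, 30.0), (3, 48.0)]
print(achievement_rates(results))  # {1: 75.0, 2: 50.0, 3: 100.0}
```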

Governance and Oversight

Board Responsibilities

Regulators emphasize board ownership:

Required Activities:

  • Approve IBS and impact tolerances
  • Review resilience testing results
  • Oversee material incidents
  • Ensure adequate investment in resilience

Time Commitment:

  • Dedicated resilience agenda item at least quarterly
  • Annual deep-dive on resilience strategy
  • Immediate notification of tolerance breaches

Three Lines of Defense

First Line (Business Operations):

  • Implement resilience controls
  • Monitor impact tolerances
  • Conduct business continuity planning

Second Line (Risk and Compliance):

  • Set standards and policies
  • Independent assessment of resilience
  • Regulatory liaison

Third Line (Internal Audit):

  • Independent assurance on resilience framework
  • Testing of crisis management capabilities
  • Assessment of governance effectiveness

Cost and Investment

Implementation Costs

Large Institutions (>$100B assets):

  • Initial implementation: $30-75 million
  • Annual ongoing: $15-35 million
  • Primary costs: Consulting (35%), technology (40%), personnel (25%)
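
As a worked illustration of the cost split, the percentages can be applied to the ends of the initial implementation range. This sketch assumes the 35/40/25 split applies to initial implementation spend, which the figures above do not state explicitly; the variable and function names are hypothetical.

```python
# Figures from the text, in millions of US dollars.
INITIAL_RANGE_M = (30, 75)
COST_SHARE_PCT = {"consulting": 35, "technology": 40, "personnel": 25}


def split_range(low_high, shares_pct):
    """Apply each percentage share to the low and high ends of a cost range."""
    low, high = low_high
    return {
        category: (low * pct / 100, high * pct / 100)
        for category, pct in shares_pct.items()
    }


for category, (low, high) in split_range(INITIAL_RANGE_M, COST_SHARE_PCT).items():
    print(f"{category}: ${low}M to ${high}M")
# consulting: $10.5M to $26.25M
# technology: $12.0M to $30.0M
# personnel: $7.5M to $18.75M
```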

Medium Institutions ($10B-$100B assets):

  • Initial: $8-20 million
  • Annual: $3-8 million

Community Banks (<$10B assets):

  • Initial: $1-3 million
  • Annual: $0.5-1.5 million

Return on Investment

Risk Reduction:

  • 67% reduction in incidents exceeding impact tolerance (post-implementation)
  • 43% faster recovery from incidents
  • 82% improvement in third-party risk awareness

Operational Benefits:

  • Improved process documentation
  • Enhanced understanding of technology dependencies
  • Better third-party management
  • Strengthened crisis management capabilities

Regulatory Benefits:

  • Reduced supervisory scrutiny for institutions with strong resilience
  • Avoidance of enforcement actions
  • Competitive advantage in bid processes requiring resilience evidence

Challenges and Emerging Issues

Data Quality

Maintaining accurate dependency maps requires ongoing effort:

  • Changes occur continuously (system upgrades, new vendors, staff changes)
  • Average 18% annual turnover in technology dependencies
  • Configuration management databases often incomplete or outdated

Solutions:

  • Automated discovery and monitoring
  • Change management integration
  • Regular validation exercises
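
A validation exercise can be as simple as comparing what the dependency map records against what automated discovery actually observes. The sketch below shows that comparison in a hedged form: the `map_drift` function and the component names are hypothetical, and a real exercise would also need to reconcile naming differences between the two sources.

```python
def map_drift(recorded, discovered):
    """Compare recorded dependencies with those found by discovery.

    Returns components seen in discovery but absent from the map,
    and components still in the map that discovery no longer sees.
    """
    recorded, discovered = set(recorded), set(discovered)
    return {
        "missing_from_map": sorted(discovered - recorded),
        "stale_in_map": sorted(recorded - discovered),
    }


# Illustrative component names only.
recorded = {"core-banking-db", "fx-rates-feed", "legacy-batch-host"}
discovered = {"core-banking-db", "fx-rates-feed", "fraud-scoring-api"}

print(map_drift(recorded, discovered))
# {'missing_from_map': ['fraud-scoring-api'], 'stale_in_map': ['legacy-batch-host']}
```

Running a comparison like this as part of change management, rather than once a year, keeps the gap between the map and reality small enough to review by hand.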

Testing Realism

Gap between test scenarios and actual incidents:

  • Tests often optimistic on recovery capabilities
  • Stress conditions (multiple simultaneous failures) underexamined
  • Human factors (fatigue, stress, confusion) difficult to simulate

Improvements:

  • Red team testing with adversarial approach
  • Unannounced exercises
  • Extreme scenario testing

Third-Party Risk

Limited influence over third-party resilience:

  • Cannot mandate testing or access facilities
  • Contractual rights often insufficient
  • Systemic concentration risks

Approaches:

  • Industry utilities for critical services
  • Regulatory oversight of critical service providers (EU DORA approach)
  • Diversification strategies
  • Enhanced due diligence

International Cooperation

Cross-Border Resilience

Global institutions face challenges coordinating across jurisdictions:

  • Different regulatory requirements
  • Time zone complications for global recovery
  • Legal/regulatory barriers to data and system access

Solutions:

  • Regional crisis management hubs
  • Follow-the-sun recovery coordination
  • Regulatory cooperation on crisis management

Information Sharing

Industry initiatives to share resilience insights:

Financial Services Information Sharing and Analysis Center (FS-ISAC):

  • Cyber threat intelligence
  • Incident coordination
  • 7,000+ member institutions

Operational Resilience Working Groups:

  • Best practice sharing
  • Common standards development
  • Regulatory engagement

Future Evolution

Regulatory Developments

Expected requirements:

  • Extension to smaller institutions
  • Enhanced third-party oversight
  • Greater emphasis on cyber resilience
  • Climate resilience integration

Technology Trends

Cloud Architecture:

  • Multi-cloud strategies for resilience
  • Cloud-native applications with built-in resilience
  • Automated failover and recovery

Artificial Intelligence:

  • AI-driven incident detection and response
  • Predictive analytics for resilience weak points
  • Automated runbook execution

Quantum Computing:

  • Threat to current encryption (quantum readiness)
  • Opportunity for complex scenario modeling

Best Practices

Leading Institutions Demonstrate

  1. Resilience-by-Design: Build resilience into systems from inception
  2. Continuous Testing: Regular, realistic scenario testing
  3. Vendor Management Excellence: Rigorous third-party oversight
  4. Cultural Embedding: Resilience as everyone's responsibility
  5. Investment Commitment: Adequate resources aligned with risk
  6. Transparent Reporting: Proactive engagement with regulators
  7. Learning Mindset: Continuous improvement from incidents and near-misses

Recommendations

For Financial Institutions

  1. Start with IBS Identification: Foundation for entire framework
  2. Realistic Impact Tolerances: Avoid overly optimistic assessments
  3. Comprehensive Mapping: Investment pays off in crisis
  4. Regular Testing: Don't wait for actual incidents
  5. Board Engagement: Ensure senior leadership ownership
  6. Third-Party Focus: Concentration risks often underestimated

For Regulators

  1. Proportionality: Scale requirements to institution size and complexity
  2. Outcomes Focus: Emphasize actual resilience over compliance documentation
  3. Cross-Border Coordination: Harmonize requirements for global institutions
  4. Third-Party Oversight: Address systemic concentration risks
  5. Innovation Support: Allow experimentation with new resilience approaches

Conclusion

Operational resilience has transitioned from optional best practice to mandatory regulatory requirement. Evidence suggests frameworks are effective in improving preparedness, though implementation remains challenging and costly.

Success requires sustained commitment from board level through front-line operations, significant investment in technology and capabilities, and ongoing adaptation as threats and dependencies evolve. Institutions that treat resilience as a strategic capability rather than a compliance burden will be best positioned for the inevitable future disruptions.
