Operational resilience in insurance: Top tips for doing it well (Part two)
Rob McHugh
In part one of our blog series, we looked at the regulation in focus and the importance of a robust operating model in meeting its requirements.
In this blog, we explore the key ways in which organisations can turn the regulatory challenge of completing remediation activities into an opportunity ahead of the March 2025 deadline.
Embrace cloud to boost resilience
When considering how to achieve the desired outcomes for their operational resilience programme, firms should consider what role embracing cloud infrastructure can play. In a paper on how financial services organisations can increase operational resilience, Google has identified Cybersecurity, Pandemics, Environmental and Infrastructure Risk, Geopolitical Risk, Third Party Risk, and Technology Risk as key operational risks that cloud infrastructure can help to mitigate. The paper states that there is growing recognition of the role that cloud can play in reducing risk for financial services firms through improvements to resilience.
A well-defined cloud strategy can help firms obtain greater availability, improved security, superior scalability and accessibility, and cost-efficiency through economies of scale. A multi-cloud strategy, which involves adopting a mixture of services from two or more cloud providers and sharing workloads between them, can allow firms to choose the services that provide value and flexibility. At the same time, firms can be confident that they have a workable exit strategy from any single provider, which limits concentration risk.
Our recent whitepaper, Cloud optimisation best practices: Finding your Cloud’s silver lining, explains that cloud operators are trying to make portability between clouds easier with the creation of managed Kubernetes-based services. This allows workloads to run across multiple cloud platforms and apps to keep running without interruption, even if one service goes down. As a result, concentration risk is mitigated.
Although multi-cloud platforms offer several advantages, they can be difficult to design and deploy successfully. The importance of getting the right advice when embarking on a multi-cloud strategy, to avoid costly delays and redesigns to your implementation, should therefore not be underestimated. Organisations should take a holistic approach when implementing cloud. It is not enough to simply rely on the technical implementation, particularly when attempting to build operational resilience into the cloud operating model. Organisations should consider the design of the organisation across the eight layers mentioned earlier to support and enable the adoption of cloud across the firm.
How operational excellence can help you succeed
In its discussion paper on operational resilience, the PRA sets out the criteria that should be applied when identifying the processes followed to perform a key business service. The identified processes should be those considered necessary for delivering the key business service. Indicators such as resource capacity, productivity, turnaround time, and the quality of the outputs of each operational process can be used to understand the current efficiency of the process, and whether there are vulnerabilities to be addressed.
When mapping impacted processes and identifying any vulnerabilities, organisations can use the principles behind operational excellence to guide their remedial activities. Operational excellence encourages process efficiency, which aims to align the delivery of processes with customer demand and helps reduce waste across the end-to-end value stream. Process efficiency looks to optimise the capacity of the whole organisation by focussing on the above indicators: ensuring processes maximise resource capacity, increasing the productivity of each process, optimising the turnaround time, and improving the quality of output.
When mapping existing operational processes and assessing their resilience, organisations should also capture potential process improvements. This will allow organisations to begin proactively planning their remediation activities by identifying where they can streamline their operations and improve or remove inefficient processes.
Set accurate impact tolerances and carry out meaningful scenario testing
Once the key business services have been identified, firms can set impact tolerances for each business service. This should include metrics that will allow them to assess the current level of disruption and what an acceptable level would be. Metrics can include the duration of an outage, the number of customers who will be impacted, and the monetary cost of any disruption.
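As an illustration only, the three example metrics above could be recorded and checked along the following lines. This is a minimal sketch, not a prescribed regulatory format: the class names, field names, service name, and threshold values are all hypothetical, and each firm would need to choose its own metrics and units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImpactTolerance:
    """Hypothetical impact tolerance for one key business service."""
    service: str
    max_outage_minutes: float
    max_customers_impacted: int
    max_cost_gbp: float


@dataclass(frozen=True)
class Disruption:
    """Hypothetical record of an observed (or simulated) disruption."""
    service: str
    outage_minutes: float
    customers_impacted: int
    cost_gbp: float


def breached_metrics(tolerance: ImpactTolerance, disruption: Disruption) -> list[str]:
    """Return the names of the metrics on which the disruption exceeds tolerance."""
    if tolerance.service != disruption.service:
        raise ValueError("Tolerance and disruption refer to different services")
    checks = {
        "outage_minutes": disruption.outage_minutes > tolerance.max_outage_minutes,
        "customers_impacted": disruption.customers_impacted > tolerance.max_customers_impacted,
        "cost_gbp": disruption.cost_gbp > tolerance.max_cost_gbp,
    }
    return [name for name, breached in checks.items() if breached]


# Illustrative values only.
tolerance = ImpactTolerance("claims_payment", 240, 1000, 50_000)
disruption = Disruption("claims_payment", 300, 800, 60_000)
print(breached_metrics(tolerance, disruption))
```

Keeping each metric as a separate check, rather than a single pass or fail flag, makes it easier to see which dimension of the tolerance was exceeded.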
Firms should also consider how they will go on to monitor each of these key business services, who needs to be informed if an impact tolerance is breached, and who is responsible for remediation in the event of service disruption. Each of these considerations will help to define a comprehensive scenario testing plan.
There is an expectation from regulators that firms frequently test their ability to deliver key business services to ensure that set tolerance levels are not breached in the event of service disruption.
During testing, insurance firms need to identify a range of scenarios where they may feasibly exceed their tolerance levels against the key metrics for each service and use the results to identify vulnerabilities. Following this testing exercise, regulators expect firms to take action to remediate these vulnerabilities to improve their resilience to future disruption.
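A testing exercise of this kind could be sketched as follows. This is a simplified illustration under stated assumptions: it uses outage duration as the only metric, and the scenario names, service names, and hour values are hypothetical rather than drawn from any regulatory guidance.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    """A scenario in which a service's projected outage exceeds its tolerance."""
    scenario: str
    service: str
    projected_hours: float
    tolerance_hours: float


def run_scenarios(
    tolerances: dict[str, float],
    scenarios: dict[str, dict[str, float]],
) -> list[Finding]:
    """Compare each scenario's projected outage per service against its tolerance.

    Returns the breaches, largest excess first, as candidates for remediation.
    Raises KeyError if a scenario references a service with no tolerance set.
    """
    findings = []
    for scenario, projections in scenarios.items():
        for service, projected in projections.items():
            limit = tolerances[service]
            if projected > limit:
                findings.append(Finding(scenario, service, projected, limit))
    # Largest excess over tolerance first.
    findings.sort(key=lambda f: f.projected_hours - f.tolerance_hours, reverse=True)
    return findings


# Illustrative values only: maximum tolerable outage in hours per service.
example_tolerances = {"claims_payment": 24.0, "policy_renewal": 48.0}
example_scenarios = {
    "data_centre_loss": {"claims_payment": 36.0, "policy_renewal": 40.0},
    "third_party_outage": {"claims_payment": 12.0, "policy_renewal": 72.0},
}

for finding in run_scenarios(example_tolerances, example_scenarios):
    print(finding)
```

Ordering the findings by how far they exceed tolerance is one possible way to prioritise remediation; a firm might equally weight findings by customer harm or by the plausibility of the scenario.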
Accurate tolerances cannot be set without a clear understanding of the processes that are in scope. An accurate ‘As-Is’ assessment of the organisation’s operating model will show how an operational outage in one area may impact other business functions. This will help to identify which teams need to be included in any threshold testing exercises.
By embracing cloud, organisations gain access to a wide range of new metrics, traces, and data points about their IT estate, which can form a crucial part of any impact tolerances. Cloud technologies allow firms to observe how their cloud estate is performing at both the infrastructure and application level. Without this visibility, it becomes very difficult to understand where and why issues are occurring, and therefore the full impact of any outage.
In a nutshell
Operational resilience regulation is coming, and the complex nature of the insurance market means its impacts will be significant. By following the advice set out in this blog series, embracing the change, and using it as an opportunity to improve operational processes, firms can obtain an edge as ‘the most reliable’ offering on the market.