Every year, companies and governments incur significant losses due to incidents that result from their exposure to risk. If these losses are to be stemmed, then managing risk is of paramount importance. However, considering the complex environments in which most organisations operate, that is no easy task, especially in the context of a threat landscape that grows ever more sophisticated, with advancements in technology, increasing business complexity, and greater interdependencies all in the mix.
On the technological front, while digital transformation and interconnectivity are valuable, they also intensify the frequency and severity of cyber-attacks, which cost the global economy $600 billion annually at an average of $3.86 million per breach. As for business complexity, the pace of change in laws, regulations, standards, and new technologies has resulted in an increasing number of errors as people struggle to keep up.
Here, information technology departments bear the brunt as they experience a seemingly never-ending set of challenges and struggle to adapt to shifts in the legal and regulatory landscape. These dizzying complexities are costing organisations 10.2% of their annual profits on average.
For its part, the globalisation and interconnectivity of systems, people, and processes have a force multiplier effect on the impact of attacks or incidents. With the threat landscape exacerbated by these factors, inadequate risk management can have devastating consequences.
For example, the 2010 Deepwater Horizon oil rig explosion in the Gulf of Mexico killed 11 workers and caused an oil spill, which saw more than 200 million gallons of oil pour into the sea. In what was one of the worst oil disasters in human history, the 87-day spill caused irreversible damage to human life, wildlife, the environment, and local economies.
At the root of the disaster were a series of mistakes and system failures, a poor safety culture, the absence of a crisis management plan, and the underestimation of multiple red flags. Following the oil spill, British Petroleum (BP) lost 55% of its shareholder value, incurred more than $60 billion in associated costs, and was forced to divert time and resources away from its core mission in order to address myriad legal claims, clean up the environment, and work on rebuilding its corporate image.
Similarly, one of the biggest power outages of all time was the result of poor risk management and a failure of critical infrastructure testing. In August 2003, a software bug in the alarm system at the control room of FirstEnergy Corporation left operators unaware of the need to redistribute power after a high-voltage power line in northern Ohio brushed against overgrown trees. The electricity grid became overloaded, triggering a domino effect that left 50 million people across southeastern Canada and the northeastern US without power for up to two days.
Where backup generators did not fail, some essential services remained in operation, but hospitals and emergency rooms quickly filled with people in panic or in need of treatment for overheating. Cellular services were also interrupted because of the increased demand caused by the blackout. By the end of the crisis, 11 people had died and more than $6 billion of damages had been reported.
What would have been a manageable local blackout cascaded into a massive disaster that prevented FirstEnergy Corporation, and the government in general, from delivering on their mission and carrying out essential functions. A thorough risk assessment would have highlighted the need for frequent maintenance of bushes and trees, while periodic testing of the power grid system would have identified software bugs and the critical interconnectivities across power lines.
Challenges added by cybersecurity
As a major denominator in resilience, cyber risk factors need to be integrated within the organisation’s risk management, continuity management and testing and exercises programmes and systems. Business leaders have always looked to implement new technologies and innovations to enhance their day-to-day operations and gain competitive advantage. In fact, over the past decade, advancements in automation, digitalisation, artificial intelligence, machine learning, the Internet of things, and so forth, have transformed thousands of organisations, from the factory floor to the corporate board room.
However, while this degree of interconnectivity has enabled great strides in sectors and industries world-wide, it also amplifies organisations’ exposure to risk and the severity of the incidents they face. The clear majority of corporations today engage in risk management and have business continuity measures in place, but the cyber risk factors now plaguing our digital age are often conspicuously missing from the top of corporate strategic agendas.
One reason for this is that technological advances have, in some cases, transformed the business landscape so quickly that leaders are yet to develop a deep understanding of precisely how their particular business is exposed to cyber threats and their impact. In other cases, organisations have not yet been stung hard enough by cyber-attacks for resilience to be seen as a strategic priority.
Nevertheless, resilience should be an integral part of the cyber strategy and should permeate day-to-day operations. To this end, organisations need to build awareness of the importance of resilience and the need for it to be integrated seamlessly across all plans, activities, and systems relating to their risk management programmes.
The best way to prepare for the unforeseen is by assessing strategic options and tactical plans through testing and exercises. Testing and exercises unlock benefits associated with building preparedness, increasing resilience and sustaining performance. Depending on the expected outcomes, different approaches can be used, which broadly fall under two categories: discussion-based exercises and operations-based exercises.
Where cyber risks are concerned, resilience can be built through a two-way approach: top-down and bottom-up. A top-down approach integrates cyber risk factors within the Resilience Equation, while a bottom-up approach considers business and corporate resilience through a number of cybersecurity functions.
The top-down approach focuses on integrating cybersecurity factors into the risk management, continuity management, and testing and exercises programmes or systems.
As a first step, the organisation integrates cybersecurity factors within its risk management programmes. It then assesses the potential impacts of various cyber risks on the overall business before developing a clear understanding of how particular risk factors would affect specific operations. Having established these potential impacts, the organisation is able to devise actionable mitigation plans to preempt risks and address incidents swiftly, should they occur.
Next, cyber risk factors are further addressed during the business impact analysis and become an integral part of the business continuity risk assessment. Finally, the organisation conducts exercises to test readiness and preparedness for a range of cyber-threat-related scenarios.
The bottom-up approach centres on injecting business resilience requirements through five cybersecurity functions: identify, protect, detect, respond, and recover. Information technology professionals within the organisation should also apply the guiding principles of resilience (redundancy, buffering, and adaptiveness) in their day-to-day activities. Furthermore, in instances where predefined tolerance thresholds are exceeded, the organisation should ensure that the risks or threats in question are escalated to the corporate risk register.
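The escalation rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CyberRisk` class, the 0 to 100 scoring scale, the threshold values, and the example risks are all hypothetical and do not come from the source.

```python
from dataclasses import dataclass


@dataclass
class CyberRisk:
    name: str
    score: float      # hypothetical risk score on a 0-100 scale
    tolerance: float  # hypothetical predefined tolerance threshold


def risks_to_escalate(risks):
    """Return the risks whose score exceeds their tolerance threshold."""
    return [r for r in risks if r.score > r.tolerance]


# Illustrative data only; real scores would come from the IT risk assessment.
risks = [
    CyberRisk("Unpatched VPN gateway", score=72, tolerance=60),
    CyberRisk("Phishing click rate", score=35, tolerance=50),
]

# Anything over tolerance is copied into the corporate risk register.
corporate_risk_register = [r.name for r in risks_to_escalate(risks)]
print(corporate_risk_register)
```

In practice the threshold would be set per risk category by the risk owners; the point of the sketch is simply that escalation is triggered by a predefined rule rather than by individual judgement at the time of the incident.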
To successfully navigate complexity, interconnectivity, and uncertainty, organisations must establish a bespoke strategy that integrates all elements of the Resilience Equation: risk management, continuity management, and testing and exercises.
By proactively managing risk, the Resilience Equation protects the mission and reputation of an organisation. Furthermore, by establishing an effective continuity system, organisations can ensure that essential functions are restored in time when disruption or disaster strikes. Meanwhile, through systematic exercises, strategic options and tactical plans can be rigorously tested so that any entity, no matter how big or small, can be ready for the unforeseen. The risk landscape is more complex and dynamic than ever, but it is not insurmountable. For mission success in today’s world, organisations must embark on a journey to build their defences, a journey that begins with the implementation of the Resilience Equation.
What is the resilience equation
No organisation will ever be impervious to risk, but by building resilience it is possible to mitigate the severity of threats and bounce back when a negative event occurs. To become resilient, organisations must be aware of their future threats and current weaknesses, and they must take informed strategic and tactical decisions in order to prepare for risks and respond effectively to internal and external events.
Building resilience involves decreasing the likelihood of disruptive events and managing the consequences when such events do occur, as the following equation illustrates:
Resilience = Identifying risks and tackling their likelihood + Managing consequences of disruptive events
Resilience = Risk Management + Continuity Management + Testing and Exercises
Here, risk management looks at how to assess threats, vulnerabilities, and impact in order to prioritise and mitigate risks, while continuity management focuses on mission execution and begins with acceptance that some disruptions will inevitably succeed and some functionality will be lost as a result.
In addition to assessing threats and weaknesses, any resilience building efforts should also be informed by the understanding that networks are comprised of many individual nodes that may operate independently. So, while there may be many paths to expose vulnerability and many ways to fail, there are also multiple ways for a network to quickly heal itself.
Testing and exercises provide continuous review and assurance of the effectiveness of the internal control system, as well as an opportunity to practise and evaluate organisational preparedness in a simulated and controlled environment. Through testing and exercises, any gaps or control deficiencies are identified and mitigated. The information gathered from these exercises may then be used to further refine resilience, emergency, and continuity plans.
The advent of technology-based approaches to simulation enables exercises to be conducted using immersive technology in addition to traditional techniques, such as full-scale exercises. Immersive technology presents artificial environments that replicate the real world and offers the possibility to learn faster and connect with others. Through a mix of virtual and augmented reality, various scenarios, such as earthquakes, tsunamis, and fires, are simulated, and resilience is tested to ensure networks, systems, and staff are responding to stress and disruption as expected.
Continuity management system
As with comprehensive risk management programmes, there are several key elements that must be factored into the development of robust continuity management systems capable of safeguarding organisations against potential losses and disruptions while ensuring the availability of critical processes.
Governance
From a governance perspective, it is important to clearly assign responsibilities in order to ensure that continuity management capabilities are established and maintained.
Business impact analysis
Assessing the potential impact of disruption is vital. To this end, organisations should identify critical processes that will have the greatest impact if disrupted. It is also essential to determine the maximum tolerable period of disruption to critical processes before the organisation’s mission is threatened. In addition, the recovery time objective and recovery point objective must also be determined.
The recovery time objective refers to the time by which each critical process or essential function needs to be resumed in the event of disruption, while the recovery point objective is the point to which information must be restored for a given activity to operate on resumption. Also essential in the business impact analysis is the identification of key activities required to conduct the critical processes, and the quantification of the resources (people, buildings, equipment, information, third parties) required to maintain them.
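As a minimal sketch, the relationship between these business impact analysis figures can be checked mechanically. The example assumes that the recovery point objective is expressed as a maximum acceptable age of restored data and that backups run at a fixed interval; the `CriticalProcess` class, the `backup_interval` field, and the payroll figures are hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class CriticalProcess:
    name: str
    mtpd: timedelta             # maximum tolerable period of disruption
    rto: timedelta              # recovery time objective
    rpo: timedelta              # recovery point objective (assumed: max data age)
    backup_interval: timedelta  # hypothetical: how often data is backed up


def bia_findings(process):
    """List inconsistencies between the objectives set for one process."""
    findings = []
    # Recovery must be planned to finish before the mission is threatened.
    if process.rto > process.mtpd:
        findings.append("RTO exceeds MTPD")
    # Backups taken less often than the RPO cannot meet the RPO.
    if process.backup_interval > process.rpo:
        findings.append("backups too infrequent for RPO")
    return findings


# Illustrative figures only.
payroll = CriticalProcess(
    name="Payroll",
    mtpd=timedelta(hours=48),
    rto=timedelta(hours=24),
    rpo=timedelta(hours=4),
    backup_interval=timedelta(hours=12),
)
print(bia_findings(payroll))
```

Here the recovery time objective sits comfortably inside the maximum tolerable period of disruption, but a 12-hour backup cycle cannot satisfy a 4-hour recovery point objective, which is the kind of gap a business impact analysis is meant to surface.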
Risk assessment
In addition to the business impact assessment, it is also essential to conduct risk assessments in order to identify potential risks that could cause disruptions to mission delivery, and to determine the likelihood of them occurring. By assessing risk in this way, organisations can prioritise risk reduction activities.
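One common way to turn such an assessment into a priority list is to score each risk as likelihood multiplied by impact and sort on the result. The sketch below assumes that scoring rule and a 1 to 5 scale for both factors; neither is specified in the source, and the risks and numbers are illustrative.

```python
def prioritise(risks):
    """Sort risks from highest to lowest likelihood x impact score."""
    return sorted(risks, key=lambda r: r["likelihood"] * r["impact"], reverse=True)


# Hypothetical entries on an assumed 1-5 scale for likelihood and impact.
risks = [
    {"name": "Office printer outage", "likelihood": 1, "impact": 2},
    {"name": "Vegetation contact with power lines", "likelihood": 4, "impact": 3},
    {"name": "Alarm system software fault", "likelihood": 2, "impact": 5},
]

for risk in prioritise(risks):
    print(risk["name"], risk["likelihood"] * risk["impact"])
```

A multiplicative score is only one option; organisations that weight impact more heavily than likelihood, or that use qualitative bands, would substitute their own ranking function.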
Continuity strategy
A robust continuity management system should include a strategy that ensures critical processes can meet recovery time objectives and that identifies the actions needed to enable delivery of the organisation’s mission.
Continuity plans
Plans should be put in place to manage emergencies, crisis communication, continuity, and recovery of critical processes supporting the mission of the organisation.
Testing and exercises
Once the continuity management system is up and running, testing and exercises should be carried out on a regular basis to ensure that it is working effectively and efficiently over time.
Continuous improvement
Training of staff with continuity responsibilities should be conducted in tandem with testing and exercises. Meanwhile, efforts should be made to obtain constant feedback and maintain open communication to ensure ongoing improvement of the continuity management system.
Excerpted from The Resilience Equation by Booz Allen Hamilton.
Key takeaways
- Resilience efforts should be informed by the understanding that networks comprise individual nodes that may operate independently.
- While there may be many ways to fail, there are also multiple ways for a network to quickly heal itself.
- Cyber risk factors now plaguing our digital age are missing from the top of corporate strategic agendas.
- Leaders are yet to develop an understanding of how their business is exposed to cyber threats and their impact.
- Organisations have not yet been stung hard enough by cyber-attacks for resilience to be seen as a strategic priority.
- Organisations need to build awareness of resilience and the need for it to be integrated across systems.
- Cyber risk factors need to be integrated within the organisation’s risk management, continuity management and testing systems.
- Testing and exercises provide review of the effectiveness of the internal control system.
- Testing and exercises provide an opportunity to evaluate preparedness in a controlled environment.