Cyber Continuity Planning: Go Big or Go Home

Local officials and emergency managers know that they, like national governments around the world, have been caught unprepared for how continuity of operations (COOP) really plays out during a pandemic.

Some might take away tactical lessons, like the idea that staff should have access to Zoom or Microsoft Teams. But there is a larger lesson, one that addresses a fundamental lack of the imagination needed to be ready for the next emergency, which will be dangerous precisely because it is unexpected. This lesson particularly applies to the continuity planning necessary to keep local government functioning in the face of a widespread cyberattack. Specifically: It is dangerous to think small.

Let’s compare planning efforts to emergency management exercise efforts. There are many ways to exercise response plans. For the military, exercises often test the response of forces to a convergence of several incidents at once: a war … and a dust storm … and a solar flare … and so on. They intentionally stress the system and the capabilities within it. Historically, emergency management exercises tended to be much humbler affairs, with local, state, regional, and federal stakeholders exercising their existing capabilities. The objective was always, “How well can we do those things we think we can do?” There is value in this type of exercise. Performing such exercises often highlights weaknesses in plans, organization, equipment, or training. But they also lack something: a dark imagination that allows organizations to go beyond questions of “How well …?” and into the world of “Can we even …?”

Craig Fugate shook up the “humble affair” exercise paradigm, first as the director of the Florida Division of Emergency Management, and then as the administrator of the Federal Emergency Management Agency. During his time at FEMA, Fugate made clear his desire to “go big or go home” with exercises, stressing the agency beyond its breaking point. He became known for the Thunderbolt exercise series, which portrayed several minor and catastrophic incidents at once. And he expected people to perform. FEMA exercised this approach during events like Cascadia Rising, a 2016 exercise with a scenario that featured a 9.0 magnitude earthquake followed by tsunamis and aftershocks that severely damaged Washington and Oregon. Fugate’s goal was to push the nation’s emergency management system to plan and prepare for risks that exceeded its existing capabilities, and to force his own organization and others to think creatively about how to address the gaps.

This mentality is critical to planning in an uncertain world, one in which we can’t possibly envision all the threats, or the magnitude of their impacts. Pre-9/11, the average American didn’t imagine that a plane could take down a skyscraper. And before the coronavirus, who could envision an environment in which we would work from home for months on end, with our pets and children as our questionable-at-best coworkers?

Cyberattacks on local governments have not attracted the same level of public attention as 9/11 and COVID-19, but they share the same challenge of having to plan against an environment that was previously unimaginable. Before March 18, 2018, city leaders in Atlanta didn’t have plans that spoke specifically to an attacker using ransomware to close down their municipal courts or block access to user accounts for several city services. Baltimore didn’t plan for many of its city services to be out for over five weeks. And cities generally did not expect that a cyberattack would take down their 911 systems, until it happened 42 times between 2016 and 2018.¹

Continuity Planning 101

This “go big or go home” mentality has not yet been widely applied to continuity planning, which is the effort to continue mission essential functions despite an incident that affects capabilities. Instead, many jurisdictions have taken a “101” approach, covering the basics of continuity planning to adequately address manageable events like evacuating headquarters because of a gas leak.

These basic planning processes have helped jurisdictions identify the mission essential functions that must be maintained no matter what the incident or event. Some organizations have tied essential functions to the facilities and people needed to conduct those functions, and planned for ways of doing them from elsewhere. Many jurisdictions have also identified clear lines of succession in case anyone in the leadership chain is no longer available. And many have identified clear triggers and mechanisms for returning to “normal.” The best continuity plans also have the following characteristics:

• Relevant to location and organization.

• Conscious of the threats that put communities at risk.

• Scalable.

• Broadly inclusive of all sectors: public, private, and non-governmental organizations.

• Trained to and exercised so they could be implemented as intended.

The problem is that the basic continuity planning approach—like the standard emergency management exercise approach—lacks imagination. It falls short not only in considering extreme impacts, but also in its basic assumptions about the environment in which the plans will be enacted. Recent examples of Continuity Planning 101 failures include the response to the COVID-19 pandemic and to many jurisdictions’ cyber incidents.

Continuity During a Pandemic

Continuity Planning 101 assumes that we have enough equipment, supplies, and personnel to perform mission essential functions. However, as anyone who works in the medical field is well aware, a surge in demand can easily overwhelm our stock of critical equipment and supplies. These shortages, including things like personal protective equipment, ventilators, and pharmaceuticals, affect how our medical community can perform its mission essential functions. Government managers today are left scrambling to beg, borrow, steal, or invent their way into meeting the demand.

Furthermore, throughout the spring and into summer, many industries may be faced with a lack of specific personnel to perform mission essential functions. The first round of absenteeism, due to school closures, illness, and taking care of sick family members, is already affecting localities and will continue to do so in the weeks to come. The World Economic Forum predicts a second round of high absenteeism three to six months after COVID-19’s initial impact due to employee burnout and mental and physical fatigue. Both types of absenteeism will impact our ability to complete mission essential functions. And jurisdictions lack the plans to mitigate this crisis because Continuity Planning 101 lacked the imagination to ask, “How would we do this in an environment with 20- to 30-percent fewer staff?”
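One way to start answering that question is with simple arithmetic. The sketch below is illustrative only: the function names, staffing counts, and minimums are hypothetical placeholders, not figures from any jurisdiction. It reduces the staff assigned to each mission essential function by a given absenteeism percentage and reports where the remainder falls below the minimum needed.

```python
# Hypothetical mission essential functions.
# Each entry: (staff normally assigned, minimum staff needed to keep it running).
# All numbers are made-up placeholders for illustration.
FUNCTIONS = {
    "911 dispatch": (20, 16),
    "payroll": (6, 4),
    "water treatment": (12, 10),
}


def staff_available(assigned, absent_pct):
    """Whole staff members still available at a given absenteeism percentage.

    Uses integer arithmetic and rounds down, since a fraction of a person
    cannot cover a shift.
    """
    return assigned * (100 - absent_pct) // 100


def shortfalls(functions, absent_pct):
    """Return {function name: staff short} for functions below their minimum."""
    gaps = {}
    for name, (assigned, minimum) in functions.items():
        gap = minimum - staff_available(assigned, absent_pct)
        if gap > 0:
            gaps[name] = gap
    return gaps


if __name__ == "__main__":
    for pct in (20, 30):
        print(f"{pct}% absent:", shortfalls(FUNCTIONS, pct))
```

With these placeholder numbers, a 20 percent reduction leaves only water treatment short, by one person, while a 30 percent reduction leaves both 911 dispatch and water treatment two people short. A real analysis would substitute the jurisdiction's own rosters, and would also need to consider whether the staff who remain hold the right skills and certifications.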

Continuity During a Cybersecurity Incident

Cyber incidents have likewise challenged basic continuity planning efforts. While many jurisdictions have equipment and strategies to identify, detect, and protect against cyberattacks, far fewer have worked through the hard realities of continuing their operations through an actual attack. When cyberattacks do strike, many affected jurisdictions find that their basic COOP plans don’t hold up. While the plans are adequate for transitioning to an alternate work site, they have much less to offer when it comes to working without access to the systems and data staff need to perform their mission essential functions. Existing plans also lack the specificity needed to guide the response, and major stakeholders such as members of the IT staff are often not integrated into response structures. And the plans have not been tested in an exercise or a real-world incident.

Traditional COOP plans fall short for continuity during cyber incidents in several ways:

Coordination, including who is involved in decision-making when it comes to ransomware, decisions to take systems offline, and decisions to activate the jurisdictional Emergency Operations Center. Additionally, most local governments do not have clear escalation levels for an attack, nor the associated action items for each level. Response leadership and staffing may also be unclear; the types of subject matter expertise needed to guide a community through a cyber incident are different from those needed for extreme weather.

Communications, including how local managers will rapidly provide information to staff when a cyberattack has severely compromised communications. This includes both immediate communications explaining what to do when the attack begins and ongoing communications, such as how salaries will be provided by payday. Both are critical to response, and yet both may have to be relayed without the benefit of typical emergency communications systems.

Resource requirements, which may be very different from those for a more traditional response. For example, staff might require clean computers, clean servers, local printers, and even fax machines and paper (yes, paper). Importantly, staffing requirements may surge even more during recovery than in response, as departments and agencies struggle to replicate missing or suspect data.

Public information challenges, including public information officers with little to no training or experience in cybersecurity, a lack of pre-scripted messages, and disagreement on when information about the attack should be made public. Providing information to the public is critical because citizens worry about their data and the trustworthiness of key democratic systems such as voting and justice.

Reporting challenges that can delay response, as outages aren’t reported in a consistent manner, leaving IT departments without the common operational picture they need to quickly identify an attack. In addition, the lines of reporting are often unclear, and few people outside of IT understand which systems are connected both within and external to the jurisdiction.

Investigation, which is a black box for many localities. The involvement of one or more federal organizations and the limits investigations place on remediation efforts are not well understood or planned for. This makes continuity planning extremely difficult, because there is uncertainty about how and when remediation can even begin.

The private sector, including cyber insurance companies, whose role is opaque to many local officials. Localities have not included private-sector capabilities in their COOP plans, and don’t have a full picture of what area companies could be doing to support their jurisdiction during a cyberattack. COOP plans don’t include how cyber insurance companies can support a response. Nor do they cover the responsibilities the jurisdiction has to its private-sector partners in this type of incident.

While basic COOP plans have furthered local preparedness for many of the threats that jurisdictions face, they fall short during situations that create an environment very different from the norm. As IT departments around the country continue to make systems and data as secure as possible, it is up to emergency managers to stretch their imaginations and plan for cyberattacks that seriously impact critical systems and data. And as COVID-19 has shown us, we need to be prepared for continuity during an extreme paradigm shift that stretches capabilities to the edge. Once this type of planning is completed, it is critical to document it in a COOP annex or within the base plan. It is then incumbent on the emergency management community to regularly train and exercise staff on the new plans so that they are truly prepared to “go big.”

DAWN THOMAS is the co-director of CNA’s Center for Emergency Management Operations, where she has been supporting homeland security planning, training, and exercises for 16 years (thomasdh@cna.org).

Endnote

¹ https://www.nbcnews.com/news/us-news/hackers-have-taken-down-dozens-911-centers-why-it-so-n862206
