Reinventing Resiliency for Regulated Public Sector Cloud Solutions

How AWS enables agencies and partners to compliantly leverage the benefits of cloud while staying ahead of growing resiliency demands with AWS GovCloud (US)

By: David Schatzman and Scott Bourn

Technology leaders in the public sector are obligated to safeguard sensitive government data through strict regulatory compliance programs and standards. This is why Amazon Web Services (AWS) launched AWS GovCloud (US) in 2011: to give public sector leaders operating highly regulated workloads compliant cloud technologies with which to innovate and increase mission effectiveness for the nation’s most critical technology-based services.

Discover the five strategic recommendations for optimizing mission workloads in AWS GovCloud (US) – find the full insights paper here.

Customer obsession

Technology leaders continue to move their high-value and mission-critical capabilities to the cloud with AWS GovCloud (US) chosen as the backbone of their technology operations. They expect high availability and seamless adaptation to events because their mission relies so heavily on technology. When applications fail, they can degrade mission effectiveness and erode business outcomes, sometimes on a national or even global scale. Therefore, these technology leaders asked for guidance, processes, and services to manage resiliency in AWS GovCloud (US).

Continuous resiliency

Resiliency is not a “one-and-done” strategy. Deploying an architecture proven through testing to meet resiliency requirements may be sufficient at a particular point in time. However, needs evolve throughout an application’s lifecycle. Emphasizing resiliency lifecycle management is critical, and AWS terms this approach “continuous resiliency.”

AWS created the resilience lifecycle framework to help AWS customers implement continuous resiliency. Technology leaders are encouraged to adopt a resilience lifecycle as an operationalized process. AWS incorporated standard software development lifecycle best practices into the resilience lifecycle to maximize continuous feedback loops across the stages of the lifecycle. Maintaining agility throughout the lifecycle reduces risk, promotes success, enables business outcomes, and increases mission effectiveness.

Architectural resiliency

AWS customers can operate in the AWS GovCloud (US-East) and (US-West) Regions simultaneously in a “Multi-site Active-active” pattern for their mission-critical assets. This pattern allows an application to operate in both Regions so that if its performance becomes degraded in one Region, the application continues to operate in the other Region with no interruption of service for its end users. Users with non-mission-critical workloads often use three other AWS architectural patterns (“Backup and Restore,” “Pilot Light,” and “Warm Standby”), where a user’s applications are fully available in one Region and available to a lesser degree in the other Region until needed. AWS recommends customers implement the architectural pattern that aligns with their resiliency requirements and business needs.
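The routing idea behind the active-active pattern can be illustrated with a minimal Python sketch. It is not an AWS API: the `route` function, the `health` dictionary, and the modulo-based traffic split are hypothetical names and choices made for illustration, and a real deployment would rely on managed DNS or load-balancing health checks rather than application code like this.

```python
# Illustrative sketch only: a toy router for a two-Region active-active setup.
# The health map and routing rule are assumptions, not an AWS interface.

REGIONS = ("us-gov-east-1", "us-gov-west-1")


def healthy_regions(health):
    """Return the Regions currently reported as healthy, in a fixed order."""
    return [region for region in REGIONS if health.get(region, False)]


def route(request_id, health):
    """Pick a Region for a request.

    With both Regions healthy, traffic is split between them. If one Region
    is degraded, every request goes to the remaining healthy Region.
    """
    candidates = healthy_regions(health)
    if not candidates:
        raise RuntimeError("no healthy Region available")
    return candidates[request_id % len(candidates)]


if __name__ == "__main__":
    both_up = {"us-gov-east-1": True, "us-gov-west-1": True}
    east_down = {"us-gov-east-1": False, "us-gov-west-1": True}
    print([route(i, both_up) for i in range(4)])
    print([route(i, east_down) for i in range(4)])
```

In this sketch the standby-style patterns would differ only in the rule inside `route`: all traffic would go to a primary Region, and the second Region would be chosen only after the primary is marked unhealthy.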

Observability resiliency

AWS launched AWS Resilience Hub in response to users requiring insight into their operational SLA compliance. This service provides a unified framework to manage and test resilience, proactively avoid disruptions (such as faulty releases, misconfigurations, and outages), and provide prescriptive guidance on how to make applications more resilient. The service also provides a dynamic resiliency “score,” measured against established SLA targets, that gives insight into resiliency “drift.”

AWS also launched AWS Fault Injection Simulator to answer two user questions: “Is my architecture as resilient as I think?” and “How do I know?” Highly complex workloads need advanced and ongoing testing to predict and mitigate the risk of failure. Continuous fault injection experiments, a practice known as chaos engineering, enable a user to create the operational situations needed to uncover hidden bugs, monitor blind spots, and manage bottlenecks.
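The principle behind a fault injection experiment can be sketched in a few lines of Python. This is an in-process toy and does not use or represent the AWS Fault Injection Simulator API: `with_faults`, `call_with_retries`, `run_experiment`, and the failure-rate parameter are all hypothetical names introduced here for illustration.

```python
# Illustrative sketch only: inject failures into a dependency call and
# measure how often a retrying caller still succeeds.
import random


class InjectedFault(Exception):
    """Raised by the wrapper to simulate a failing dependency."""


def with_faults(func, failure_rate, rng):
    """Wrap func so that each call fails with probability failure_rate."""
    def wrapper(*args, **kwargs):
        if rng.random() < failure_rate:
            raise InjectedFault("injected failure")
        return func(*args, **kwargs)
    return wrapper


def call_with_retries(func, attempts):
    """Call func up to `attempts` times, re-raising the last injected fault."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return func()
        except InjectedFault as err:
            last_error = err
    raise last_error


def run_experiment(func, failure_rate, attempts, trials, seed=0):
    """Return the fraction of trials that succeed despite injected faults."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    flaky = with_faults(func, failure_rate, rng)
    successes = 0
    for _ in range(trials):
        try:
            call_with_retries(flaky, attempts)
            successes += 1
        except InjectedFault:
            pass
    return successes / trials


if __name__ == "__main__":
    for attempts in (1, 3):
        rate = run_experiment(lambda: "ok", 0.3, attempts, trials=1000)
        print(f"attempts={attempts} success rate={rate:.2f}")
```

Comparing the measured success rate with the rate the team expected is one simple way to approach the “How do I know?” question: a gap between the two points to a resiliency assumption worth investigating.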

AWS recommends users evaluate the merit of deploying these two resiliency-based services as part of their observability and continuous resilience strategies.

Region design

Users often ask how the AWS Regions support resiliency. AWS GovCloud (US) consists of two isolated, U.S. sovereign Regions on U.S. soil. Although they are reachable from the public internet and through AWS Direct Connect, they are isolated from the rest of the AWS global footprint and operated independently by U.S. citizens. Each Region contains three Availability Zones (AZs) made up of data centers with redundant power, networking, and connectivity. AZs are located far enough apart from each other to reduce the risk of a single event impacting availability, yet near enough to enable synchronous replication, rapid failover, and low latency. This Region design helps protect applications against disruptions such as human mistakes, unexpected traffic spikes, utility failures, earthquakes, and weather events.

Long-term customer obsession

AWS has invested in resiliency for AWS GovCloud (US) for more than a decade. “Great operational performance is the result of a long-term commitment and an accrual of small decisions and investments compounding on top of each other,” says James Hamilton, vice president and distinguished engineer at AWS. “There are no shortcuts. The AWS Global Cloud Infrastructure is the most secure, extensive, and reliable cloud platform because of our unrelenting pursuit of potential failure points, continuous innovation, and culture of continuous improvement. That’s how we stay ahead of growing resiliency demands so that our users can always be there for their users.”

Learn more about resiliency

AWS encourages technology leaders to think big, innovate, and use AWS GovCloud (US) as a strategic technology enabler to drive mission outcomes. To help users dive deep into resiliency, AWS created an insights paper containing five strategic recommendations for mission workloads in AWS GovCloud (US) that is available here.

This content is made possible by our sponsor Amazon Web Services; it is not written by and does not necessarily reflect the views of NextGov/FCW’s editorial staff.