While the federal government continues its efforts to expand into cloud and innovative technologies, the Amazon Elastic Compute Cloud (EC2) crash of last week reminds us of the need to "innovate but verify" as we move to the next realm of technologies.
Today, Amazon issued a 5,700-word explanation of what went wrong. The verdict -- the crash was "caused by several root causes interacting with one another." In other words, things didn't work right. A quick lowdown, in Twitter-esque wording, of what happened on April 21:
- Amazon tries to upgrade capacity in northern Virginia regional network storage, traffic rerouted
- Oops. Goes to backup instead of main network
- Too much traffic, causes clogging and cutoff
- Amazon fixes but storage area tries to back up data
- "Re-mirroring storm" equals the dreaded computer "hourglass of doom" we all know
- Chaos spreads
- Affected users get 10 free days
Amazon is now assessing the EC2 structure, saying "this event has taught us that we must make further investments" to realize a better design goal.
In the end, sometimes a crash means having to say you're sorry. The lesson for the government: Innovation demands that "new" technology be tested and verified to truly be cutting edge and a trusted tool in government-citizen interactions.
