TOP NEWS

Amazon: Human Error Took Amazon S3 Offline

Seattle-based Amazon Web Services has released a detailed breakdown of the cause of Tuesday's widespread outage of its cloud storage service, Amazon S3, saying that human error was behind the failure. The S3 downtime made a very large number of websites and cloud services unavailable, and exposed how heavily services such as Slack, and even the U.S. Securities and Exchange Commission, depend on Amazon's cloud. According to Amazon, at around 9:37 AM PST, one of its team members used an "established playbook" to execute a command intended to remove "a small number of servers" from one of the S3 subsystems used by its billing process. However, Amazon said one of the inputs to that command was "entered incorrectly," which removed a larger set of servers than intended, including servers supporting other S3 subsystems, and required a full restart of those subsystems in the US-EAST-1 Region. That restart took hours because of the metadata checks performed whenever those subsystems are restarted, something Amazon apparently had not done in a long time because of the rapid scaling of its services.

