© 2024 Kansas City Public Radio

Amazon And The $150 Million Typo

Amazon says a mistyped command caused a widespread outage in its cloud computing service on Tuesday that disrupted websites across the Internet for hours.
Reed Saxon / AP

Amazon says a typo caused its cloud-computing service to fail earlier this week.

On Tuesday, part of Amazon Web Services stopped working. The affected product, the company's Simple Storage Service, or S3, provides features ranging from file sharing to web feeds.

In an online statement, Amazon described the circumstances of the disruptive typo this way:

"The Amazon Simple Storage Service (S3) team was debugging an issue causing the S3 billing system to progress more slowly than expected. At 9:37AM PST, an authorized S3 team member using an established playbook executed a command which was intended to remove a small number of servers for one of the S3 subsystems that is used by the S3 billing process.

"Unfortunately, one of the inputs to the command was entered incorrectly and a larger set of servers was removed than intended."

The company did not elaborate on what, exactly, the "authorized S3 team member" mistyped, but did say that it took about three hours to get some of the system back up, and more than four hours before the S3 system was back to normal.

The Wall Street Journal reported that the outage "cost companies in the S&P 500 index $150 million, according to Cyence Inc., a startup that specializes in estimating cyber-risks. Apica Inc., a website-monitoring company, said 54 of the internet's top 100 retailers saw website performance slow by 20% or more."

"People reported outages and delays on services like Slack, Trello, Sprinklr, Venmo and even Down Detector, which is the site that shows where real time outages are occurring," reported CNN Money.

The tech site Gizmodo, whose own website was disrupted, reported that the forum site Quora was also affected.

Even Apple relies on the Amazon system for some of its own cloud services, and parts of its iCloud service were disrupted.

Amazon said it has changed its protocol for the routine, temporary removal of servers from its system so that server capacity is taken offline more slowly, among other safeguards.

"This will prevent an incorrect input from triggering a similar event in the future," the company wrote.
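For readers curious what such a safeguard might look like in practice, here is a minimal sketch in Python. It is not Amazon's tooling, and the company has not published the command involved. The names `Fleet`, `plan_removal`, `max_per_step` and `min_required` are hypothetical, and the minimum-capacity check is an assumed example of the "other safeguards" the company mentions. The idea is simply that a mistyped number gets rejected, and that a valid request is split into small batches rather than applied all at once.

```python
from dataclasses import dataclass
from typing import List


class UnsafeRemovalError(ValueError):
    """Raised when a removal request would violate a safeguard."""


@dataclass
class Fleet:
    active: int        # servers currently in service (hypothetical model)
    min_required: int  # assumed floor the subsystem needs to keep running


def plan_removal(fleet: Fleet, requested: int, max_per_step: int) -> List[int]:
    """Return batch sizes for taking `requested` servers offline gradually.

    Refuses the request outright if it would push the fleet below its
    assumed minimum capacity, so a typo such as an extra digit fails fast.
    """
    if requested <= 0 or max_per_step <= 0:
        raise UnsafeRemovalError("requested and max_per_step must be positive")
    if fleet.active - requested < fleet.min_required:
        raise UnsafeRemovalError(
            f"removing {requested} of {fleet.active} servers would leave "
            f"fewer than the required {fleet.min_required}"
        )

    # Split the request into small batches so capacity drains slowly.
    batches: List[int] = []
    remaining = requested
    while remaining > 0:
        step = min(max_per_step, remaining)
        batches.append(step)
        remaining -= step
    return batches
```

In this sketch, asking to remove 12 servers from a fleet of 100 with a floor of 80 yields a series of small batches, while asking for 120 by mistake raises an error before anything is taken offline.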

Copyright 2020 NPR. To see more, visit https://www.npr.org.

Rebecca Hersher is a reporter on NPR's Science Desk, where she reports on outbreaks, natural disasters, and environmental and health research. Since coming to NPR in 2011, she has covered the Ebola outbreak in West Africa, embedded with the Afghan army after the American combat mission ended, and reported on floods and hurricanes in the U.S. She's also reported on research about puppies. Before her work on the Science Desk, she was a producer for NPR's Weekend All Things Considered in Los Angeles.