Three Key Lessons from Amazon’s Prime Day Performance Problems

By Alex Painter | 8/7/18

Just one hour of downtime cost Amazon an estimated $100 million in lost sales. Unless you were completely off the grid, you’re well aware of the performance issues Amazon and its shoppers experienced on what was touted as the biggest Prime Day in the company’s history.

Traffic-related slowdowns and outages are a nightmare scenario for any big retailer, and seeing a giant like Amazon fail offers a sobering lesson for online retailers everywhere. Whatever the ultimate cause of Amazon’s woes, it’s a timely reminder if you’re busy preparing for your own peaks around Black Friday and Cyber Monday. Here are three key lessons we can learn from Amazon’s performance problems.

Lesson 1: No one is too big to fail.

If it can happen to Amazon, it can happen to you.

Having a massive infrastructure doesn’t necessarily mean you’re safe from the impact of a surge in visitor numbers. End-to-end system validation is key. A lot depends on your setup — can you add capacity dynamically and scale up fast enough to react to a sudden increase? Some retailers deal with this particular problem by staggering campaigns, rather than having everything land at exactly the same moment.
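To see why staggering helps, here is a minimal Python sketch. It assumes every campaign produces the same short-lived surge, and the traffic figures and the `combined_peak` helper are hypothetical, not taken from Amazon or any real retailer.

```python
from typing import Sequence


def combined_peak(profile: Sequence[int], start_minutes: Sequence[int]) -> int:
    """Return the highest visits-per-minute figure when the same surge
    profile is launched at each of the given start offsets (in minutes)."""
    timeline = [0] * (max(start_minutes) + len(profile))
    for start in start_minutes:
        for offset, visits in enumerate(profile):
            timeline[start + offset] += visits
    return max(timeline)


# Hypothetical surge: visits per minute in the four minutes after a launch.
surge = [1000, 600, 300, 100]

all_at_once = combined_peak(surge, [0, 0, 0])  # three campaigns, same moment
staggered = combined_peak(surge, [0, 2, 4])    # same campaigns, two minutes apart

print(all_at_once, staggered)
```

With these made-up numbers, the staggered launches peak at well under half the load of the simultaneous ones. Real surges are rarely this tidy, so treat the output as an illustration rather than a capacity plan.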

Lesson 2: Carry out performance testing, but make sure you’re testing the right things.

It’s highly unlikely that Amazon failed to carry out performance tests ahead of Prime Day. But it is possible that what they tested was materially different from what was live on the actual day. Testing the right things, including both happy and unhappy paths, is an organizational challenge as much as a technical one, as different teams race to get everything ready in advance of a big promotion.
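One simple way to catch this kind of mismatch is to compare what was tested with what is live. The sketch below assumes you can export a mapping of component names to versions from each environment. The `find_drift` helper, the component names, and the version numbers are all hypothetical.

```python
from typing import Dict, Optional, Tuple


def find_drift(
    tested: Dict[str, str], live: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return components whose version differs between the tested build
    and production, as {name: (tested_version, live_version)}.
    A missing component is reported as None on the side that lacks it."""
    drift = {}
    for name in sorted(set(tested) | set(live)):
        tested_version = tested.get(name)
        live_version = live.get(name)
        if tested_version != live_version:
            drift[name] = (tested_version, live_version)
    return drift


# Hypothetical manifests for the performance-test and production environments.
tested = {"checkout": "2.4.1", "search": "1.9.0"}
live = {"checkout": "2.4.1", "search": "1.9.2", "recommendations": "0.3.0"}

for name, (was_tested, is_live) in find_drift(tested, live).items():
    print(f"{name}: tested={was_tested} live={is_live}")
```

An empty result does not prove the two environments behave the same. It only rules out one common source of surprises.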

Lesson 3: Monitor, monitor, monitor.

Understanding user experience is critical, and you need to be monitoring your site’s performance before, during, and after peak. There are two parts to this. Monitoring Insights, which regularly checks on key pages and user journeys, acts like an early warning system. The minute something goes wrong, you’ll know, which can buy you some precious extra time to identify and rectify the problem before customer complaints start to flood in.
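As a rough illustration of the early-warning idea, the Python sketch below checks a list of key pages and flags any that fail or respond too slowly. It is not how Monitoring Insights works internally. The URLs, the three-second threshold, and the function names are assumptions, and a real check would run on a schedule and step through full user journeys.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, List
from urllib.error import HTTPError
from urllib.request import urlopen


@dataclass
class CheckResult:
    url: str
    ok: bool
    status: int      # 0 means the request failed before any status was returned
    seconds: float


def http_status(url: str) -> int:
    """Fetch a URL with the standard library and return its HTTP status."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.status
    except HTTPError as error:   # 4xx and 5xx responses
        return error.code
    except OSError:              # DNS failures, timeouts, refused connections
        return 0


def check_page(
    url: str, fetch: Callable[[str], int] = http_status, max_seconds: float = 3.0
) -> CheckResult:
    """Time one request and decide whether the page counts as healthy."""
    start = time.perf_counter()
    status = fetch(url)
    elapsed = time.perf_counter() - start
    return CheckResult(url, status == 200 and elapsed <= max_seconds, status, elapsed)


def failing_pages(
    urls: Iterable[str], fetch: Callable[[str], int] = http_status, max_seconds: float = 3.0
) -> List[CheckResult]:
    """Return the results for every page that is down or too slow."""
    results = (check_page(url, fetch, max_seconds) for url in urls)
    return [result for result in results if not result.ok]
```

Passing `fetch` as a parameter is a design choice that lets you try the alerting logic with a stand-in function before pointing it at a live site.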

[Figure: The issues with Amazon’s website were picked up by Eggplant’s monitoring solutions.]

Real Customer Insights, on the other hand, tells you what your customers are actually seeing. It can plot visitor numbers alongside load times and conversions, so you can get a real-time view of how traffic is impacting performance and revenue. Real Customer Insights also reveals which groups of users are most affected by performance problems — they may be concentrated in a certain geographical area. This is useful both when it comes to tracking down problems with a content delivery network (CDN) and managing the impact of slowdowns and outages. For example, you’ll have a much better idea of where to focus your crisis communication.
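To make the idea of grouping real-user data concrete, here is a small Python sketch that summarizes page views by region and lists the regions with slow median load times. The `PageView` record, the region labels, and the threshold are hypothetical, and this is not the Real Customer Insights data model.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median
from typing import Dict, Iterable, List


@dataclass
class PageView:
    region: str
    load_seconds: float
    converted: bool


def summarize_by_region(views: Iterable[PageView]) -> Dict[str, dict]:
    """Group page views by region and compute visits, median load time,
    and conversion rate for each group."""
    grouped = defaultdict(list)
    for view in views:
        grouped[view.region].append(view)
    return {
        region: {
            "visits": len(items),
            "median_load_seconds": median([v.load_seconds for v in items]),
            "conversion_rate": sum(v.converted for v in items) / len(items),
        }
        for region, items in grouped.items()
    }


def slowest_regions(summary: Dict[str, dict], threshold_seconds: float) -> List[str]:
    """Return regions whose median load time exceeds the threshold, slowest first."""
    slow = [r for r, s in summary.items() if s["median_load_seconds"] > threshold_seconds]
    return sorted(slow, key=lambda r: summary[r]["median_load_seconds"], reverse=True)
```

A region that shows up in `slowest_regions` with a low conversion rate is a reasonable place to start looking at CDN behavior or to target customer communication.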

Monitoring is also useful when it comes to gathering data for next time. It’s probably the last thing on your mind when you’re in the middle of a big promotion, but understanding traffic and performance — and their relationship to business KPIs — during this year’s peak will help you ensure you’re in the best possible position next time around.

Our CTO, Antony Edwards, shared this insight about Amazon’s Prime Day issues:

“Initially, Amazon’s issue felt very much like someone had submitted some code that was causing instability between different systems. My guess is that it was between the core application and the content distribution network. I would have expected Amazon could back out the change and get stability back within 1–2 hours. Either they didn’t know what caused the issue (so their configuration management isn’t that great), or the offending issue caused a downstream tangle that they struggled to get back under control. So, they went off the happy path and things were breaking. Release analytics would probably have prevented this problem — alignment between production and testing is key.”

Find out how Eggplant can help you prepare for peak when it matters most. Watch our recent webinar: "The Seven Deadly Black Friday Sins and How to Avoid Them."


Written by Alex Painter

Alex is a technical and performance consultant who helps organizations deliver faster, more reliable online experiences. After graduating in law, Alex worked in marketing for a number of years before moving into web development and web performance, and recently joined Eggplant.
