The Little Red Button: What Happens to Amazon When a Blizzard Hits

Amazon's supply chain is a marvel of algorithmic precision — until a blizzard shuts down New York. What happens next reveals something important about how the world's most sophisticated logistics network actually holds together: not with AI, but with a pager, a sleep-deprived team, and something called the Little Red Button.

The Second Blizzard

New York's second snowstorm of the season started on a February afternoon and intensified through the evening. By 8 PM, every phone in the city buzzed with an emergency alert from the mayor's office: all non-essential vehicles were banned from the roads within the hour.

New York residential street buried in snow during blizzard

New York City during the blizzard — cars buried, streets empty. The mayor's travel ban went into effect within hours of this photo being taken.

By morning, the city had gone quiet in that particular way that only a heavy snowstorm can produce — the kind of silence that feels like borrowed time. Walking through it, crunching through unplowed streets, I found myself thinking about a very similar night years earlier, when I was alone at my desk with a pager I didn't know how to use and a blizzard bearing down on the entire eastern seaboard.

The Job Nobody Wanted at 3 AM

The North American S&OP (Sales and Operations Planning) team I worked on was responsible for inbound and outbound planning across Amazon's entire US fulfillment network. The network needed coverage 24/7, so the team ran on-call rotations. Every Friday at the end of the workday, whoever was finishing their week handed a pager, along with system access, to whoever was starting theirs, and that person became the sole decision-maker for the entire network until the following Friday.

My first week on rotation, I opened the pager and realized I had absolutely no idea how to turn it on. I posted to every group chat I could find. No response. I did what anyone would do: set a phone alarm for every 15 minutes and checked the monitoring dashboard manually, all night, every night.

TITAN pager — the on-call device for Amazon S&OP network management

"Does anyone still remember how to use one of these?" — The on-call pager that ran the most critical network in US retail in Feb 2015.

What LRB Actually Was

The system had a mechanism for managing fulfillment center outages called the LRB — the Little Red Button. When a warehouse's order processing metrics crossed a defined threshold, indicating that incoming orders were outpacing the center's ability to fulfill them, the on-call person could trigger the LRB for that location. Doing so rerouted the center's incoming orders to other fulfillment centers in the region.

It was, visually, exactly what it sounds like: a red button on the monitoring interface. It could only be triggered by the on-call person and a very small handful of others, and every trigger required a written record. The reason for the restriction was obvious once you understood the consequences: pressing that button did not fix a problem locally. It transferred load to neighboring centers — which might then hit their own thresholds.

The first stone thrown into a calm lake sends out ripples. Press LRB in New York, and the ripple moves outward to New Jersey, then Pennsylvania, then the whole Northeast corridor. Every button press was a decision with cascading downstream effects across the entire US network.
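The ripple effect can be sketched as a toy simulation. This is only an illustration of the cascade described above, not Amazon's routing logic: the center names, capacities, loads, the neighbor map, and the even split of rerouted orders are all assumptions made up for the example.

```python
from collections import deque

# Hypothetical figures, in arbitrary "orders per hour" units.
capacity = {"NY": 100, "NJ": 100, "PA": 100, "OH": 400}
load = {"NY": 150, "NJ": 80, "PA": 70, "OH": 50}
neighbors = {
    "NY": ["NJ"],
    "NJ": ["NY", "PA"],
    "PA": ["NJ", "OH"],
    "OH": ["PA"],
}

def press_lrb(load, capacity, neighbors, start):
    """Simulate pressing LRB at `start` and every center it overloads.

    Returns (tripped, final_load). `tripped` lists the centers whose
    orders were rerouted, in the order the button was pressed.
    """
    load = dict(load)          # work on a copy; the input stays untouched
    tripped = []
    queue = deque([start])
    while queue:
        fc = queue.popleft()
        if fc in tripped or load[fc] <= capacity[fc]:
            continue           # already rerouted, or still under threshold
        tripped.append(fc)
        # Assumption: orders are split evenly across neighbors that have
        # not themselves been taken out of rotation.
        targets = [n for n in neighbors[fc] if n not in tripped]
        if not targets:
            continue           # nowhere left to send orders
        share = load[fc] / len(targets)
        load[fc] = 0
        for n in targets:
            load[n] += share
            queue.append(n)    # the neighbor may now be over its own limit
    return tripped, load

tripped, final_load = press_lrb(load, capacity, neighbors, "NY")
print(tripped)
print(final_load)
```

With these made-up numbers, one overloaded center in New York ends up taking three centers out of rotation, and all of the volume lands on the one center with spare capacity.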

The Night the Whole East Coast Went Down

On my very first on-call shift, the one where I still didn't know how to work the pager, the blizzard hit. A New York warehouse reported first, its metrics crossing the alert threshold. I pressed LRB. Before that center had recovered, a second warehouse reported, this one in New Jersey. LRB again. Soon the whole New York metro area was effectively down, its load cascading into neighboring states whose centers were already running at reduced capacity because of the weather.

Each LRB press shifted load further west and south. Centers that hadn't been in distress before began absorbing the redirected volume and then hit their own limits. By the third day, the last line of defense was the West Coast. The only mechanism left was the BRB — the Big Red Button — which would shut down order intake across the entire national network simultaneously. That had happened, by internal account, exactly once in Amazon's history.

Fortunately, it didn't happen a second time on my watch. The weather broke. East Coast capacity recovered slowly, then faster. By the time I handed off the pager at the end of the week — looking considerably worse for wear — the network had normalized. My colleagues in the S&OP team offered congratulations: I had apparently set a record for the most LRB triggers in a single rotation.

Empty snow-covered New York street during blizzard

The morning after — streets silent, network in cascade. Each LRB press sent ripples across the entire US fulfillment network.

Others Did the Same

My situation was not unique. A colleague in Seattle got through a citywide power outage by sitting in her car all night, phone tethered for internet access, monitoring the dashboard by the glow of the car's interior light. Another colleague broke his leg and coordinated shift swaps with teammates from his hospital bed while the cast was being applied.

The people who kept the network running in those moments were not algorithms. They were ordinary people doing something extraordinary with imperfect tools, very little sleep, and a genuine sense of responsibility for millions of orders that had to keep moving.

The Mistake, and What Came After

I made an error during that first rotation: an oversight that kept one fulfillment center offline for 48 hours longer than necessary. The financial impact was real. I was asked to write a COE (Correction of Error), Amazon's internal document for analyzing what went wrong and proposing structural improvements.

I used the process honestly. Rather than writing a minimal compliance document, I worked with several colleagues to catalog every gap we could find — in the monitoring logic, the tooling, the handoff process, the on-call training. We produced a set of proposals, most of which eventually got implemented.

About six months after my first rotation, the pager was replaced by an internal mobile app. A remote team in India took over the overnight monitoring. A couple of years after that, the S&OP team no longer ran on-call rotations at all. By now, the entire process is almost certainly fully automated.

What that arc illustrates — from human on-call to remote support to full automation — is not just Amazon's story. It is a template for how operational systems evolve when you apply consistent pressure toward reliability, cost reduction, and scale. Manual processes reveal their fragility under stress. That stress becomes the specification for the automated replacement.

What This Means for Sellers

The LRB story is a useful reminder that Amazon's network, for all its technological sophistication, was — and in some ways still is — more human-dependent than it appears from the outside. When large disruptions occur (blizzards, port strikes, peak season surges), the system does not fail gracefully or invisibly. It redistributes load, creates delays, and occasionally produces inventory and order anomalies that sellers experience as unexplained stockouts, delayed receiving, or erratic replenishment.

Understanding that those anomalies often have systemic causes — and are not errors in your own shipments or listings — is part of what it means to operate on the platform with clear eyes. Build buffer stock. Plan for inbound delays during weather events and peak periods. Don't assume that "available" inventory in the system reflects what is physically stowed and ready to fulfill. The network is remarkably capable. It is not infallible.
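As a rough way to size that buffer, one simple approach is to cover the extra days an inbound shipment might take during a disruption. The sketch below is a back-of-the-envelope calculation, not an Amazon formula; the function name and the figures in the example are hypothetical.

```python
import math

def buffer_units(avg_daily_sales, normal_lead_days, worst_case_lead_days):
    """Units needed to keep selling through the extra inbound delay.

    Assumes demand stays flat at the average daily rate; real demand
    usually varies, so treat the result as a floor rather than a target.
    """
    extra_days = max(0, worst_case_lead_days - normal_lead_days)
    return math.ceil(avg_daily_sales * extra_days)

# Hypothetical seller: 40 units a day, inbound normally takes 10 days,
# but receiving stretched to 17 days during the last winter storm.
print(buffer_units(40, 10, 17))  # 280
```

The point of the arithmetic is only that the buffer should scale with the length of the delay you have actually observed, not with the lead time the system quotes on a normal week.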

Part of the series: Inside Amazon's Supply Chain