Apr 04, 2026

The Deploy That Broke Our Hero Section for 63 Hours

Case Study

We pushed a one-line CSS change on a Friday afternoon. Swapped a flexbox direction on the hero section, merged it, deployed. Our uptime monitor pinged back green. Status page: all clear. CI pipeline: all tests passed. I think someone even said "cleanest deploy we've had in weeks" in Slack. We closed our laptops and left for the weekend.

By Monday morning, three customers had emailed support. The hero section on our landing page was completely broken. Headline stacked under the CTA button, background image clipped at half height. It looked like a half-loaded page from 2005. The kicker? Our checkout form had the same issue. We'd been losing conversions for 63 hours and had no idea. We checked our analytics later and confirmed a 40% drop in signups over the weekend. That's not a rounding error.

What Actually Went Wrong

The fix itself was one line. A flex-direction: column that should've been scoped to a mobile breakpoint got applied globally. Functionally, everything worked: buttons clicked, forms submitted, payments processed. Every automated test passed because our tests check behavior, not appearance. But visually, the page was wrecked. No alert fired because nothing was technically "down." That's the gap we didn't know we had: the space between "the server responds" and "the page actually looks right." Uptime monitoring answers one question: is the server responding? It doesn't tell you whether the layout collapsed or whether your images are loading at all.
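To make the bug concrete, here is a rough sketch in Python with the CSS embedded as strings. The .hero selector and the 600px breakpoint are stand-ins, not the real stylesheet, and the small check function is a hypothetical illustration rather than part of the setup described in this post. It only shows the difference between a column direction applied globally and one scoped to a media query.

```python
import re

# Hypothetical CSS: selector name and breakpoint are assumptions for illustration.
BROKEN_CSS = """
.hero { display: flex; flex-direction: column; }
"""

INTENDED_CSS = """
.hero { display: flex; flex-direction: row; }
@media (max-width: 600px) {
  .hero { flex-direction: column; }
}
"""

# Matches an @media block containing plain rules (one level of nesting only).
MEDIA_BLOCK = re.compile(r"@media[^{]*\{(?:[^{}]*\{[^{}]*\})*[^{}]*\}")
# Matches a simple "selector { declarations }" rule.
RULE = re.compile(r"([^{}]+)\{([^{}]*)\}")
COLUMN = re.compile(r"flex-direction\s*:\s*column\b")


def global_column_rules(css: str) -> list[str]:
    """Return selectors that set flex-direction: column outside any @media block."""
    top_level = MEDIA_BLOCK.sub("", css)  # drop everything scoped to a breakpoint
    return [
        selector.strip()
        for selector, body in RULE.findall(top_level)
        if COLUMN.search(body)
    ]


print(global_column_rules(BROKEN_CSS))
print(global_column_rules(INTENDED_CSS))
```

Both stylesheets produce a page that "works." Only the first one stacks the hero on desktop, which is exactly the kind of difference a behavior test never looks at.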

What We Changed After That

We started capturing visual diffs of key pages after every deploy. Not full-blown visual regression testing baked into CI, just automated screenshots compared against a baseline. We picked five critical screens: the homepage hero, pricing, the signup flow, checkout, and our main feature page. If any element shifts by more than a few pixels, we get a change alert before any customer notices. The whole setup took about fifteen minutes. We configured captures to run immediately after each deploy via a webhook, then again thirty minutes later to catch any delayed rendering issues.
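The idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool we actually run: screenshots are assumed to be already decoded into rows of RGB tuples, the function names are made up, and the color tolerance and 3-pixel threshold are placeholder values. It also approximates "an element shifted" with the bounding box of changed pixels, which is cruder than a real element-level comparison.

```python
from datetime import datetime, timedelta

Pixel = tuple[int, int, int]
Image = list[list[Pixel]]  # rows of RGB pixels, already decoded from a screenshot

# The five screens named in the post.
CRITICAL_SCREENS = ["homepage hero", "pricing", "signup flow", "checkout", "main feature page"]


def diff_bounds(baseline: Image, current: Image, tolerance: int = 16):
    """Return (left, top, right, bottom) of the pixels that differ, or None if none do."""
    if len(baseline) != len(current) or any(
        len(a) != len(b) for a, b in zip(baseline, current)
    ):
        raise ValueError("screenshots must have identical dimensions")
    xs, ys = [], []
    for y, (row_a, row_b) in enumerate(zip(baseline, current)):
        for x, (a, b) in enumerate(zip(row_a, row_b)):
            # A pixel counts as changed if any channel moved more than the tolerance.
            if max(abs(ca - cb) for ca, cb in zip(a, b)) > tolerance:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def should_alert(baseline: Image, current: Image, max_px: int = 3) -> bool:
    """Alert when the changed region spans more than max_px in either direction."""
    box = diff_bounds(baseline, current)
    if box is None:
        return False
    left, top, right, bottom = box
    return (right - left + 1) > max_px or (bottom - top + 1) > max_px


def capture_times(deployed_at: datetime) -> list[datetime]:
    """Capture right after the deploy, then again thirty minutes later."""
    return [deployed_at, deployed_at + timedelta(minutes=30)]
```

In practice the deploy webhook would call something like capture_times, take a screenshot of each critical screen at those times, and run should_alert against the stored baseline. A small tolerance keeps antialiasing noise from firing alerts, while a collapsed hero changes far more than a few pixels.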

That Friday incident cost us more in lost signups than a full year of monitoring would have. With post-deploy screenshots in place, it would've been caught in minutes, not days. If you're shipping regularly and relying only on uptime checks, you've got the same blind spot we had. We wrote up the whole setup (what we screenshot, how often, what thresholds we use) on our post-deployment monitoring page if you want to steal the playbook.
