
Cloudflare suffers major global outage after Bot Management file error


Cloudflare, one of the world’s largest internet infrastructure providers, experienced a major global outage on 18 November 2025, leaving millions of users unable to access websites, APIs, and applications protected by its network. The disruptions began at 11:20 UTC, with users encountering Cloudflare-branded HTTP 5xx error pages indicating failures within the company’s core routing systems.

What initially looked like the early stages of a massive DDoS attack turned out to be something far more mundane — and far more preventable: a flawed database permissions update that triggered a cascading internal failure.

Cloudflare confirmed that the incident was not caused by malicious activity. Instead, the root of the issue was a change to permissions inside one of its ClickHouse database clusters. This change caused the system to generate duplicate entries in a critical “feature file” used by Cloudflare’s Bot Management machine learning engine.
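How a permissions change alone can produce duplicate entries is easier to see in a small simulation. The sketch below assumes, hypothetically, that the feature list is built from a column-metadata query filtered only by table name, and that the permissions change made a second database containing the same table visible to the querying user. The database names `default` and `replica_store`, the table name `bot_features`, and the helper `build_feature_list` are illustrative and are not taken from Cloudflare's systems.

```python
from typing import NamedTuple, Optional


class ColumnRow(NamedTuple):
    database: str
    table: str
    column: str


def visible_rows(databases_visible: set[str]) -> list[ColumnRow]:
    """Simulate a metadata table in which the same table exists in two databases."""
    all_rows = [
        ColumnRow(db, "bot_features", col)
        for db in ("default", "replica_store")
        for col in ("feature_a", "feature_b", "feature_c")
    ]
    # Only rows from databases the user is permitted to see are returned.
    return [row for row in all_rows if row.database in databases_visible]


def build_feature_list(rows: list[ColumnRow], database: Optional[str] = None) -> list[str]:
    """Collect column names for the table, optionally restricted to one database."""
    return [
        row.column
        for row in rows
        if row.table == "bot_features"
        and (database is None or row.database == database)
    ]


# Before the change: only one database is visible, so each feature appears once.
before = build_feature_list(visible_rows({"default"}))

# After the change: the unfiltered query sees the table twice and doubles the list.
after = build_feature_list(visible_rows({"default", "replica_store"}))

# Filtering by database restores the original result.
fixed = build_feature_list(visible_rows({"default", "replica_store"}), database="default")

print(len(before), len(after), len(fixed))  # 3 6 3
```

The query logic never changes in this sketch. Only the set of rows the user is allowed to see changes, which is why a failure of this kind can slip past a review focused on the query itself.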

The file — normally refreshed every few minutes and distributed worldwide — unexpectedly doubled in size. When propagated across Cloudflare’s massive network, the oversized file exceeded memory limits within the company’s core proxy service, causing it to fail catastrophically.
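A consumer of such a file can guard against this failure mode by validating each update and keeping the last known-good version when validation fails, rather than crashing. The following is a minimal sketch of that idea, not Cloudflare's implementation: the limit `MAX_FEATURES`, the class `FeatureStore`, and the exception `FeatureFileError` are all hypothetical names chosen for illustration.

```python
MAX_FEATURES = 200  # hypothetical hard limit on the number of features


class FeatureFileError(ValueError):
    """Raised when a candidate feature file fails validation."""


def validate_features(names: list[str], max_features: int = MAX_FEATURES) -> list[str]:
    """Reject files that contain duplicates or exceed the size limit."""
    if len(set(names)) != len(names):
        raise FeatureFileError("feature file contains duplicate entries")
    if len(names) > max_features:
        raise FeatureFileError(
            f"feature file has {len(names)} entries, limit is {max_features}"
        )
    return list(names)


class FeatureStore:
    """Holds the active feature list and only swaps it for a valid update."""

    def __init__(self, initial: list[str], max_features: int = MAX_FEATURES):
        self.max_features = max_features
        self.active = validate_features(initial, max_features)

    def try_update(self, candidate: list[str]) -> bool:
        """Apply the candidate if valid; otherwise keep the current file."""
        try:
            self.active = validate_features(candidate, self.max_features)
            return True
        except FeatureFileError as exc:
            print(f"rejected update, keeping last known-good file: {exc}")
            return False


store = FeatureStore(["feature_a", "feature_b", "feature_c"], max_features=4)
doubled = ["feature_a", "feature_b", "feature_c"] * 2
accepted = store.try_update(doubled)
print(accepted, store.active)
```

In this sketch a doubled file is rejected and the previous list stays active, so a bad update degrades freshness instead of availability.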

As a result, Cloudflare’s global edge servers began returning waves of HTTP 5xx errors, interrupting everything from website access to authentication systems.

The nature of the failure made the outage unusually difficult to diagnose. Because the database cluster was being updated in phases, only some nodes produced bad configuration files while others generated valid ones, so the network repeatedly recovered and then failed again as good and bad files were distributed in turn. This oscillation initially led Cloudflare’s teams to suspect a hyper-scale DDoS attack, especially when, by coincidence, Cloudflare’s externally hosted status page also went offline.

Between 11:20 and 17:06 UTC, the outage disrupted multiple Cloudflare services. Some customers on Cloudflare’s older proxy engine (FL) saw incorrect bot scores instead of full outages, potentially triggering false-positive bot blocks.

At 14:30 UTC, Cloudflare engineers identified the faulty feature file as the source of the problem. They halted propagation of the corrupted data and manually injected a known-good file into the distribution pipeline.

The team then forced restarts of the frontline proxy services across the global network. Full restoration required additional hours as backlog traffic surged, placing heavy load on observability and debugging systems.

Cloudflare reported that by 17:06 UTC, all systems had returned to normal.

In a candid post-event statement, Cloudflare CEO Matthew Prince acknowledged the severity of the disruption: “Any outage of any of our systems is unacceptable. There was a period where our network could not route core traffic. We know we let you down today.”

The company described the event as its worst outage since 2019.

Cloudflare says it has already begun implementing architectural safeguards to prevent similar incidents, including:

  • Hardened validation for all internal configuration files
  • Additional global kill switches for rapid feature disablement
  • Improved failure handling across proxy modules
  • Better resource management to avoid debug overload
  • New guardrails around distributed database queries
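Two of the safeguards listed above, kill switches and improved failure handling in proxy modules, can be sketched together. The example below is an illustrative assumption about how such a guard might look, not a description of Cloudflare's proxy: `KillSwitch`, `score_request`, and the choice to return no score when the module is disabled or fails are all hypothetical.

```python
import threading
from typing import Callable, Optional


class KillSwitch:
    """A thread-safe flag that operators can flip to disable a feature."""

    def __init__(self) -> None:
        self._disabled = threading.Event()

    def disable(self) -> None:
        self._disabled.set()

    def enable(self) -> None:
        self._disabled.clear()

    @property
    def active(self) -> bool:
        return not self._disabled.is_set()


def score_request(
    request: dict,
    scorer: Callable[[dict], int],
    switch: KillSwitch,
) -> Optional[int]:
    """Return a bot score, or None if the module is disabled or fails."""
    if not switch.active:
        return None  # feature switched off: skip scoring entirely
    try:
        return scorer(request)
    except Exception:
        # Contain the module failure so the request can still be served.
        return None


def healthy_scorer(request: dict) -> int:
    return 42


def broken_scorer(request: dict) -> int:
    raise RuntimeError("feature file could not be loaded")


switch = KillSwitch()
ok_score = score_request({"path": "/"}, healthy_scorer, switch)
failed_score = score_request({"path": "/"}, broken_scorer, switch)

switch.disable()
disabled_score = score_request({"path": "/"}, healthy_scorer, switch)

print(ok_score, failed_score, disabled_score)  # 42 None None
```

Whether a security module should let traffic through unscored when it fails is a policy decision that depends on the service. The sketch only shows that the failure of one module does not have to become the failure of the whole request path.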

 

As a cornerstone of the global web, Cloudflare’s infrastructure sits between users and millions of websites. Outages of this scale ripple across the entire internet ecosystem — from online banking to retail, gaming, and enterprise SaaS.

The November 18 outage highlights both the fragility of interconnected systems and the need for extreme caution when pushing changes into hyperscale distributed architectures.

Cloudflare has promised a deeper investigation and further updates as it strengthens its systems to ensure such a failure “will not happen again.”
