Microsoft services restored after outage

Office 365 and Copilot brought down by config error

A Microsoft Azure outage last night caused problems around the world for workers and gamers.

The outage disrupted Office 365, Outlook, Copilot, Xbox Live and Minecraft. The Starbucks and Costco apps were also affected, as was Heathrow Airport, with disruption continuing into the evening.

Downdetector recorded a spike in reported problems with Azure-connected websites and services from around 3.30pm.

As with last week's AWS outage, the problem was linked to a DNS issue. Microsoft updated the Azure service page last night to confirm that a configuration change and DNS failures in Azure Front Door had blocked users from reaching their Azure-hosted apps, sites and services.

It said:

“An inadvertent tenant configuration change within Azure Front Door (AFD) triggered a widespread service disruption affecting both Microsoft services and customer applications dependent on AFD for global content delivery. The change introduced an invalid or inconsistent configuration state that caused a significant number of AFD nodes to fail to load properly, leading to increased latencies, timeouts, and connection errors for downstream services.”

Microsoft quickly stabilised the problem by restoring the previous configuration, then restored services gradually to avoid overloading individual nodes.

The outage lasted around five hours from start to finish.

There were also reports of problems with AWS, although AWS said that it was operating normally.

The Microsoft outage seems to have inflicted less damage than the AWS outage on 20th October, which affected thousands of popular sites and applications and took longer to resolve.

Both outages were linked to DNS failures, but they differ in that AWS’s problems stemmed from a software bug in its control plane, whereas Azure’s appear to be the result of human error in a configuration update.

What the outages share is their effectiveness at demonstrating the risks of relying on three companies to run the global internet.

Commenting on the outage, Raphael Auphan, COO at Proton, said: “For the second time in two weeks, we’ve seen a massive portion of the internet taken offline thanks to the mistakes of a solitary tech giant. As if we needed reminding, this is further proof that relying on a handful of major cloud providers creates serious vulnerabilities across the internet and puts whole economies at risk in the process.

“The only answer for the UK, Europe, and elsewhere is to prioritise digital sovereignty, in other words, to develop their own native services. We need to stand on our own two feet if we’re going to have any chance in the future.”

Mark Boost, CEO of Civo, said: “The concentration of cloud power among a handful of US hyperscalers creates fragility at the heart of our economy. A single configuration error outside our borders shouldn’t be able to ground flights at Heathrow or disrupt parliamentary systems in Scotland. Whilst local hosting is essential for sovereignty, it does not eliminate outages; the real systemic risk is the fragility caused by concentration.

“The past week has made one thing clear: resilience cannot come from dependency. We need to invest in a diversified, domestically governed cloud strategy and ensure critical applications are distributed across more than just the top three providers. Policymakers must rethink procurement, fund sovereign alternatives, and make resilience a baseline requirement - not an afterthought.”