2023-03-08 Incident: Infrastructure connectivity issue affecting multiple regions
Datadog | The Monitor blog

2023-03-08 Incident: Infrastructure connectivity issue affecting multiple regions


Summary

On March 8-10, 2023, Datadog experienced a multi-regional outage affecting platform access, monitoring, and data ingestion. The incident was triggered by an automatic systemd update that interfered with network routing, impacting tens of thousands of nodes. Datadog resolved the issue by disabling the problematic update channel, restoring compute capacity, and backfilling historical data, utilizing a large incident response team and extensive communication with customers.
Read the Original Article

This article originally appeared on Datadog | The Monitor blog.

Read Full Article on Original Site

Popular from Datadog | The Monitor blog

1
Datadog achieves ISO 42001 certification for responsible AI
Datadog achieves ISO 42001 certification for responsible AI

Datadog | The Monitor blog Mar 26, 2026 28 views

2
Understand session replays faster with AI summaries and smart chapters
Understand session replays faster with AI summaries and smart chapters

Datadog | The Monitor blog Apr 2, 2026 25 views

3
Introducing Bits AI Dev Agent for Code Security
Introducing Bits AI Dev Agent for Code Security

Datadog | The Monitor blog Mar 26, 2026 22 views

4
Analyzing round trip query latency
Analyzing round trip query latency

Datadog | The Monitor blog Mar 27, 2026 20 views

5
Platform engineering metrics: What to measure and what to ignore
Platform engineering metrics: What to measure and what to ignore

Datadog | The Monitor blog Apr 9, 2026 19 views