Introducing o11y-bench: an open benchmark for AI agents running observability workflows
Grafana has introduced o11y-bench, an open-source benchmark that evaluates how well AI agents perform complex observability tasks, such as incident investigati…