Introduction to Database Observability

Grafana Cloud Database Observability provides insights into the performance and health of your MySQL and PostgreSQL databases, whether they are self-managed or run on managed services like AWS RDS. You can understand how your databases behave, how they’re used, and how to optimize them for your applications.

What you’ll learn

In this article, you:

Learn what database observability is and why it matters.
Understand how Alloy, exporters, Mimir, and Loki work together.
Review collected signals and key terms.
Start setup with links to MySQL and PostgreSQL guides.

What is Database Observability?

Database observability extends traditional metrics collection with detailed query-level logs. While Prometheus exporters capture aggregate query statistics, Database Observability adds individual query samples, explain plans, and schema details, giving you the context needed to diagnose and optimize specific queries.

Database Observability correlates these logs with metrics in purpose-built dashboards, helping you move from knowing which database is slow to knowing why a specific query is slow, for example because of a missing index.

How it works

Database Observability uses Grafana Alloy to collect telemetry from your databases:

Metrics collection: Database exporters expose metrics from pg_stat_statements (PostgreSQL) or Performance Schema (MySQL). Alloy scrapes these metrics and sends them to Grafana Cloud.
Log collection: Database Observability components query your database for detailed information including query samples, schema details, and explain plans. This data is forwarded as structured logs to Loki.
Visualization: Grafana Cloud correlates metrics and logs to provide unified dashboards showing query performance, wait events, and optimization opportunities.

┌─────────────┐     ┌─────────────────┐     ┌───────────────┐
│  Database   │────▶│  Grafana Alloy  │────▶│ Grafana Cloud │
│ (MySQL/PG)  │     │  - Exporters    │     │  - Prometheus │
└─────────────┘     │  - DB O11y      │     │  - Loki       │
                    │    components   │     │  - Dashboards │
                    └─────────────────┘     └───────────────┘

Signals collected

Database Observability collects two types of signals:

Metrics

Quantitative data about query execution:

Query counts: Number of times each query executes
Latency: Execution time (average, percentiles)
Errors: Failed query executions
Rows: Rows returned or affected
Resource usage: Lock time, CPU time, blocks read/written
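Sources such as pg_stat_statements and Performance Schema report cumulative totals, so per-interval values like average latency are usually derived by comparing two readings. The following is a minimal sketch of that arithmetic, assuming each reading carries a call count and a total execution time. The names `Snapshot` and `interval_stats` are hypothetical and are not part of any exporter or of Database Observability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Cumulative counters for one normalized query at a point in time (hypothetical shape)."""
    calls: int             # total executions so far
    total_time_ms: float   # total execution time so far, in milliseconds


def interval_stats(earlier: Snapshot, later: Snapshot, seconds: float) -> dict:
    """Derive the call rate and mean latency for the interval between two snapshots."""
    calls = later.calls - earlier.calls
    time_ms = later.total_time_ms - earlier.total_time_ms
    if calls < 0 or time_ms < 0:
        # Counters went backwards: treat it as a reset and use the later values as-is.
        calls, time_ms = later.calls, later.total_time_ms
    return {
        "calls_per_second": calls / seconds,
        "avg_latency_ms": time_ms / calls if calls else 0.0,
    }


stats = interval_stats(Snapshot(100, 500.0), Snapshot(160, 1100.0), seconds=60)
print(stats)  # {'calls_per_second': 1.0, 'avg_latency_ms': 10.0}
```

Percentiles can't be recovered from totals alone. They need either histogram buckets or individual timings, such as those carried by query samples.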

Logs

Structured entries with detailed query information:

Query details: Normalized query text and metadata
Query samples: Individual query executions with timing (query parameters redacted by default)
Schema information: Table structures, indexes, and constraints
Explain plans: Query execution strategies
Wait events: Resource wait information
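To illustrate what "normalized query text" means, the sketch below replaces literal values with placeholders so that executions differing only in their parameters map to the same text. This is a deliberately naive, regex-based illustration. Databases normalize queries with their own parsers, and the `normalize` function here is hypothetical, not how Database Observability or any exporter implements it.

```python
import re

# Single-quoted string literals (allowing '' escapes), then standalone numbers.
_STRING = re.compile(r"'(?:[^']|'')*'")
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")


def normalize(sql: str) -> str:
    """Replace literals with '?' and collapse whitespace (naive, for illustration only)."""
    sql = _STRING.sub("?", sql)   # strings first, so digits inside them are not touched separately
    sql = _NUMBER.sub("?", sql)   # digits inside identifiers like t1 are left alone
    return " ".join(sql.split())


a = normalize("SELECT * FROM orders WHERE id = 42 AND status = 'paid'")
b = normalize("SELECT *  FROM orders WHERE id = 7 AND status = 'it''s late'")
print(a)       # SELECT * FROM orders WHERE id = ? AND status = ?
print(a == b)  # True
```

Because both statements normalize to the same text, their metrics can be aggregated under one query, and the parameter values never appear in the output.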

Key terms

Term	Definition
Query samples	Individual query executions captured with timing metrics. Parameters are redacted by default; disable redaction to see full SQL text.
Wait events	Data showing what resources queries wait for during execution. PostgreSQL captures these from `pg_stat_activity`. MySQL requires `Performance Schema` `events_waits_*` consumers.
Collectors	Alloy component options that enable specific data sources: `query_samples`, `query_details`, `schema_details`, `explain_plans`.
Relabeling	Rules that align labels across logs and metrics. Keep `loki.relabel` and `discovery.relabel` consistent for proper correlation.
Instance label	Identifies your database server. Format: `host:port`. Must match between metrics and logs.
Job label	Prometheus job label, always `integrations/db-o11y` for Database Observability.
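The requirement that the instance and job labels match between metrics and logs can be pictured as a simple equality check on two label sets. The sketch below is only a model of that rule, assuming labels are plain key-value pairs. The helper names and the host name are hypothetical, and the check is not part of any Grafana tooling.

```python
JOB = "integrations/db-o11y"  # job label value given in the key terms table


def instance_label(host: str, port: int) -> str:
    """Build the instance label in host:port form."""
    return f"{host}:{port}"


def correlates(metric_labels: dict, log_labels: dict) -> bool:
    """True when a metric series and a log stream share the same job and instance."""
    keys = ("job", "instance")
    return all(
        k in metric_labels and metric_labels[k] == log_labels.get(k) for k in keys
    )


metric = {"job": JOB, "instance": instance_label("db-1.example.internal", 5432)}
log_ok = {"job": JOB, "instance": "db-1.example.internal:5432"}
log_bad = {"job": JOB, "instance": "db-1.example.internal"}  # port missing

print(correlates(metric, log_ok))   # True
print(correlates(metric, log_bad))  # False
```

A mismatch as small as a missing port is enough to break the match, which is why relabeling rules for logs and metrics need to produce identical values.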

Get started

Choose your database type to begin setup:

Set up MySQL: Configure MySQL databases, including AWS RDS, Aurora, Azure, and Cloud SQL.
Set up PostgreSQL: Configure PostgreSQL databases, including AWS RDS, Aurora, Azure, and Cloud SQL.

Next steps

After completing setup:

Monitor: View query performance and find your application’s queries.
Investigate: Analyze explain plans, query samples, wait events, and table schemas.
Optimize: Improve slow queries and get AI-powered optimization suggestions.
Troubleshoot: Resolve setup and configuration issues.