Introduction to Database Observability

Grafana Cloud Database Observability provides insights into the performance and health of your MySQL and PostgreSQL databases, whether they are self-managed or run on managed services such as AWS RDS. It helps you understand how your databases behave, how they’re used, and how to optimize them for your applications.

What you’ll learn

In this article, you:

  • Learn what database observability is and why it matters.
  • Understand how Alloy, exporters, Mimir, and Loki work together.
  • Review collected signals and key terms.
  • Start setup with links to MySQL and PostgreSQL guides.

What is Database Observability?

Database observability extends traditional metrics collection with detailed query-level logs. While Prometheus exporters capture aggregate query statistics, Database Observability adds individual query samples, explain plans, and schema details, giving you the context needed to diagnose and optimize specific queries.

Database Observability correlates these logs with metrics in purpose-built dashboards, helping you move from knowing which database is slow to understanding why a specific query is slow, for example because of a missing index.

How it works

Database Observability uses Grafana Alloy to collect telemetry from your databases:

  1. Metrics collection: Database exporters expose metrics from pg_stat_statements (PostgreSQL) or Performance Schema (MySQL). Alloy scrapes these metrics in Prometheus format and sends them to Grafana Cloud.

  2. Log collection: Database Observability components query your database for detailed information including query samples, schema details, and explain plans. This data is forwarded as structured logs to Loki.

  3. Visualization: Grafana Cloud correlates metrics and logs to provide unified dashboards showing query performance, wait events, and optimization opportunities.

┌─────────────┐     ┌─────────────────┐     ┌───────────────┐
│  Database   │────▶│  Grafana Alloy  │────▶│ Grafana Cloud │
│ (MySQL/PG)  │     │  - Exporters    │     │  - Prometheus │
└─────────────┘     │  - DB O11y      │     │  - Loki       │
                    │    components   │     │  - Dashboards │
                    └─────────────────┘     └───────────────┘

Signals collected

Database Observability collects two types of signals:

Metrics

Quantitative data about query execution:

  • Query counts: Number of times each query executes
  • Latency: Execution time (average, percentiles)
  • Errors: Failed query executions
  • Rows: Rows returned or affected
  • Resource usage: Lock time, CPU time, blocks read/written
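
Exporters typically expose these values as cumulative counters, so figures such as average latency are derived from the difference between two scrapes. The following is a minimal sketch of that arithmetic, not the product's implementation. The Scrape class, its field names, and the sample numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scrape:
    """One scrape of cumulative counters for a single normalized query (hypothetical shape)."""
    calls: int            # total executions so far
    total_time_ms: float  # total execution time so far, in milliseconds
    rows: int             # total rows returned or affected so far


def interval_stats(prev: Scrape, curr: Scrape, interval_s: float) -> dict:
    """Derive per-interval figures from two cumulative scrapes."""
    d_calls = curr.calls - prev.calls
    d_time = curr.total_time_ms - prev.total_time_ms
    d_rows = curr.rows - prev.rows
    if d_calls < 0 or d_time < 0 or d_rows < 0:
        # Counters went backwards (for example, statistics were reset); skip this interval.
        return {}
    return {
        "calls_per_s": d_calls / interval_s,
        "avg_latency_ms": d_time / d_calls if d_calls else 0.0,
        "rows_per_call": d_rows / d_calls if d_calls else 0.0,
    }


# Two scrapes taken 60 seconds apart (made-up numbers).
prev = Scrape(calls=1000, total_time_ms=5000.0, rows=20000)
curr = Scrape(calls=1120, total_time_ms=5900.0, rows=22400)
stats = interval_stats(prev, curr, interval_s=60.0)
print(stats)
```

Working from deltas rather than raw totals keeps the figures meaningful for the selected time range, and returning an empty result on a counter reset avoids reporting negative rates.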

Logs

Structured entries with detailed query information:

  • Query details: Normalized query text and metadata
  • Query samples: Individual query executions with timing (SQL text redacted by default)
  • Schema information: Table structures, indexes, and constraints
  • Explain plans: Query execution strategies
  • Wait events: Resource wait information
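
To illustrate what redaction means for a query sample, the sketch below replaces literal values with placeholders before building a structured entry. It is a simplified illustration under stated assumptions: the regular expressions, the redact and sample_entry functions, and the entry field names (op, query, duration_ms) are hypothetical, and a real collector would rely on a SQL parser rather than regular expressions.

```python
import json
import re

# Hypothetical patterns for illustration only; they do not cover every SQL dialect.
_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")
_NUMBER_LITERAL = re.compile(r"\b\d+(?:\.\d+)?\b")


def redact(sql: str) -> str:
    """Replace string and numeric literals with '?' placeholders."""
    sql = _STRING_LITERAL.sub("?", sql)  # strings first, so digits inside them are not touched twice
    return _NUMBER_LITERAL.sub("?", sql)


def sample_entry(sql: str, duration_ms: float, redact_sql: bool = True) -> str:
    """Build a structured log entry (hypothetical field names) for one query execution."""
    entry = {
        "op": "query_sample",
        "query": redact(sql) if redact_sql else sql,
        "duration_ms": duration_ms,
    }
    return json.dumps(entry, sort_keys=True)


raw = "SELECT * FROM orders WHERE customer_id = 42 AND status = 'open'"
redacted = redact(raw)
entry = sample_entry(raw, duration_ms=12.5)
print(redacted)
print(entry)
```

With redaction on, the entry still shows the shape of the query and its timing, which is usually enough to spot a slow pattern, while parameter values that may contain sensitive data stay out of the logs.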

Key terms

Term | Definition
Query samples | Individual query executions captured with timing metrics. Parameters are redacted by default; disable redaction to see full SQL text.
Wait events | Data showing what resources queries wait for during execution. PostgreSQL captures these from pg_stat_activity. MySQL requires Performance Schema events_waits_* consumers.
Collectors | Alloy component options that enable specific data sources: query_samples, query_details, schema_details, explain_plans.
Relabeling | Rules that align labels across logs and metrics. Keep loki.relabel and discovery.relabel consistent for proper correlation.
Instance label | Identifies your database server. Format: host:port. Must match between metrics and logs.
Job label | Prometheus job label, always integrations/db-o11y for Database Observability.
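
The requirement that the job and instance labels match between metrics and logs can be pictured as a join on those two labels. The sketch below is a conceptual illustration, not how Grafana Cloud performs the correlation. The correlate function, the label dictionaries, and the example hostnames are hypothetical.

```python
JOB = "integrations/db-o11y"  # job label value named in the key terms above


def label_key(labels: dict) -> tuple:
    """The label pair that must agree between a metric series and a log stream."""
    return (labels.get("job"), labels.get("instance"))


def correlate(metric_series: list, log_streams: list) -> tuple:
    """Split metric series into those with and without a matching log stream."""
    log_keys = {label_key(stream) for stream in log_streams}
    matched, unmatched = [], []
    for series in metric_series:
        key = label_key(series)
        (matched if key in log_keys else unmatched).append(key)
    return matched, unmatched


# Hypothetical label sets: the second log stream is missing the port in its instance label.
metrics = [
    {"job": JOB, "instance": "db-1.example.internal:5432"},
    {"job": JOB, "instance": "db-2.example.internal:3306"},
]
logs = [
    {"job": JOB, "instance": "db-1.example.internal:5432"},
    {"job": JOB, "instance": "db-2.example.internal"},
]

matched, unmatched = correlate(metrics, logs)
print("matched:", matched)
print("unmatched:", unmatched)
```

In this example the second database has metrics but no correlated logs, because its log stream uses an instance label without the port. This is the kind of mismatch that consistent relabeling rules are meant to prevent.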

Get started

Choose your database type to begin setup:

  • Set up MySQL: Configure MySQL databases, including AWS RDS, Aurora, Azure, and Cloud SQL.
  • Set up PostgreSQL: Configure PostgreSQL databases, including AWS RDS, Aurora, Azure, and Cloud SQL.

Next steps

After completing setup:

  • Monitor: View query performance and find your application’s queries.
  • Investigate: Analyze explain plans, query samples, wait events, and table schemas.
  • Optimize: Improve slow queries and get AI-powered optimization suggestions.
  • Troubleshoot: Resolve setup and configuration issues.