Grafana Cloud

Introduction to Grafana Cloud

Grafana Cloud is a managed service that removes the need to deploy, scale, upgrade, or operate your own observability stack. It’s designed for high performance at any size. Grafana Cloud gives you a complete, flexible platform for collecting, connecting, analyzing, and acting on your observability data.

Introduction to observability

If you need a more basic understanding of observability, refer to:

Collect, send, and connect data

Grafana Cloud gives you flexibility in how you ingest data. Send telemetry using Alloy or other methods, over either public or private networks. You can also connect externally hosted data sources and use Grafana Cloud to visualize and query them.

Collect

  • Grafana Alloy. Collect and send logs, metrics, traces, and profiles to Grafana Cloud using the Alloy collector, which is a distribution of the OpenTelemetry Collector. Alloy is built for reliable, enriched, and cost-efficient telemetry pipelines in production environments. To learn more, refer to the Introduction to Grafana Alloy.

  • Adaptive telemetry. Adaptive telemetry, a core feature of Grafana Cloud, reduces noise and cost by automatically prioritizing data with the highest diagnostic value. It controls volume without sacrificing insight by identifying unused metrics, filtering out low-value logs, and retaining only the most useful traces during anomalies. To learn more, refer to Adaptive Telemetry.

  • OTLP endpoint. Grafana Cloud exposes an OTLP-compliant endpoint, which allows simple ingestion from OTel-instrumented applications without deploying a collector. This is ideal for development, testing, and small workloads. To learn more, refer to Send data to the Grafana Cloud OTLP endpoint.
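As a minimal sketch of the collector-less path described above, an OTel-instrumented application is typically pointed at an OTLP endpoint through the standard OpenTelemetry exporter environment variables. The example below only builds those variables. The endpoint URL, instance ID, and token are placeholders, and the use of HTTP Basic authentication with an instance ID and token is an assumption; confirm the exact values in Send data to the Grafana Cloud OTLP endpoint.

```python
import base64


def otlp_env(endpoint: str, instance_id: str, token: str) -> dict[str, str]:
    """Build standard OpenTelemetry exporter environment variables.

    Assumption: the endpoint accepts HTTP Basic auth where the user is an
    instance ID and the password is an access token. All values passed in
    are placeholders supplied by the caller.
    """
    # Basic auth credentials are "user:password", base64-encoded.
    credentials = base64.b64encode(f"{instance_id}:{token}".encode()).decode()
    return {
        "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
        "OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf",
        "OTEL_EXPORTER_OTLP_HEADERS": f"Authorization=Basic {credentials}",
    }


if __name__ == "__main__":
    # Hypothetical endpoint and credentials, for illustration only.
    env = otlp_env("https://otlp.example.invalid/otlp", "123456", "my-token")
    for name, value in env.items():
        print(f'export {name}="{value}"')
```

Some OpenTelemetry SDKs expect header values in this variable to be percent-encoded, so the space after `Basic` may need to be written as `%20`. Check the documentation for the SDK you use.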

Manage collectors

  • Fleet Management. Onboard, configure, and monitor collectors at scale. Fleet Management gives teams consistent, centralized control over observability pipelines across large environments. To learn more, refer to Fleet Management.

Send to Grafana Cloud

  • Sending metrics, logs, traces, and profiles. Send any telemetry type to Grafana Cloud using the method that best fits your environment. That way you can integrate quickly and choose how data flows into your stack. To learn more, refer to Send metrics to Grafana Cloud.

  • Private network options. Send telemetry over AWS PrivateLink, Azure Private Link, or GCP Private Service Connect instead of the public internet. This reduces exposure and helps you meet stricter compliance or networking requirements. To learn more, refer to AWS PrivateLink, Azure Private Link, or GCP Private Service Connect.

Connect external data sources to Grafana Cloud

Not all data needs to live in Grafana Cloud. You can connect to externally hosted systems and query them in place. This keeps your data where it already resides, but allows you to visualize and analyze it alongside your other telemetry.

  • External data sources. Connect to publicly reachable or third-party systems and query them directly from Grafana Cloud. Visualize and analyze external data without moving it into your stack. To learn more, refer to Add and configure data sources.

  • Private Data Source Connect (PDC). Grafana Cloud connects to systems inside your VPC or VNet, letting you visualize and query private data sources without exposing them to the public internet. To learn more, refer to Private data source connections in Grafana Cloud.

Use observability solutions for faster time-to-value

Solutions in Grafana Cloud offer a more opinionated and guided experience, along with preconfigured dashboards and panels. Some solutions offer knowledge graph technology and the AI assistant.

  • Knowledge Graph. A built-in cloud service feature, Knowledge Graph runs as part of your Grafana Cloud stack. It’s enabled at the stack level through your Application Observability or Kubernetes Monitoring subscription.

As an underlying engine, it maintains a live inventory of services, pods, nodes, workloads, and dependencies. When something goes wrong, you immediately see the surrounding context: why it happened, where it occurred, and what else is affected. To learn more, refer to Use the knowledge graph.

  • Kubernetes Monitoring. This solution collects and visualizes health, performance, and resource cost, from clusters to containers. Knowledge graph technology and preconfigured dashboards are built in. You can gain an immediate, end-to-end understanding of your Kubernetes environment. To learn more, refer to Kubernetes Monitoring.

  • Cloud Provider Observability (AWS/Azure/GCP). Monitor cloud services that manage Kubernetes environments, including AWS, Azure, and GCP, in a single interface. With provider-specific setup and RBAC built in, you get centralized visibility across all your cloud accounts and services. This makes it easier to track usage, performance, and health consistently, even in complex multi-cloud setups. To learn more, refer to Cloud Provider Observability.

  • Application Observability. Monitor backend applications and services with this OpenTelemetry-based APM solution. Built-in knowledge graph technology and preconfigured dashboards give you a clear picture of service behavior and dependencies—no manual assembly required. To learn more, refer to Monitor applications.

  • Frontend Observability. Instrument web apps to visualize real-time performance, and correlate with backend/infrastructure data. Understand how real users experience the application, and learn how frontend issues link to backend causes. To learn more, refer to Frontend Observability.

  • Database Observability. Understand MySQL/PostgreSQL database health and performance at a glance with this overview and troubleshooting solution. You can diagnose bottlenecks and query issues without digging through logs or raw metrics. To learn more, refer to Database Observability.

  • AI Observability. Monitor and optimize your entire AI stack with this solution. Gain visibility into AI system behavior, insight into output quality, and control over usage and cost. To learn more, refer to AI Observability.

Customize visualizations and queries

Grafana Cloud includes built-in tools for creating tailored views and custom queries that reflect your architecture, KPIs, and workflows.

  • Dashboards & panels. Build custom dashboards to visualize your data exactly how you need it. To learn more, refer to Dashboards in Grafana.

  • Query-less workflows. Troubleshoot using guided drill-downs to find answers without writing queries. To learn more, refer to Simplified exploration.

Test and validate before users feel it

You can proactively ensure the reliability and performance of your applications by using built-in tools to continuously test, validate, and monitor how your systems behave in real-world conditions.

  • Synthetic Monitoring. Run continuous, end-to-end checks that can alert you the moment something breaks. Black‑box checks from global or private probes detect availability and performance issues before your users notice them. To learn more, refer to Synthetic Monitoring.

  • Performance testing. Validate how your system behaves under real-world load. Create and run load tests, analyze results (including comparisons), and create recommendations. Find bottlenecks, prevent regressions, and ship changes confidently, knowing your application will scale. To learn more, refer to Performance testing with Grafana Cloud k6.

Alert, measure reliability, and respond

The Alerts and IRM suite of products helps teams detect, respond to, and learn from incidents with minimal effort.

  • Grafana Alerting. Get notified the moment problems occur so you can respond quickly and reduce downtime. Define alert rules across multiple data sources and route notifications where they need to go. To learn more, refer to Alerting in Grafana.

  • SLOs. Track and maintain the reliability your users expect. Spot service quality issues before they become major incidents. To learn more, refer to Service-level objectives in Grafana Cloud.

  • Grafana IRM. Streamline incident response by coordinating alerts, on-call schedules, and incident timelines in one place. Your recovery can be faster and more organized. To learn more, refer to Incident and response management.
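To make the SLO idea above concrete, the following sketch works through the usual error-budget arithmetic: an SLO target implies a number of allowed bad events, and the remaining budget is the share of that allowance not yet spent. This is a generic illustration, not how Grafana Cloud computes SLOs; the function name and the sample numbers are hypothetical.

```python
def error_budget(slo_target: float, total_events: int, bad_events: int) -> dict[str, float]:
    """Compute basic error-budget figures for an event-based SLO.

    slo_target is a fraction such as 0.999 (99.9% of events must be good).
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")
    if total_events <= 0 or not 0 <= bad_events <= total_events:
        raise ValueError("need total_events > 0 and 0 <= bad_events <= total_events")

    # Bad events the target tolerates over this window.
    allowed_bad = (1 - slo_target) * total_events
    # Measured service-level indicator: fraction of good events.
    sli = (total_events - bad_events) / total_events
    # Share of the budget still unspent; negative means the SLO is breached.
    remaining = 1 - bad_events / allowed_bad
    return {"sli": sli, "allowed_bad": allowed_bad, "budget_remaining": remaining}


if __name__ == "__main__":
    # Hypothetical window: 1,000,000 requests, 250 failures, 99.9% target.
    for name, value in error_budget(0.999, 1_000_000, 250).items():
        print(f"{name}: {value:.5f}")
```

With these sample numbers, the target allows about 1,000 failed requests, so 250 failures leave roughly three quarters of the budget unspent.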

Plan and troubleshoot with AI and machine learning

Grafana Cloud includes intelligent, built-in analytics and AI-powered tools. That means you can spot trends for proactive response and planning, detect issues earlier, understand incidents faster, and gain insight from your data quickly.

  • Sift investigation. Built into Grafana Cloud, this feature surfaces the most relevant signals and patterns during an incident. It runs automated investigations on your telemetry, and curates the results. To learn more, refer to Sift.

  • Forecasts & Outlier detection. Use forecasts to predict future trends and plan capacity ahead of time. Outlier detection flags unusual behavior and emerging issues before they become problems. To learn more, refer to:

  • Grafana Assistant. Interact with your observability data using natural language to speed up troubleshooting and remove the need to write complex queries. Anyone on the team can get insights from their data quickly. To learn more, refer to Grafana Assistant.
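For intuition about what outlier detection means in practice, the sketch below flags points that sit far from the median of a series, using the median absolute deviation (MAD). It is a generic statistical illustration, not the algorithm Grafana Cloud uses; the function name, the threshold of 3.5, and the sample data are assumptions chosen for the example.

```python
from statistics import median


def mad_outliers(values: list[float], threshold: float = 3.5) -> list[int]:
    """Return indexes of points whose modified z-score exceeds the threshold."""
    if not values:
        return []
    center = median(values)
    # Median absolute deviation: a spread estimate that resists extreme points.
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # At least half the points equal the median, so the score is
        # undefined; this sketch flags nothing in that case.
        return []
    # 0.6745 rescales MAD so scores are comparable to standard z-scores
    # for normally distributed data.
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - center) / mad > threshold
    ]


if __name__ == "__main__":
    # Hypothetical latency samples in milliseconds with one spike.
    samples = [10, 11, 9, 10, 12, 10, 50, 11]
    print(mad_outliers(samples))
```

A median-based score is used here instead of mean and standard deviation because a single large spike would otherwise inflate the spread estimate and hide itself.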

Manage security, access, and cost

Grafana Cloud provides enterprise-grade security, data isolation, and 24/7 reliability, while providing full visibility into your usage and costs.

  • Access policies and RBAC. Grafana Cloud includes built-in security features that protect access to your data and resources, full RBAC, and a single place to manage the operational components of your account. To learn more, refer to Security and account management.

  • Cloud API. Automate the creation, configuration, security, and maintenance of your Grafana Cloud accounts and environments. This reduces manual work and ensures consistency. To learn more, refer to the Grafana Cloud API.

  • Cost Management & Billing. Monitor cloud usage, spending, and invoice details from a single dashboard. Set alerts for rising costs and view detailed billing breakdowns. To learn more, refer to Grafana Cloud billing and usage.

Define settings and configuration as code

Define dashboards, alerts, data sources, and other observability settings as version-controlled files. This lets you apply standard development practices to ensure consistent, reliable deployments while eliminating manual configuration drift.

  • Observability as code. Define and manage Grafana dashboards, alerts, and data sources in code rather than through the UI. Store them in version control and deploy them via automated pipelines alongside your application code. To learn more, refer to Observability as code.

  • Infrastructure as code. Integrate your observability configuration into broader infrastructure-as-code workflows using tools like Terraform or Ansible. Review, test, validate, and deploy observability resources as part of your CI/CD or GitOps pipelines. To learn more, refer to Infrastructure as code.
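As a rough sketch of the as-code approach described above, the following example generates a dashboard definition from a small Python function and renders it as stable JSON that can be committed to version control. The field names follow the classic Grafana dashboard JSON model as commonly documented, but treat them as assumptions and check them against the current schema. The dashboard title, UID, and PromQL expressions are hypothetical.

```python
import json


def build_dashboard(title: str, uid: str, queries: dict[str, str]) -> dict:
    """Build a dashboard definition with one time series panel per query."""
    panels = []
    for index, (panel_title, expr) in enumerate(queries.items()):
        panels.append({
            "id": index + 1,
            "type": "timeseries",
            "title": panel_title,
            # Two panels per row on a 24-column grid.
            "gridPos": {"h": 8, "w": 12, "x": (index % 2) * 12, "y": (index // 2) * 8},
            "targets": [{"refId": "A", "expr": expr}],
        })
    return {"uid": uid, "title": title, "tags": ["generated"], "panels": panels}


def render(dashboard: dict) -> str:
    """Serialize with sorted keys so repeated runs produce identical files."""
    return json.dumps(dashboard, indent=2, sort_keys=True) + "\n"


if __name__ == "__main__":
    # Hypothetical service and metric names, for illustration only.
    dashboard = build_dashboard(
        "Checkout overview",
        "checkout-overview",
        {
            "Request rate": "sum(rate(http_requests_total[5m]))",
            "Error rate": 'sum(rate(http_requests_total{status=~"5.."}[5m]))',
        },
    )
    print(render(dashboard))
```

Rendering with sorted keys keeps diffs small and reviewable, which is the main benefit of storing dashboards next to application code. How the file is then deployed (for example, through Terraform or a CI/CD pipeline) depends on your workflow.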