Sampling strategies for tracing

Sampling controls trace volume and cost by selecting which traces to retain. This page explains common strategies and how to configure them with Grafana Alloy or the OpenTelemetry Collector.

Sampling lets you control ingestion and storage costs while still keeping high-value traces, such as errors, latency outliers, or traces from specific tenants or endpoints.

There are two main sampling strategies: head sampling and tail sampling.

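Head sampling: the decision is made when the trace starts, before its outcome is known. Overhead is low, but rare errors and latency outliers can be missed.
Tail sampling: the decision is made after the spans of a trace have been collected for a wait period, so it can be based on the outcome (status, duration, attributes). The spans must be buffered until the decision is made.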
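
The difference between the two strategies can be sketched in a few lines of Python. This is an illustration only, not the algorithm used by any SDK, Alloy, or the Collector: the function names, the span dictionary keys (`status`, `start_ms`, `end_ms`), and the hash-based bucketing are assumptions made for the example.

```python
import hashlib

def head_sample(trace_id: str, ratio: float) -> bool:
    """Head sampling: decide from the trace ID alone, before any outcome is known."""
    # Hash the ID so that every service seeing this trace reaches the same decision.
    digest = hashlib.sha256(trace_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    return bucket < ratio * 2**64

def tail_sample(spans: list[dict], latency_threshold_ms: float) -> bool:
    """Tail sampling: decide after the spans were buffered, using their outcome."""
    has_error = any(s["status"] == "ERROR" for s in spans)
    start = min(s["start_ms"] for s in spans)
    end = max(s["end_ms"] for s in spans)
    return has_error or (end - start) > latency_threshold_ms
```

The head sampler never sees status or duration, so a failed trace has the same chance of being dropped as any other. The tail sampler can keep it, at the cost of holding all spans of the trace until the decision is made.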

Refer to Sampling and policies in the Tempo documentation for more information.

Common tail-sampling policies

Latency-based: keep traces with duration above a threshold.
Error-based: keep traces that contain a span with status ERROR or an HTTP 5xx response.
Attribute-based: keep critical tenants, endpoints, or transaction types.
Probabilistic: sample a percentage for baseline coverage.

Combine policies to ensure broad coverage plus targeted retention of valuable traces.
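
As a rough sketch of how combined policies behave, the following Python models each policy as a predicate and keeps a trace when any policy matches. The `Trace` class, the policy helper names, and the `tenant` attribute with value `gold` are hypothetical and exist only for this example. The real processors have their own configuration and decision rules, so refer to the configuration references for the exact semantics.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Trace:
    duration_ms: float
    has_error: bool = False
    attributes: dict = field(default_factory=dict)

def latency_policy(threshold_ms):
    return lambda t: t.duration_ms > threshold_ms

def error_policy():
    return lambda t: t.has_error

def attribute_policy(key, allowed):
    return lambda t: t.attributes.get(key) in allowed

def probabilistic_policy(percentage, rng):
    # Keeps roughly `percentage` percent of traces, regardless of content.
    return lambda t: rng.random() * 100 < percentage

def keep(trace, policies):
    """Keep the trace if at least one policy matches (assumed OR semantics)."""
    return any(policy(trace) for policy in policies)

rng = random.Random(42)  # seeded only to make the example repeatable
policies = [
    latency_policy(500),                  # targeted: slow traces
    error_policy(),                       # targeted: failed traces
    attribute_policy("tenant", {"gold"}), # targeted: critical tenants
    probabilistic_policy(5, rng),         # baseline: about 5% of everything else
]
```

Calling `keep(Trace(duration_ms=900), policies)` returns `True` because of the latency policy, while an ordinary fast, successful trace is kept only when the probabilistic policy happens to select it.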

Configuration references

Grafana Alloy tail sampling: Tail sampling in Alloy
OpenTelemetry Collector tail sampling: Tail sampling processor

Best practices

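Decision wait period: set it to cover typical trace durations. If it is too short, traces are evaluated before all their spans arrive. If it is too long, spans are delayed on their way to Tempo and can arrive outside the metrics-generator's ingestion slack window.
Batch timeout and size: large batches add latency on top of the decision wait; tune them together with the sampling settings.
Composite/AND samplers: use them when a trace must meet several conditions. Because every condition has to match, verify that the combination doesn't unintentionally drop most traces.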
Span dropping vs trace sampling: span filtering can reduce noise without dropping the entire trace.
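
To illustrate the warning about AND samplers in the list above, this minimal Python sketch compares the share of traces retained when two conditions are combined with AND versus OR. The synthetic traffic (10% slow traces, 25% error traces) and all names are assumptions chosen for the example, not measurements.

```python
def retention(traces, conditions, mode):
    """Fraction of traces kept when conditions are combined with 'and' or 'or'."""
    combine = all if mode == "and" else any
    kept = sum(1 for t in traces if combine(c(t) for c in conditions))
    return kept / len(traces)

# Synthetic traffic: every 10th trace is slow, every 4th trace has an error.
traces = [
    {"duration_ms": 900 if i % 10 == 0 else 50, "error": i % 4 == 0}
    for i in range(1000)
]

def is_slow(t):
    return t["duration_ms"] > 500

def is_error(t):
    return t["error"]

and_rate = retention(traces, [is_slow, is_error], "and")
or_rate = retention(traces, [is_slow, is_error], "or")
```

With this data, the AND combination keeps 5% of traces and the OR combination keeps 30%. A check like this against a sample of real traffic can show whether an AND sampler retains far less than intended.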