Distributed Tracing
Telegen provides zero-configuration distributed tracing using eBPF.
Overview
Telegen automatically traces:
HTTP/HTTPS - All HTTP/1.1 and HTTP/2 traffic
gRPC - All gRPC calls
Database queries - PostgreSQL, MySQL, MongoDB, Redis
Message queues - Kafka, RabbitMQ
Internal function calls - For supported runtimes
No code changes or SDK integration required.
Tip
For targeted tracing, use port-based discovery to instrument only specific services. See Auto-Discovery for details.
discovery:
  instrument:
    - open_ports: "8080-8089" # Only trace these ports
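To make the range syntax concrete, here is a minimal Python sketch of how a value such as "8080-8089" could be expanded into individual ports. The function name `parse_port_spec` is hypothetical, and the comma-separated form is an assumption for illustration; only the single range form appears in the configuration above.

```python
def parse_port_spec(spec: str) -> set[int]:
    """Expand a port spec such as "8080-8089" into a set of port numbers.

    Comma-separated lists ("80,8080-8089") are accepted as an assumption;
    the documentation above only shows a single range.
    """
    ports: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low_text, high_text = part.split("-", 1)
            low, high = int(low_text), int(high_text)
        else:
            low = high = int(part)
        if not (0 < low <= high <= 65535):
            raise ValueError(f"invalid port range: {part!r}")
        ports.update(range(low, high + 1))  # ranges are inclusive on both ends
    return ports


selected = parse_port_spec("8080-8089")
print(8080 in selected, 8090 in selected)  # True False
```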
How It Works
flowchart TB
subgraph Kernel["Linux Kernel"]
K["eBPF Programs"]
end
subgraph App["Application"]
A["HTTP Handler"]
B["gRPC Client"]
C["DB Query"]
end
subgraph Telegen["Telegen Agent"]
T["Trace Correlator"]
E["OTLP Exporter"]
end
K -->|"Intercept"| A
K -->|"Intercept"| B
K -->|"Intercept"| C
A --> K
B --> K
C --> K
K --> T
T --> E
E -->|"OTLP"| OC["OTel Collector"]
Trace Context Propagation
Telegen automatically extracts and propagates trace context:
Incoming requests - Extract traceparent/tracestate from headers
Outgoing requests - Inject trace context into outgoing calls
Cross-service correlation - Link spans across service boundaries
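The extract and inject steps above can be sketched in Python. This is an illustration of the W3C Trace Context header format (version 00), not Telegen's implementation, which works at the eBPF level. The names `SpanContext`, `extract`, and `inject` are hypothetical, and header keys are assumed to be lowercased already.

```python
import re
import secrets
from typing import NamedTuple, Optional

# version - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class SpanContext(NamedTuple):
    trace_id: str
    span_id: str
    sampled: bool


def extract(headers: dict[str, str]) -> Optional[SpanContext]:
    """Parse the traceparent header of an incoming request, if valid."""
    match = TRACEPARENT_RE.match(headers.get("traceparent", "").strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid per the W3C specification.
    if version == "ff" or set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    return SpanContext(trace_id, span_id, bool(int(flags, 16) & 0x01))


def inject(parent: SpanContext, headers: dict[str, str]) -> SpanContext:
    """Create a child span and write its context into outgoing headers."""
    child = SpanContext(parent.trace_id, secrets.token_hex(8), parent.sampled)
    flags = "01" if child.sampled else "00"
    headers["traceparent"] = f"00-{child.trace_id}-{child.span_id}-{flags}"
    return child


incoming = {"traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}
context = extract(incoming)
outgoing: dict[str, str] = {}
if context is not None:
    inject(context, outgoing)
print(outgoing["traceparent"])
```

The child keeps the parent's trace ID and gets a fresh span ID, which is what lets a backend link spans across service boundaries.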
Protocol Support
HTTP Tracing
# Automatically captured for every HTTP request
span:
  name: "GET /api/users/{id}"
  kind: SERVER
  attributes:
    http.method: GET
    http.url: "https://api.example.com/api/users/123"
    http.route: "/api/users/{id}"
    http.status_code: 200
    http.request_content_length: 0
    http.response_content_length: 1234
    http.user_agent: "curl/7.88.0"
    net.peer.ip: "10.0.1.50"
    net.peer.port: 45678
    net.host.ip: "10.0.1.100"
    net.host.port: 8080
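The span above is named after the route template rather than the raw path, which keeps span names low in cardinality. How Telegen derives the template is not described here; the Python sketch below shows one common heuristic (replacing numeric and UUID path segments with a placeholder). The function names are hypothetical.

```python
import re

# Segments that look like identifiers: plain integers or UUIDs.
_ID_SEGMENT = re.compile(
    r"^(\d+|[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})$",
    re.IGNORECASE,
)


def template_route(path: str) -> str:
    """Replace identifier-like path segments with "{id}" (heuristic)."""
    path = path.split("?", 1)[0]  # drop the query string
    segments = path.split("/")
    return "/".join("{id}" if _ID_SEGMENT.match(s) else s for s in segments)


def span_name(method: str, path: str) -> str:
    """Build a low-cardinality span name such as "GET /api/users/{id}"."""
    return f"{method} {template_route(path)}"


print(span_name("GET", "/api/users/123"))  # GET /api/users/{id}
```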
gRPC Tracing
span:
  name: "/users.UserService/GetUser"
  kind: SERVER
  attributes:
    rpc.system: grpc
    rpc.service: users.UserService
    rpc.method: GetUser
    rpc.grpc.status_code: 0
    net.peer.ip: "10.0.1.50"
    net.peer.port: 45678
Database Tracing
span:
  name: "SELECT users"
  kind: CLIENT
  attributes:
    db.system: postgresql
    db.name: mydb
    db.user: appuser
    db.statement: "SELECT * FROM users WHERE id = $1"
    db.operation: SELECT
    db.sql.table: users
    net.peer.ip: "10.0.2.100"
    net.peer.port: 5432
Message Queue Tracing
# Kafka produce
span:
  name: "orders send"
  kind: PRODUCER
  attributes:
    messaging.system: kafka
    messaging.destination.name: orders
    messaging.kafka.partition: 3
    messaging.kafka.message.offset: 12345
    messaging.message.payload_size_bytes: 256
# Kafka consume
span:
  name: "orders receive"
  kind: CONSUMER
  attributes:
    messaging.system: kafka
    messaging.destination.name: orders
    messaging.kafka.consumer.group: order-processor
    messaging.kafka.partition: 3
    messaging.kafka.message.offset: 12345
Runtime-Specific Tracing
Go Applications
Telegen traces Go applications at the runtime level:
Goroutine tracking - Track execution across goroutines
HTTP handlers - net/http, Gin, Echo, Chi, Fiber
gRPC - All gRPC calls
Database drivers - database/sql, pgx, go-redis
Java Applications
Integration with JFR (Java Flight Recorder):
Method tracing - Hot methods and stack traces
GC events - Garbage collection correlation
Lock contention - Synchronized blocks and locks
Thread events - Thread creation, blocking
Python Applications
ASGI/WSGI - FastAPI, Django, Flask
asyncio - Async operation tracking
Database - psycopg2, SQLAlchemy, pymongo
Node.js Applications
HTTP - Express, Fastify, Koa
Async hooks - Promise and callback tracking
Database - pg, mysql2, mongodb, redis
Trace Correlation
Automatic Signal Linking
Telegen automatically correlates:
flowchart LR
subgraph Request["Single Request"]
T["Trace\n(span_id: abc123)"]
M["Metrics\n(labeled: span_id=abc123)"]
L["Logs\n(trace_id, span_id)"]
P["Profile\n(span_id: abc123)"]
end
T --- M
T --- L
T --- P
Log Correlation
Logs are automatically enriched with trace context:
{
"timestamp": "2024-01-15T10:30:00Z",
"level": "info",
"message": "User created successfully",
"trace_id": "a1b2c3d4e5f6789012345678",
"span_id": "abc123def456",
"service.name": "user-service",
"k8s.pod.name": "user-service-xyz"
}
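To show the shape of an enriched record, here is an in-process Python sketch that stamps trace context onto log output using the standard `logging` module. Telegen performs this enrichment without code changes; the sketch only illustrates the result, and `current_span`, `TraceContextFilter`, `JsonFormatter`, and `build_logger` are hypothetical names.

```python
import contextvars
import io
import json
import logging

# Holds the active span for the current execution context (assumption).
current_span = contextvars.ContextVar("current_span", default=None)


class TraceContextFilter(logging.Filter):
    """Copy the active trace_id/span_id onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        span = current_span.get()
        record.trace_id = span["trace_id"] if span else None
        record.span_id = span["span_id"] if span else None
        return True  # never drop the record, only enrich it


class JsonFormatter(logging.Formatter):
    """Render records as JSON including the trace context fields."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname.lower(),
            "message": record.getMessage(),
            "trace_id": record.trace_id,
            "span_id": record.span_id,
        })


def build_logger(stream) -> logging.Logger:
    logger = logging.getLogger("user-service")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.addFilter(TraceContextFilter())
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
current_span.set({"trace_id": "a1b2c3d4e5f6789012345678", "span_id": "abc123def456"})
log.info("User created successfully")
print(buffer.getvalue().strip())
```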
Metric Exemplars
Metrics include exemplars linking to traces:
http_server_duration:
  type: histogram
  buckets: [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10]
  exemplars:
    - value: 0.045
      trace_id: "a1b2c3d4e5f6789012345678"
      span_id: "abc123def456"
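As a rough illustration of how an exemplar attaches to a histogram bucket, the Python sketch below records the most recent traced observation per bucket. The class `ExemplarHistogram` is hypothetical, and keeping only the latest exemplar per bucket is an assumption, not a statement about Telegen's behavior.

```python
import bisect
from typing import Optional

BUCKETS = [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10]


class ExemplarHistogram:
    """Histogram that keeps one exemplar (latest traced value) per bucket."""

    def __init__(self, bounds: list[float]) -> None:
        self.bounds = list(bounds)
        # One extra slot for values above the largest bound.
        self.counts = [0] * (len(self.bounds) + 1)
        self.exemplars: list[Optional[dict]] = [None] * (len(self.bounds) + 1)

    def observe(self, value: float, trace_id: Optional[str] = None,
                span_id: Optional[str] = None) -> int:
        """Record a value and return the index of the bucket it fell into."""
        # bisect_left makes each upper bound inclusive (value <= bound).
        index = bisect.bisect_left(self.bounds, value)
        self.counts[index] += 1
        if trace_id and span_id:
            self.exemplars[index] = {
                "value": value, "trace_id": trace_id, "span_id": span_id,
            }
        return index


histogram = ExemplarHistogram(BUCKETS)
bucket = histogram.observe(0.045, "a1b2c3d4e5f6789012345678", "abc123def456")
print(BUCKETS[bucket], histogram.exemplars[bucket])
```

A dashboard can then jump from a slow bucket directly to the trace referenced by its exemplar.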
Configuration
Basic Configuration
otlp:
  endpoint: "otel-collector:4317"
  traces:
    enabled: true
    sample_rate: 1.0 # 100% sampling
Sampling
otlp:
  traces:
    enabled: true
    # Sample 10% of traces
    sample_rate: 0.1
    # Head-based sampling (default)
    sampler: parent_based_traceidratio
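The sampler name suggests the parent-based, trace-ID-ratio approach known from OpenTelemetry SDKs. The Python sketch below shows that general technique under this assumption; it is not Telegen's code, and the function names are hypothetical.

```python
from typing import Optional


def trace_id_ratio_sampled(trace_id: str, ratio: float) -> bool:
    """Decide deterministically from the trace ID whether to sample.

    The lower 64 bits of the trace ID are compared against a threshold,
    so every service that sees the same trace ID makes the same decision.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    threshold = int(ratio * (1 << 64))
    return int(trace_id[-16:], 16) < threshold


def should_sample(trace_id: str, ratio: float,
                  parent_sampled: Optional[bool] = None) -> bool:
    """Parent-based sampling: honor the parent's decision when there is one."""
    if parent_sampled is not None:
        return parent_sampled
    return trace_id_ratio_sampled(trace_id, ratio)


# Root span with no parent: the decision depends only on the trace ID.
print(should_sample("4bf92f3577b34da6a3ce929d0e0e4736", 0.1))
# Child span: inherits the parent's decision regardless of the ratio.
print(should_sample("4bf92f3577b34da6a3ce929d0e0e4736", 0.1, parent_sampled=True))  # True
```

Because the decision is made when the trace starts (head-based), a trace is either kept or dropped in full rather than losing individual spans.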
Network Filtering
agent:
  ebpf:
    network:
      enabled: true
      http: true
      grpc: true
      # Exclude noisy endpoints
      exclude_paths:
        - "/health"
        - "/healthz"
        - "/ready"
        - "/metrics"
      # Exclude by port
      exclude_ports:
        - 22 # SSH
        - 2379 # etcd
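The effect of these exclusions can be sketched as a simple predicate in Python. Exact-match comparison on the path (ignoring query string and trailing slash) is an assumption made for illustration; the matching rules Telegen applies are not specified here, and `should_trace` is a hypothetical name.

```python
EXCLUDE_PATHS = {"/health", "/healthz", "/ready", "/metrics"}
EXCLUDE_PORTS = {22, 2379}


def should_trace(path: str, port: int) -> bool:
    """Return False for requests that match an excluded port or path."""
    if port in EXCLUDE_PORTS:
        return False
    # Normalize: drop the query string and any trailing slash.
    clean = path.split("?", 1)[0].rstrip("/") or "/"
    return clean not in EXCLUDE_PATHS


print(should_trace("/api/users/123", 8080))  # True
print(should_trace("/healthz", 8080))        # False
print(should_trace("/v3/kv/range", 2379))    # False
```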
Database Query Settings
agent:
  database:
    # Capture full query text
    capture_queries: true
    # Sanitize sensitive data
    sanitize_queries: true
    # Max query length
    max_query_length: 1024
    # Capture query parameters
    capture_parameters: false # Privacy consideration
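To illustrate what sanitizing and truncating a query might involve, here is a minimal Python sketch that replaces string and numeric literals with `?` and caps the length. It is a simplified regex-based approach chosen for illustration, not Telegen's sanitizer, and `sanitize_query` is a hypothetical name.

```python
import re

# Single-quoted SQL strings, allowing '' as an escaped quote.
_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")
# Standalone numbers; the lookbehind skips placeholders ($1) and names (users2).
_NUMERIC_LITERAL = re.compile(r"(?<![\w$])\d+(?:\.\d+)?\b")


def sanitize_query(sql: str, max_length: int = 1024) -> str:
    """Replace literal values with "?" and truncate to max_length characters."""
    sanitized = _STRING_LITERAL.sub("?", sql)
    sanitized = _NUMERIC_LITERAL.sub("?", sanitized)
    return sanitized[:max_length]


print(sanitize_query("SELECT * FROM users WHERE email = 'a@b.com' AND age > 30"))
# SELECT * FROM users WHERE email = ? AND age > ?
print(sanitize_query("SELECT * FROM users WHERE id = $1"))
# SELECT * FROM users WHERE id = $1
```

Parameterized queries pass through unchanged because they carry no literal values, which is consistent with leaving parameter capture disabled for privacy.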
Span Enrichment
Automatic Enrichment
All spans are automatically enriched with:
| Attribute | Source |
|---|---|
| | Discovery or config |
| | Binary analysis |
| | System |
| | Kubernetes |
| | Cloud metadata |
| | System |
Custom Attributes
Add custom attributes via environment variables:
# Kubernetes deployment
env:
  - name: OTEL_RESOURCE_ATTRIBUTES
    value: "team=platform,cost_center=engineering"
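For reference, `OTEL_RESOURCE_ATTRIBUTES` is a comma-separated list of `key=value` pairs, with values percent-encoded where needed. The Python sketch below parses that format; the function name is hypothetical and the handling of malformed pairs (skipping them) is an assumption.

```python
import os
from urllib.parse import unquote


def parse_resource_attributes(raw: str) -> dict[str, str]:
    """Parse "k1=v1,k2=v2" into a dict, decoding percent-encoded values."""
    attributes: dict[str, str] = {}
    for pair in raw.split(","):
        if "=" not in pair:
            continue  # skip malformed entries (assumption)
        key, value = pair.split("=", 1)
        key = key.strip()
        if key:
            attributes[key] = unquote(value.strip())
    return attributes


raw_value = os.environ.get(
    "OTEL_RESOURCE_ATTRIBUTES", "team=platform,cost_center=engineering"
)
print(parse_resource_attributes(raw_value))
```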
Performance Impact
Telegen is designed for minimal overhead:
| Metric | Overhead |
|---|---|
| Latency | < 100μs per request |
| CPU | < 1% additional |
| Memory | ~50MB for trace buffers |
| Network | Compressed OTLP batches |
Optimizations
Ring buffers - Efficient kernel-to-userspace transfer
Batching - Spans batched before export
Compression - gzip compression by default
Sampling - Configurable head-based sampling
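The batching and compression steps can be sketched together in Python. JSON stands in for the real OTLP protobuf encoding here, and the class name `SpanBatcher` and the default batch size of 512 are assumptions for illustration only.

```python
import gzip
import json
from typing import Optional


class SpanBatcher:
    """Collect spans and emit one gzip-compressed payload per full batch."""

    def __init__(self, max_batch_size: int = 512) -> None:
        self.max_batch_size = max_batch_size
        self._pending: list[dict] = []

    def add(self, span: dict) -> Optional[bytes]:
        """Queue a span; return a compressed payload once the batch is full."""
        self._pending.append(span)
        if len(self._pending) >= self.max_batch_size:
            return self.flush()
        return None

    def flush(self) -> Optional[bytes]:
        """Compress and return whatever is queued, or None if empty."""
        if not self._pending:
            return None
        # JSON is a stand-in; real OTLP exporters serialize to protobuf.
        payload = json.dumps(self._pending).encode("utf-8")
        self._pending = []
        return gzip.compress(payload)


batcher = SpanBatcher(max_batch_size=2)
batcher.add({"name": "GET /api/users/{id}"})
compressed = batcher.add({"name": "SELECT users"})
print(len(json.loads(gzip.decompress(compressed))))  # 2
```

Sending one compressed payload per batch amortizes the per-request network cost across many spans.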
Troubleshooting
Missing Traces
Check eBPF status:
# Verify eBPF programs loaded
bpftool prog list | grep telegen
Check OTLP connectivity:
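# Verify endpoint is reachable (4317 is the OTLP gRPC port, so look for a
# successful TCP connect in the output rather than a valid HTTP response)
curl -v http://otel-collector:4317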
Check sampling rate:
otlp:
  traces:
    sample_rate: 1.0 # Ensure 100% for debugging
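Where curl is not available (for example in a minimal container image), the connectivity check can be approximated with a plain TCP connect. This Python sketch only confirms that the port accepts connections; it does not verify that an OTLP receiver is listening. The function name `can_connect` is hypothetical.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example (host name taken from the configuration above):
# can_connect("otel-collector", 4317)
```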
Missing Span Correlation
Verify trace context propagation:
Check incoming requests have a traceparent header
Verify W3C Trace Context format
Check time synchronization:
Ensure NTP is configured
Spans may appear out of order with clock drift
Next Steps
Continuous Profiling - Link profiles to traces
Database Tracing - Deep database tracing
Agent Mode Configuration - Trace configuration options