Auto-Discovery
Telegen automatically discovers your infrastructure, cloud environment, and running applications.
Overview
When Telegen starts, it automatically:
Detects cloud provider - AWS, GCP, Azure, etc.
Discovers Kubernetes metadata - Pods, services, namespaces
Identifies application runtimes - Go, Java, Python, Node.js
Maps network topology - Services, connections, dependencies
No configuration required.
Cloud Detection
Telegen queries cloud metadata services to identify the environment:
| Provider | Detection Method | Metadata Collected |
|---|---|---|
| AWS | IMDS v1/v2 | Instance ID, region, AZ, instance type, AMI |
| GCP | Metadata server | Instance ID, zone, machine type, project |
| Azure | IMDS | VM ID, location, VM size, subscription |
| DigitalOcean | Metadata service | Droplet ID, region, size |
| Alibaba Cloud | Metadata service | Instance ID, region, zone |
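The detection methods in the table can be pictured as a series of short-timeout HTTP probes against each provider's metadata endpoint. The following Python sketch is illustrative only and is not Telegen's implementation: the `PROBES` table, the probe order, and the `detect_provider` and `http_probe` names are assumptions made for this example. The URLs and headers are the publicly documented ones for AWS IMDSv2, the GCP metadata server, and Azure IMDS.

```python
import http.client
import urllib.request

# Publicly documented metadata endpoints. The ordering and the
# "first successful probe wins" rule are assumptions of this sketch.
PROBES = [
    ("aws", "PUT", "http://169.254.169.254/latest/api/token",
     {"X-aws-ec2-metadata-token-ttl-seconds": "60"}),
    ("gcp", "GET", "http://metadata.google.internal/computeMetadata/v1/",
     {"Metadata-Flavor": "Google"}),
    ("azure", "GET",
     "http://169.254.169.254/metadata/instance?api-version=2021-02-01",
     {"Metadata": "true"}),
]


def http_probe(method, url, headers, timeout=0.2):
    """Return True if the endpoint answers with HTTP 200 within the timeout."""
    req = urllib.request.Request(url, method=method, headers=headers)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False


def detect_provider(probe=http_probe):
    """Try each provider in order; return its name, or None if none respond."""
    for name, method, url, headers in PROBES:
        if probe(method, url, headers):
            return name
    return None
```

Passing the probe as a parameter keeps the ordering logic testable without network access; outside a cloud environment, `detect_provider()` simply returns `None` once every probe has failed or timed out.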
Example AWS Metadata
# Automatically added to all telemetry
cloud.provider: aws
cloud.platform: aws_ec2
cloud.account.id: "123456789012"
cloud.region: us-east-1
cloud.availability_zone: us-east-1a
host.id: i-0abc123def456
host.type: m5.xlarge
host.image.id: ami-0abc123
Example Kubernetes Metadata
# Automatically added when running in K8s
k8s.cluster.name: production
k8s.namespace.name: default
k8s.pod.name: my-app-xyz123
k8s.pod.uid: a1b2c3d4-e5f6-7890-abcd-ef1234567890
k8s.deployment.name: my-app
k8s.node.name: ip-10-0-1-100.ec2.internal
k8s.container.name: app
Runtime Detection
Telegen identifies running application runtimes through process analysis:
| Runtime | Detection Method | Auto-Instrumentation |
|---|---|---|
| Go | Binary analysis, goroutine patterns | ✅ HTTP, gRPC, database |
| Java | JVM process, JFR integration | ✅ Full JVM tracing |
| Python | Interpreter detection | ✅ HTTP, database, asyncio |
| Node.js | V8 process patterns | ✅ HTTP, database, async |
| .NET | CoreCLR detection | ✅ HTTP, database, EF Core |
| Ruby | Interpreter detection | ⚠️ Partial support |
| Rust | Binary analysis | ✅ Full tracing |
| C/C++ | Binary analysis | ✅ Network, syscalls |
Example Runtime Metadata
# Automatically detected for a Go service
process.runtime.name: go
process.runtime.version: go1.21.5
process.executable.name: api-server
process.executable.path: /app/api-server
process.pid: 12345
process.command_line: /app/api-server --port=8080
Database Detection
Telegen identifies database connections and auto-traces queries:
| Database | Detection | Tracing Support |
|---|---|---|
| PostgreSQL | Port 5432, wire protocol | ✅ Queries, latency, errors |
| MySQL | Port 3306, wire protocol | ✅ Queries, latency, errors |
| MongoDB | Port 27017, wire protocol | ✅ Operations, aggregations |
| Redis | Port 6379, RESP protocol | ✅ Commands, latency |
| Elasticsearch | Port 9200, HTTP | ✅ Queries, bulk ops |
Message Queue Detection
| Queue | Detection | Tracing Support |
|---|---|---|
| Kafka | Port 9092, protocol | ✅ Produce, consume, lag |
| RabbitMQ | Port 5672, AMQP | ✅ Publish, consume |
| Redis Pub/Sub | Port 6379, RESP | ✅ Publish, subscribe |
| NATS | Port 4222 | ✅ Publish, subscribe |
Service Discovery
Telegen builds a topology map of all services:
flowchart LR
subgraph Discovery["Auto-Discovery"]
A["Frontend\n(Node.js)"]
B["API Gateway\n(Go)"]
C["Order Service\n(Java)"]
D["User Service\n(Python)"]
E["PostgreSQL"]
F["Redis"]
G["Kafka"]
end
A -->|HTTP| B
B -->|gRPC| C
B -->|gRPC| D
C -->|SQL| E
D -->|SQL| E
B -->|Commands| F
C -->|Produce| G
Service Metadata
# Automatically generated service topology
service.name: order-service
service.version: 1.2.3
service.namespace: production
service.instance.id: order-service-abc123
# Detected dependencies
dependencies:
  - service: postgres
    type: database
    protocol: postgresql
  - service: kafka
    type: message_queue
    protocol: kafka
  - service: user-service
    type: service
    protocol: grpc
Process Discovery (Port-Based & Path-Based)
Telegen discovers processes to instrument using port-based and/or path-based selection. Port-based discovery is often more reliable in containerized environments where executable paths vary.
Discovery Methods
| Method | Use Case | Reliability in Containers |
|---|---|---|
| Port-based | Known service ports (8080, 3000, etc.) | ✅ High |
| Path-based | Known executable patterns | ⚠️ Medium (paths vary) |
| Kubernetes metadata | Label/namespace selectors | ✅ High |
| Combined | Port + path + K8s metadata | ✅ Highest precision |
Port-Based Discovery
Discover services by the ports they listen on:
discovery:
  instrument:
    # Single port
    - open_ports: "8080"
    # Port range
    - open_ports: "8000-8999"
    # Multiple ports and ranges
    - open_ports: "80,443,3000,8080-8089"
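To make the `open_ports` syntax concrete, here is a minimal Python sketch of how such a specification could be parsed and checked against the ports a process listens on. It is not Telegen's parser: the helper names `parse_port_spec` and `matches_open_ports` are hypothetical, and the rule that a single listening port inside the spec is enough to match is an assumption.

```python
def parse_port_spec(spec):
    """Parse a spec such as "80,443,8080-8089" into inclusive (low, high) pairs."""
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        low, sep, high = part.partition("-")
        start = int(low)
        end = int(high) if sep else start  # a single port is a one-element range
        if not (0 < start <= end <= 65535):
            raise ValueError(f"invalid port range: {part!r}")
        ranges.append((start, end))
    return ranges


def matches_open_ports(spec, listening_ports):
    """True if any port the process listens on falls inside the spec (assumed rule)."""
    ranges = parse_port_spec(spec)
    return any(lo <= port <= hi for port in listening_ports for lo, hi in ranges)
```

For example, `matches_open_ports("80,443,3000,8080-8089", {8085})` is true because 8085 falls inside the `8080-8089` range.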
Path-Based Discovery
Discover services by executable path patterns (glob syntax):
discovery:
  instrument:
    # Match any Java process
    - exe_path: "*java*"
    # Match specific application
    - exe_path: "/usr/bin/myapp"
    # Match Node.js processes
    - exe_path: "*node*"
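Broad glob patterns can match more than intended. The sketch below uses Python's standard `fnmatch` module as a stand-in for Telegen's glob matcher, which is an assumption: the exact glob dialect Telegen uses may differ, and `matches_exe_path` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def matches_exe_path(pattern, exe_path):
    """Case-sensitive glob match; in fnmatch, '*' also matches '/' characters."""
    return fnmatchcase(exe_path, pattern)


# "*java*" matches the JVM wherever it is installed.
print(matches_exe_path("*java*", "/usr/lib/jvm/java-17/bin/java"))
# A pattern without wildcards only matches that exact path.
print(matches_exe_path("/usr/bin/myapp", "/usr/bin/myapp2"))
# "*node*" also matches unrelated binaries whose path contains "node".
print(matches_exe_path("*node*", "/usr/bin/node_exporter"))
```

Under these assumptions the third check succeeds as well, which is one reason to pair a broad path pattern with a port or namespace criterion.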
Combined Discovery (AND Logic)
When multiple criteria are in one entry, ALL must match:
discovery:
  instrument:
    # Must be: Java process AND listening on port 8080
    - open_ports: "8080"
      exe_path: "*java*"
    # Must be: in production namespace AND on port 3000
    - k8s_namespace: "production"
      open_ports: "3000"
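The AND rule within an entry can be sketched in a few lines of Python. This is a simplified model rather than Telegen's code: the process record layout, the `CHECKS` table, and the function names are hypothetical, and treating separate entries as alternatives (OR) is an assumption.

```python
from fnmatch import fnmatchcase


def port_in_spec(spec, port):
    """True if port is covered by a spec such as "3000" or "80,8080-8089"."""
    for part in spec.split(","):
        low, sep, high = part.strip().partition("-")
        if int(low) <= port <= int(high if sep else low):
            return True
    return False


# One checker per criterion; each takes (criterion value, process record).
CHECKS = {
    "open_ports": lambda spec, proc: any(
        port_in_spec(spec, p) for p in proc.get("open_ports", ())),
    "exe_path": lambda pat, proc: fnmatchcase(proc.get("exe_path", ""), pat),
    "k8s_namespace": lambda pat, proc: fnmatchcase(
        proc.get("k8s_namespace", ""), pat),
}


def entry_matches(entry, proc):
    """AND: every criterion present in the entry must match the process."""
    return all(CHECKS[key](value, proc) for key, value in entry.items())


def selected(entries, proc):
    """OR across entries (assumed): any fully matching entry selects the process."""
    return any(entry_matches(entry, proc) for entry in entries)
```

With the two entries from the configuration above, a Java process listening on 8080 is selected, while a Java process listening only on 9000 is not, because it fails the port criterion of the first entry and the namespace criterion of the second.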
Kubernetes-Aware Discovery
Use Kubernetes metadata for precise targeting:
discovery:
  instrument:
    # By namespace
    - k8s_namespace: "production"
    # By deployment name
    - k8s_deployment_name: "api-gateway"
    # By pod labels
    - k8s_pod_labels:
        app: "frontend*"
        version: "v2*"
    # By pod annotations
    - k8s_pod_annotations:
        telegen.io/instrument: "true"
    # Combined: namespace + port
    - k8s_namespace: "production"
      open_ports: "8080-8089"
Container-Only Discovery
Limit discovery to containerized processes:
discovery:
  instrument:
    - containers_only: true
      open_ports: "8080"
Excluding Services
Exclude specific services from instrumentation (takes precedence over instrument):
discovery:
  instrument:
    - open_ports: "8080-8089"
  exclude_instrument:
    # Exclude health check services
    - open_ports: "9090"
    # Exclude test namespaces
    - k8s_namespace: "*-test"
    # Exclude by path
    - exe_path: "*health*"
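The precedence rule can be expressed as "check exclusions first, then inclusions". The Python sketch below models that order under stated assumptions: the process record layout and both function names are hypothetical, globs are evaluated with `fnmatch`, and `open_ports` handling is reduced to a single port or a single range for brevity.

```python
from fnmatch import fnmatchcase


def entry_matches(entry, proc):
    """All criteria in the entry must hold (ports: one port or one range only)."""
    for key, want in entry.items():
        if key == "open_ports":
            low, _, high = want.partition("-")
            wanted = range(int(low), int(high or low) + 1)
            if not any(p in wanted for p in proc.get("open_ports", ())):
                return False
        elif not fnmatchcase(str(proc.get(key, "")), want):
            return False
    return True


def should_instrument(proc, instrument, exclude_instrument):
    """Exclusions are evaluated first, so they win over any instrument match."""
    if any(entry_matches(e, proc) for e in exclude_instrument):
        return False
    return any(entry_matches(e, proc) for e in instrument)
```

Applied to the configuration above, a process on port 8080 in a namespace named `payments-test` is skipped even though it matches the `8080-8089` instrument entry.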
Default Exclusions
Telegen excludes itself and common observability tools by default:
discovery:
  default_exclude_instrument:
    - exe_path: "*telegen*"
    - exe_path: "*alloy*"
    - exe_path: "*otelcol*"
    - k8s_namespace: "kube-system"
    - k8s_namespace: "monitoring"
Full Discovery Example
discovery:
  # Skip already-instrumented services
  exclude_otel_instrumented_services: true
  exclude_otel_instrumented_services_span_metrics: false
  # Set to true to use generic tracers for all languages
  skip_go_specific_tracers: false
  # What to instrument
  instrument:
    # Instrument common application ports
    - open_ports: "8080-8089"
    - open_ports: "3000,5000"
    # Instrument all Java apps in production
    - exe_path: "*java*"
      k8s_namespace: "production"
    # Instrument anything with our annotation
    - k8s_pod_annotations:
        telegen.io/instrument: "true"
  # What to exclude
  exclude_instrument:
    - k8s_namespace: "kube-system"
    - k8s_namespace: "monitoring"
    - open_ports: "9090" # Prometheus
  # Timing
  min_process_age: 5s
  poll_interval: 5s
Configuration
Enabling/Disabling Discovery
agent:
  discovery:
    enabled: true
    interval: 30s
    # What to discover
    detect_cloud: true
    detect_kubernetes: true
    detect_runtimes: true
    detect_databases: true
    detect_message_queues: true
Cloud-Specific Settings
cloud:
  aws:
    enabled: true
    timeout: 200ms
    refresh_interval: 15m
    collect_tags: true
    tag_allowlist:
      - "app_*"
      - "env"
      - "team"
      - "cost_center"
  gcp:
    enabled: true
    timeout: 200ms
    refresh_interval: 15m
  azure:
    enabled: true
    timeout: 200ms
    refresh_interval: 15m
Kubernetes Settings
agent:
  kubernetes:
    enabled: true
    # Metadata to collect
    pod_metadata: true
    node_metadata: true
    service_metadata: true
    # Label filtering
    label_allowlist:
      - "app.kubernetes.io/*"
      - "helm.sh/*"
      - "app"
      - "version"
      - "team"
    # Namespace filtering
    namespace_include: [] # Empty = all
    namespace_exclude:
      - kube-system
      - kube-public
      - kube-node-lease
Resource Attributes
All discovered metadata is attached as OpenTelemetry resource attributes:
Cloud Attributes (Semantic Conventions)
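| Attribute | Description |
|---|---|
| cloud.provider | Cloud provider (aws, gcp, azure) |
| cloud.platform | Platform (aws_ec2, gcp_compute_engine) |
| cloud.region | Cloud region |
| cloud.availability_zone | Availability zone |
| cloud.account.id | Account/project ID |
| host.id | Instance ID |
| host.type | Instance type |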
Kubernetes Attributes
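| Attribute | Description |
|---|---|
| k8s.cluster.name | Cluster name |
| k8s.namespace.name | Namespace |
| k8s.pod.name | Pod name |
| k8s.pod.uid | Pod UID |
| k8s.deployment.name | Deployment name |
| k8s.replicaset.name | ReplicaSet name |
| k8s.node.name | Node name |
| k8s.container.name | Container name |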
Process Attributes
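| Attribute | Description |
|---|---|
| process.pid | Process ID |
| process.executable.name | Executable name |
| process.executable.path | Full path |
| process.command_line | Command line |
| process.runtime.name | Runtime (go, java, python) |
| process.runtime.version | Runtime version |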
Best Practices
1. Use Label Allowlists
Avoid collecting unnecessary labels that increase cardinality:
agent:
  kubernetes:
    label_allowlist:
      - "app"
      - "version"
      - "team"
      # NOT: "*" (collects everything)
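The effect of an allowlist on cardinality is easy to see with a small filter. This Python sketch assumes allowlist entries are glob patterns matched against label keys, which is consistent with the `app.kubernetes.io/*` style entries used elsewhere on this page; `filter_labels` is a hypothetical helper, not part of Telegen.

```python
from fnmatch import fnmatchcase


def filter_labels(labels, allowlist):
    """Keep only labels whose key matches at least one allowlist pattern."""
    return {
        key: value
        for key, value in labels.items()
        if any(fnmatchcase(key, pattern) for pattern in allowlist)
    }


pod_labels = {
    "app": "frontend",
    "version": "v2",
    "pod-template-hash": "5d4f8c7b9",  # changes on every rollout
    "app.kubernetes.io/name": "frontend",
}

print(filter_labels(pod_labels, ["app", "version", "team"]))
```

Labels such as `pod-template-hash` take a new value on each rollout, so dropping them keeps the number of distinct attribute combinations bounded.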
2. Set Reasonable Timeouts
Short timeouts keep a slow or unreachable cloud metadata endpoint from blocking discovery, and a refresh interval caches the results between lookups:
cloud:
  aws:
    timeout: 200ms # Quick timeout
    refresh_interval: 15m # Cache results
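The caching behaviour implied by `refresh_interval` can be sketched as a small time-based cache. This is an illustrative Python model, not Telegen's implementation: the `CachedMetadata` class, its parameters, and the choice to refresh lazily on the next read are all assumptions.

```python
import time


class CachedMetadata:
    """Cache the result of a fetch function and refresh it after an interval."""

    def __init__(self, fetch, refresh_interval, clock=time.monotonic):
        self._fetch = fetch                # callable returning the metadata
        self._interval = refresh_interval  # seconds, e.g. 900 for 15m
        self._clock = clock                # injectable for testing
        self._value = None
        self._expires_at = None

    def get(self):
        """Return cached metadata, calling fetch only when the cache is stale."""
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._interval
        return self._value
```

With a 15-minute interval, telemetry enrichment reads from memory on almost every call, and the metadata endpoint is contacted at most once per interval.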
3. Exclude System Namespaces
Reduce noise from infrastructure components:
agent:
  kubernetes:
    namespace_exclude:
      - kube-system
      - kube-public
      - monitoring
      - logging
Next Steps
Distributed Tracing - How auto-discovered services are traced
Agent Mode Configuration - Full agent configuration