Zero-Code LLM Instrumentation on Kubernetes with the OpenLIT Operator
TL;DR: Add the label openlit.io/instrument: "true" to your Kubernetes pods and the OpenLIT Operator automatically injects LLM instrumentation — no code changes, no redeployments, no SDK installation. Every OpenAI, Anthropic, or LangChain call in your Python services starts producing OpenTelemetry traces.
The Problem: Instrumenting Dozens of Services Is Tedious
You have 15 microservices. Seven of them make LLM calls. Three use OpenAI, two use Anthropic, one uses LangChain, and one uses LiteLLM.
The normal approach: go to each service, add pip install openlit, add openlit.init() to the entry point, configure the OTLP endpoint, test it, deploy it. Seven PRs, seven code reviews, seven deployments.
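For scale, here is roughly what that per-service change looks like with the SDK; the endpoint, service name, and environment values below are placeholders:

```python
# Manual approach: every instrumented service needs this at its entry point.
import openlit

openlit.init(
    otlp_endpoint="http://openlit.monitoring.svc.cluster.local:4318",  # placeholder endpoint
    application_name="my-chatbot",  # placeholder service name
    environment="production",
)
```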
The operator approach: deploy the operator once, add a label to each service's pod spec, done.
How the Operator Works
The OpenLIT Operator runs as a Kubernetes controller with a mutating admission webhook. When a pod with the openlit.io/instrument: "true" label is created, the operator:
1. Injects an init container that installs the OpenLIT Python SDK into a shared volume
2. Sets up `sitecustomize.py` so that Python automatically runs `openlit.init()` when the application starts
3. Configures environment variables for the OTLP endpoint, application name, and environment
Your application code doesn't change. It doesn't even know it's being instrumented.
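To make the mechanism concrete, here is a simplified sketch of what an injected `sitecustomize.py` could look like. This is not the operator's actual file, and the environment variable names are standard OpenTelemetry conventions assumed here for illustration:

```python
# Hypothetical sitecustomize.py: Python imports this module automatically at
# startup whenever it is found on PYTHONPATH, which is how the operator hooks
# in without touching application code.
import os

try:
    import openlit

    # The operator injects configuration as pod environment variables.
    openlit.init(
        otlp_endpoint=os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318"),
        application_name=os.environ.get("OTEL_SERVICE_NAME", "unknown"),
    )
except Exception:
    pass  # instrumentation must never prevent the application from starting
```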
```
┌────────────────────────────────────────────────┐
│  Pod Creation                                  │
│                                                │
│  kubectl apply deployment.yaml                 │
│        ↓                                       │
│  Admission Webhook intercepts                  │
│        ↓                                       │
│  OpenLIT Operator mutates pod spec:            │
│    - Adds init container (installs SDK)        │
│    - Adds shared volume                        │
│    - Sets PYTHONPATH + env vars                │
│    - Sets sitecustomize.py                     │
│        ↓                                       │
│  Pod starts → Python auto-imports openlit      │
│        ↓                                       │
│  All LLM calls are traced via OTLP             │
└────────────────────────────────────────────────┘
```

Step-by-Step Setup
1. Deploy the Operator
```shell
kubectl apply -f https://raw.githubusercontent.com/openlit/openlit/main/operator/deploy/openlit-operator.yaml
```

This creates:

- The operator deployment in the `openlit-system` namespace
- A `MutatingWebhookConfiguration` that watches for labeled pods
- The necessary RBAC roles and service accounts
- TLS certificates for webhook communication
2. Create an AutoInstrumentation Resource
The AutoInstrumentation custom resource tells the operator what to instrument and where to send the data:
```yaml
apiVersion: openlit.io/v1alpha1
kind: AutoInstrumentation
metadata:
  name: openlit-instrumentation
  namespace: default
spec:
  selector:
    labels:
      openlit.io/instrument: "true"
  otlp:
    endpoint: "http://openlit.monitoring.svc.cluster.local:4318"
  python:
    image: ghcr.io/openlit/openlit-python-init:latest
  resource:
    environment: "production"
```

3. Label Your Deployments
Add the label to any deployment that makes LLM calls:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-chatbot
spec:
  selector:
    matchLabels:
      app: my-chatbot
  template:
    metadata:
      labels:
        app: my-chatbot
        openlit.io/instrument: "true"  # <-- this is all you add
    spec:
      containers:
        - name: chatbot
          image: my-chatbot:latest
```

Apply it:
```shell
kubectl apply -f deployment.yaml
```

The next time a pod starts for this deployment, it will be auto-instrumented. Admission webhooks only act at pod creation, so existing pods are untouched; run `kubectl rollout restart deployment/my-chatbot` to pick up the change immediately.
4. Verify It's Working
Check the pod's init containers:
```shell
kubectl describe pod -l app=my-chatbot
```

You should see an init container like `openlit-init` that ran successfully. Then check your OpenLIT dashboard or OTLP backend for incoming traces.
What Gets Instrumented
The operator injects the full OpenLIT Python SDK, which means everything that openlit.init() normally instruments gets covered:
- LLM Providers: OpenAI, Anthropic, Cohere, Mistral, Groq, Bedrock, Ollama, vLLM, and 30+ more
- Agent Frameworks: LangChain, LangGraph, LlamaIndex, CrewAI, Pydantic AI, OpenAI Agents
- Vector Databases: Pinecone, Chroma, Qdrant, Milvus
- Web Frameworks: FastAPI, Flask, Django (if your service uses them)
The operator detects which libraries are imported at runtime and instruments only what's present.
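The detection itself is conceptually simple. The sketch below illustrates the pattern; it is not the SDK's actual code:

```python
# Illustrative detect-then-instrument pattern, not OpenLIT's real implementation:
# apply an instrumentor only when its target library is importable.
import importlib.util
from typing import Callable

def instrument_if_present(module_name: str, instrument_fn: Callable[[], None]) -> bool:
    """Run instrument_fn only if module_name is installed in this environment."""
    if importlib.util.find_spec(module_name) is None:
        return False  # library not installed, so skip it
    instrument_fn()
    return True
```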
Configuring the Operator
The operator supports several configuration options via environment variables:
| Variable | Description | Default |
|---|---|---|
| `OPENLIT_OTLP_ENDPOINT` | Where to send OTLP data | `http://localhost:4318` |
| `OPENLIT_DEFAULT_ENVIRONMENT` | Default environment label | `default` |
| `WEBHOOK_PORT` | Webhook server port | `9443` |
| `WEBHOOK_FAILURE_POLICY` | What happens if the webhook fails (`Fail` or `Ignore`) | `Ignore` |
| `CERT_VALIDITY_DAYS` | TLS certificate validity, in days | `365` |
Keeping `WEBHOOK_FAILURE_POLICY` at its default of `Ignore` is recommended for production: if the operator is down, pods still start normally without instrumentation instead of being rejected at admission.
Excluding Specific Services
Maybe you want to instrument everything in a namespace except one sensitive service. Use the ignore field in your AutoInstrumentation resource:
```yaml
spec:
  selector:
    labels:
      openlit.io/instrument: "true"
  ignore:
    labels:
      openlit.io/instrument: "false"
```

Or simply don't add the label to services you want to skip.
Running OpenLIT on Kubernetes Too
If you're deploying the OpenLIT platform itself on Kubernetes, the natural setup is:
```
┌──────────────────────────────────────┐
│  Your K8s Cluster                    │
│                                      │
│  ┌──────────┐     ┌───────────────┐  │
│  │ OpenLIT  │     │  ClickHouse   │  │
│  │ Platform │◄────│  (storage)    │  │
│  │ :3000    │     │               │  │
│  │ :4317/18 │     └───────────────┘  │
│  └──────────┘                        │
│       ▲                              │
│       │ OTLP                         │
│       │                              │
│  ┌────┴──────┐    ┌───────────┐      │
│  │ Service A │    │ Service B │      │
│  │ (labeled) │    │ (labeled) │      │
│  └───────────┘    └───────────┘      │
└──────────────────────────────────────┘
```

The OpenLIT Helm chart makes this straightforward:
```shell
helm repo add openlit https://openlit.github.io/helm-chart
helm repo update
helm install openlit openlit/openlit
```

Then point your AutoInstrumentation resource's OTLP endpoint to the OpenLIT service.
Real-World Example: Instrumenting a FastAPI + LangChain Service
Here's a typical scenario. You have a FastAPI service that uses LangChain for RAG:
```python
# app.py — no OpenLIT code anywhere
from fastapi import FastAPI
from langchain_openai import ChatOpenAI
from langchain.chains import RetrievalQA

app = FastAPI()
llm = ChatOpenAI(model="gpt-4o")
# my_retriever: your vector store retriever, defined elsewhere in the app

@app.post("/ask")
async def ask(question: str):
    chain = RetrievalQA.from_chain_type(llm=llm, retriever=my_retriever)
    result = chain.invoke({"query": question})  # RetrievalQA expects a "query" key
    return {"answer": result["result"]}
```

Deploy it on Kubernetes with the label:
```yaml
metadata:
  labels:
    openlit.io/instrument: "true"
```
- A full trace of the `/ask` HTTP request (via FastAPI instrumentation)
- LangChain chain execution spans
- Individual OpenAI chat completion spans with tokens, cost, and latency
- Retriever spans showing which documents were fetched
All stitched together in a single distributed trace.
When to Use the Operator vs. the SDK
| Scenario | Use the Operator | Use the SDK |
|---|---|---|
| Many services, don't want to modify code | Yes | |
| Need fine-grained control over what's traced | | Yes |
| Platform team managing observability centrally | Yes | |
| Single service, quick setup | | Yes |
| Non-Python services (TypeScript, Go) | | Yes |
| Brownfield codebase you can't easily modify | Yes | |
The operator currently supports Python services. For TypeScript or Go, use the SDK directly.
FAQ
Does it work with non-Python services?
The operator currently supports Python auto-instrumentation. TypeScript and Go support is on the roadmap. For those languages, use the OpenLIT SDK directly.
Can I configure which providers to instrument?
Yes. You can set disabled_instrumentors via environment variables in the AutoInstrumentation resource to exclude specific libraries from instrumentation.
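For reference, this maps to the SDK's `disabled_instrumentors` argument. A sketch of the SDK-level equivalent, where the instrumentor names are illustrative:

```python
import openlit

# SDK-level equivalent of the operator setting: exclude specific libraries
# from tracing. The names "anthropic" and "cohere" are examples only.
openlit.init(
    otlp_endpoint="http://openlit.monitoring.svc.cluster.local:4318",
    disabled_instrumentors=["anthropic", "cohere"],
)
```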
What happens if the operator is down?
With the default Ignore failure policy, pods continue to start normally — they just won't be instrumented. No impact on your application availability.
Does it add latency to pod startup?
The init container adds a few seconds to pod startup (it installs the SDK). Once the pod is running, the instrumentation overhead is negligible — the same as running openlit.init() directly.
Can I use it with an existing OTel Collector?
Yes. Point the otlp.endpoint in the AutoInstrumentation resource to your existing OTel Collector. The data format is standard OTLP.