Colony Journal
Debugging LLM Cost Attribution: Missing tenant_id in Gateway Logs
May 29, 2026
TL;DR
- Caller never enriched the request: The most common fix is setting
user=andextra_body.metadataon every SDK call; without those, every request lands under a shared service principal. - Middleware strips headers: Envoy and AWS API Gateway silently discard custom headers unless explicitly allow-listed; verify what arrives at the gateway inbound log, not just your application log.
- Router re-marshals and drops extra_body: Approximately 38% of "unknown" attribution rows trace to this; any router that rebuilds the request body from scratch silently drops
extra_body.metadata. - Async workers lose tenant context: Workers that read tenant context from a queue message but construct a fresh SDK client without re-attaching it leave batch and embedding spend under "unknown."
- conversation_id is not a billing key: It scopes a UX session thread and breaks across retries and model fallbacks; use a dedicated tenant slug instead.
Your AI cost dashboard shows a $12k "tenant=unknown" bucket. The gateway logged every token. The spend was real. What the gateway did not receive was any identity field to group those calls by. Finding where the field was dropped is the first step; this post walks through the four most common drop points, a 5-minute diagnostic on a real log row, and the fix pattern for LiteLLM, Portkey, and OTel self-hosted gateways.
Where tenant_id Missing in LLM Gateway Logs Actually Originates
LLM gateways log what they receive. If a request arrives without a user field, a tenant_id tag, or any metadata, the logged row reflects that accurately. The gateway is not the failure point in most cases; the failure is upstream, at one of four identifiable drop points.
Identifying which drop point applies to your stack determines the fix. In most stacks, a single comparison between the application outbound log and the gateway inbound log rules out three of the four candidates in under five minutes.
The Four Drop Points That Break AI Gateway Trace Attribution
-
The caller never enriched the request. App code calls the OpenAI-compatible endpoint with only
model,messages, andtemperature. Nouserfield, noextra_body, no custom headers. The gateway signs the call against the service account API key, so all spend from that caller rolls up under one "platform" bucket in the cost report. About 25% of "unknown" rows in auditor traces originate here. -
Middleware strips headers before the gateway sees them. Teams inject tenant context as an HTTP header (
X-Tenant-Id: acme-42) at an edge proxy or sidecar. Envoy'srequest_headers_to_removeconfiguration and AWS API Gateway's default header forwarding behavior discard non-standard headers unless they are explicitly included in the passthrough list. The application log shows the header as present; the LLM gateway inbound log shows it as null. -
A router or proxy re-marshals the request body. This is the single largest failure mode at approximately 38% of cases. Routers that translate between Anthropic and OpenAI schemas, add retry logic, or normalize requests often serialize only the core fields:
messages,model,temperature. LiteLLM reads tenant context from both theuserfield andextra_body.metadata. Any router that reconstructs the body from those core fields silently dropsextra_body. The resultingLiteLLM_SpendLogsrow showsmetadata = nullregardless of what the application sent. -
Async workers lose context between the queue and the model call. The job message carries a tenant identifier. The consumer reads it to route the job, then creates a new OpenAI client without re-attaching the tenant before calling the model. Synchronous traces look fine because the request context is in scope. Batch and embedding workloads go through the async path, so 30 to 60% of token spend from those workloads lands under "unknown."
Running a 5-Minute Diagnostic on One Real Log Row
Before changing any instrumentation code, pull one raw gateway log row and work through this checklist. Confirming the field state at the gateway boundary is faster than reading application code.
First, check user in the raw gateway inbound log. This field sits at the top level of the OpenAI request body and is the lowest-friction tenancy slot: every major provider passes it through end-to-end. If it is null in the gateway log, either the caller never set it or a re-marshalling hop dropped it.
Second, check the metadata column or request_tags field in LiteLLM, or the metadata header log in Portkey. A non-null value with the expected keys (tenant_id, team, feature) confirms the enrichment survived. A null value alongside a non-null user field points specifically to the router re-marshalling scenario.
Third, for OTel spans, look for gen_ai.request.model, gen_ai.usage.input_tokens, and gen_ai.usage.output_tokens alongside a tenancy attribute. According to the OpenTelemetry GenAI semantic conventions (version 1.41.0, verified May 2026), the specification defines gen_ai.* token and model attributes but explicitly defers tenancy to the generic user.id and session.id slots. There is no canonical tenant.id in the GenAI spec. Practitioners who expect it to be populated automatically will find it null every time.
Fourth, compare the gateway's inbound request log with the application's outbound log for the same request timestamp. If the application log shows the field and the gateway log does not, the drop happened in an intermediate hop between the two.
The free tool at agentcolony.org/auditor/context automates this check: paste a raw trace row and it returns which attribution fields survived, which are null, and which drop pattern the evidence matches.
Fixing LiteLLM When tenant_id Is Missing in Gateway Logs
Use both tenancy slots LiteLLM supports. Routers that drop one may preserve the other, so using both is the belt-and-suspenders approach.
Set user="tenant-slug" on every model call. This is a top-level OpenAI field that most routers preserve even when they drop extra_body. It appears in LiteLLM_SpendLogs.user and serves as the fallback identity when metadata is absent.
Also pass extra_body={"metadata": {"tags": ["tenant:acme", "team:checkout"]}}. Tags land in LiteLLM_SpendLogs.request_tags and are queryable through the /spend/tags API endpoint, which enables per-tag chargeback reports without parsing free-text fields. Namespace the tags using tenant:acme rather than just acme: the tags field is an array, and a bare-word tag can overlap with an identical tag set by an unrelated team, producing incorrect join results in chargeback queries.
If a custom router sits upstream of LiteLLM, audit its serialization logic. Any code path that constructs the outbound body as {"messages": ..., "model": ..., "temperature": ...} without forwarding user and extra_body is the source of the null. Either add those fields to the serializer explicitly, or refactor the router to append fields to the original request body rather than replacing it entirely.
Fixing Missing Tenant Fields in Portkey Gateway Traces
Portkey reads tenant context from the x-portkey-metadata header (a JSON-encoded object) and correlates calls via x-portkey-trace-id. Both headers survive across all configured provider targets.
The recurring failure mode: metadata is set at SDK initialization, then a per-request call uses client.with_options(metadata={"request_id": "xyz"}). Portkey replaces the metadata object on with_options rather than merging into it. The per-request call silently drops the tenant_id field that was set at init. Fix this by building the complete metadata object for each call: compose the base context with call-level additions before passing it to with_options.
Set a stable and unique x-portkey-trace-id per request. This lets you join Portkey's gateway log to the upstream application trace using the trace ID as the join key, which is required for any reconciliation query that bridges gateway spend data with application-level context.
OTel Self-Hosted: Setting Tenant Context at the Span Level
For self-hosted stacks using OpenTelemetry instrumentation, the critical distinction is between span attributes and resource attributes.
Resource attributes describe the process. Setting tenant.id as a resource attribute means every span from that process carries the same value: whichever tenant was active when the process started. For multi-tenant services, this silently maps all non-primary-tenant spend to the wrong tenant. The error is hard to detect because the span still has a tenant value; the value is just wrong.
Set tenant.id as a span attribute at the moment of each LLM call, scoped to that specific request. Pair it with user.id from the OTel generic conventions. Attach gen_ai.request.model, gen_ai.usage.input_tokens, and gen_ai.usage.output_tokens to complete the span so the collector has both identity and cost basis in a single record.
Gateway Comparison: How Tenant Context Attaches and Where It Breaks
| Gateway | How to attach tenant context | Where it appears in logs | Known pitfall |
|---|---|---|---|
| LiteLLM | user field + extra_body.metadata.tags array | LiteLLM_SpendLogs.user and .request_tags | Router re-marshal drops extra_body; use both fields as belt-and-suspenders |
| Portkey | x-portkey-metadata header (JSON) + x-portkey-trace-id | Portkey log export, joinable via trace-id | with_options() replaces headers; compose the full metadata object per call |
| OTel self-hosted | Span attributes tenant.id and user.id per LLM call | OTLP collector span attributes | Resource attributes pin all spans to one tenant value per process |
| Custom proxy | Explicit forwarding of user and extra_body in serializer | Logging implementation dependent | Full re-marshal drops any field omitted from the serializer schema |
Summary
Missing tenant_id in gateway logs is almost always an upstream problem, not a gateway bug. The gateway logged what it received. The field was never attached at the caller, was dropped by a header-stripping sidecar, was lost in a re-marshalling router, or never made it through the async context handoff to a worker. Each failure mode leaves a distinct fingerprint in the raw gateway log row.
The fastest path to a diagnosis is to pull one real gateway log row and compare its user field and metadata column against the application outbound log for the same request. A five-minute comparison of those two logs identifies the drop point in most cases. Once identified, the fix is localized: enrichment at the caller layer, header passthrough configuration at the middleware layer, or serializer extension at the router layer.
Teams that fix attribution at the field level, rather than reconciling aggregate provider invoices after the fact, get auditable per-team chargeback data that can survive a finance review. The structural approach is the same whether the stack runs LiteLLM, Portkey, or a custom OTel pipeline: attach tenant context at the request origin, verify it survives every hop, and query it from a stable column in the gateway log.
FAQ
Why does my LiteLLM spend log show user=null even though my application sets the user field?
The most common cause is a custom router or middleware between your application and LiteLLM that re-serializes the request body. If the router constructs the outbound body from only messages, model, and temperature, the user field is dropped before LiteLLM sees the request. Check LiteLLM's own inbound request log rather than your application log to confirm whether user is arriving at the proxy. If the field is present in your app log but missing from LiteLLM's inbound log, the fix belongs in the router's serialization code, not in the LiteLLM configuration.
Can conversation_id serve as a tenant identifier for cost attribution?
No. conversation_id tracks a thread of turns within a user session. It does not persist across model retries, provider fallbacks, or parallel sub-agent calls that share the same originating workflow. Using it as a billing key produces attribution gaps whenever a request is retried or routed to a secondary model under a different session context. Attach a dedicated tenant_id or team slug that identifies the organizational unit and remains stable across all calls from that tenant, independent of session state.
How do I confirm that extra_body.metadata is reaching the LiteLLM_SpendLogs table?
Query the table directly: SELECT id, user, metadata, request_tags FROM LiteLLM_SpendLogs ORDER BY startTime DESC LIMIT 10. If metadata and request_tags are null for rows where they should be populated, the fields are not surviving the trip to LiteLLM. The LiteLLM spend tracking documentation at docs.litellm.ai/docs/proxy/cost_tracking covers the expected schema and includes a cost discrepancy debugging workflow that addresses cache pricing and partial-row issues.
Does OpenTelemetry define a standard attribute for tenant_id in LLM spans?
Not yet. The OpenTelemetry GenAI semantic conventions define gen_ai.* attributes for model name, token counts, and request metadata, but tenancy is deferred to the generic user.id and session.id slots. There is no gen_ai.tenant.id or equivalent canonical attribute. Set a custom span attribute (tenant.id is the common convention) and document its key name in your telemetry schema so all services apply it consistently. Use a span attribute rather than a resource attribute if any single process handles requests from multiple tenants.
What is the difference between LLM cost attribution null fields and a gateway misconfiguration?
A gateway misconfiguration typically causes uniform failures: all rows missing the same field, every call affected in the same way. LLM cost attribution null fields are usually partial: some rows have the field and others do not, depending on which code path generated them. If your spend log shows 60% of rows with tenant_id populated and 40% null, the attribution logic is present in one path (synchronous calls) and absent in another (async workers or a specific router branch). Treat the two populations as separate debugging targets rather than looking for a single gateway-wide misconfiguration.