@kaged/llm

Pure-fetch LLM provider interface supporting Anthropic, OpenAI, Google, and Antigravity API shapes with SSE streaming, cost calculation, and model discovery

41 source files · 19 test files · ~13.4k lines
Tests: ✓ 358 pass · Typecheck: pass · Lint: clean

Test results (358)

resolveApiShape > anthropic → anthropic-messages [0.060ms]
resolveApiShape > openai → openai-completions [0.020ms]
resolveApiShape > codex → openai-codex-responses [0.020ms]
resolveApiShape > google → google-generative-ai [0.020ms]
resolveApiShape > openai-compatible providers map correctly [0.060ms]
resolveApiShape > unknown provider returns undefined [0.030ms]
getDefaultBaseUrl > anthropic base URL [0.030ms]
getDefaultBaseUrl > openai base URL [0.020ms]
getDefaultBaseUrl > codex base URL [0.020ms]
getDefaultBaseUrl > ollama is localhost [0.020ms]
getDefaultBaseUrl > unknown returns undefined [0.020ms]
knownProviders > returns all 14 providers [0.060ms]
streamModel > unknown provider returns error event immediately [0.900ms]
streamModel > routes anthropic to /v1/messages [4.85ms]
streamModel > routes openai to /v1/chat/completions [1.81ms]
streamModel > routes groq (openai-compatible) to /v1/chat/completions [0.300ms]
streamModel > routes google to :streamGenerateContent [1.75ms]
streamModel > uses default base URL when route has none [0.480ms]
completeModel > returns final AssistantMessage [0.350ms]
completeModel > unknown provider returns error message [0.100ms]
estimateTokens — algorithm selection > uses tiktoken for anthropic models [167.21ms]
estimateTokens — algorithm selection > uses tiktoken for openai models [0.180ms]
estimateTokens — algorithm selection > uses fallback for google/gemini models [0.070ms]
estimateTokens — algorithm selection > uses fallback for groq models [0.040ms]
estimateTokens — algorithm selection > uses fallback when modelMeta is null [0.030ms]
estimateTokens — reservedOutputTokens > defaults to 4096 [0.120ms]
estimateTokens — reservedOutputTokens > echoes custom value [0.110ms]
estimateTokens — reservedOutputTokens > totalTokens = inputTokens + reservedOutputTokens [0.160ms]
estimateTokens — context window and fraction > contextWindow comes from modelMeta.maxInputTokens [0.120ms]
estimateTokens — context window and fraction > contextWindow is null when modelMeta is null [0.030ms]
estimateTokens — context window and fraction > fraction is totalTokens / contextWindow [0.110ms]
estimateTokens — context window and fraction > fraction uses FALLBACK_CONTEXT_WINDOW when modelMeta is null [0.040ms]
estimateTokens — token counting > inputTokens increases with more messages [0.200ms]
estimateTokens — token counting > inputTokens increases with longer messages [0.140ms]
estimateTokens — token counting > inputTokens increases with longer system prompt [0.220ms]
estimateTokens — token counting > system prompt as array is joined and counted [0.180ms]
estimateTokens — token counting > empty messages and empty system prompt produces minimal tokens [0.060ms]
estimateTokens — message types > counts system messages [0.080ms]
estimateTokens — message types > counts tool result messages [0.160ms]
estimateTokens — message types > counts assistant tool call messages [0.130ms]
estimateTokens — message types > counts thinking content in assistant messages [0.150ms]
estimateTokens — message types > counts multimodal user messages with images [0.170ms]
estimateTokens — conservative estimation > estimate is non-zero for any non-empty input [0.110ms]
estimateTokens — conservative estimation > tiktoken and fallback both produce positive counts for same input [0.140ms]
ModelMeta — tokenizer field > anthropic models have tiktoken tokenizer [0.060ms]
ModelMeta — tokenizer field > openai models have tiktoken tokenizer [0.040ms]
ModelMeta — tokenizer field > google models have gemini tokenizer [0.030ms]
ModelMeta — tokenizer field > groq models have llama tokenizer [0.040ms]
ModelMeta — tokenizer field > deepseek models have llama tokenizer [0.040ms]
ModelMeta — tokenizer field > xai models have unknown tokenizer [0.040ms]
estimateTokens — large message lists > handles 100 messages without error [2.89ms]
estimateTokens — plugin memory in system prompt > counts plugin-wrapped content in system prompt [0.620ms]
kagedModel — LanguageModelV2 shape > specificationVersion is 'v2' [0.060ms]
kagedModel — LanguageModelV2 shape > provider and modelId reflect the route [0.030ms]
kagedModel — LanguageModelV2 shape > exposes supportedUrls (empty map for v0) [0.030ms]
kagedModel — LanguageModelV2 shape > doStream and doGenerate are functions [0.020ms]
kagedModel — inbound mapping (CallOptions → kaged Context + StreamOptions) > system message becomes systemPrompt[] [1.03ms]
kagedModel — inbound mapping (CallOptions → kaged Context + StreamOptions) > multiple system messages collect into systemPrompt[] [0.240ms]
kagedModel — inbound mapping (CallOptions → kaged Context + StreamOptions) > user text-only content maps to UserMessage with string content [0.220ms]
kagedModel — inbound mapping (CallOptions → kaged Context + StreamOptions) > assistant text content maps to AssistantMessage [0.270ms]
kagedModel — inbound mapping (CallOptions → kaged Context + StreamOptions) > tool message maps to ToolResultMessage [0.390ms]
kagedModel — inbound mapping (CallOptions → kaged Context + StreamOptions) > tool message with multiple tool-result parts emits one ToolResultMessage per part [0.370ms]
kagedModel — inbound mapping (CallOptions → kaged Context + StreamOptions) > tool message with zero valid parts produces no messages [0.200ms]
kagedModel — inbound mapping (CallOptions → kaged Context + StreamOptions) > tools array translates to Context.tools [0.220ms]
kagedModel — inbound mapping (CallOptions → kaged Context + StreamOptions) > provider-defined tools are skipped (kaged proxies function tools only) [0.180ms]
kagedModel — inbound mapping (CallOptions → kaged Context + StreamOptions) > temperature, maxOutputTokens, topP, stopSequences map to StreamOptions [0.200ms]
kagedModel — inbound mapping (CallOptions → kaged Context + StreamOptions) > abortSignal is forwarded [0.220ms]
kagedModel — inbound mapping (CallOptions → kaged Context + StreamOptions) > headers are forwarded (undefined values stripped) [0.170ms]
kagedModel — inbound mapping (CallOptions → kaged Context + StreamOptions) > toolChoice 'auto' maps to 'auto' [0.150ms]
kagedModel — inbound mapping (CallOptions → kaged Context + StreamOptions) > toolChoice 'tool' maps to { type: 'function', name } [0.170ms]
kagedModel — outbound mapping (StreamEvent → LanguageModelV2StreamPart) > stream-start is emitted before any kaged events [0.490ms]
kagedModel — outbound mapping (StreamEvent → LanguageModelV2StreamPart) > text_delta becomes text-delta with stable id and matching delta [0.450ms]
kagedModel — outbound mapping (StreamEvent → LanguageModelV2StreamPart) > thinking_delta becomes reasoning-delta [0.300ms]
kagedModel — outbound mapping (StreamEvent → LanguageModelV2StreamPart) > toolcall_end becomes a single tool-call part with JSON-stringified input [0.270ms]
kagedModel — outbound mapping (StreamEvent → LanguageModelV2StreamPart) > done with reason 'stop' maps to finish with reason 'stop' and usage [0.220ms]
kagedModel — outbound mapping (StreamEvent → LanguageModelV2StreamPart) > done with reason 'length' maps to finish reason 'length' [0.200ms]
kagedModel — outbound mapping (StreamEvent → LanguageModelV2StreamPart) > done with reason 'toolUse' maps to finish reason 'tool-calls' [0.190ms]
kagedModel — outbound mapping (StreamEvent → LanguageModelV2StreamPart) > error event emits an error part followed by a finish with reason 'error' [0.260ms]
kagedModel — doGenerate > returns content with text blocks, finishReason, usage, and empty warnings [0.310ms]
kagedModel — doGenerate > translates tool calls in content [0.210ms]
kagedModel — doGenerate > translates thinking content to reasoning blocks [0.150ms]
lookupModelMeta — key normalization > anthropic provider resolves bare-key models (claude-sonnet-4-20250514) [0.050ms]
lookupModelMeta — key normalization > openai provider resolves bare-key models (gpt-4o) [0.040ms]
lookupModelMeta — key normalization > google provider resolves gemini/-prefixed models [0.030ms]
lookupModelMeta — key normalization > xai provider resolves xai/-prefixed models [0.030ms]
lookupModelMeta — key normalization > groq provider resolves groq/-prefixed models [0.030ms]
lookupModelMeta — key normalization > deepseek provider resolves deepseek/-prefixed models [0.020ms]
lookupModelMeta — unknown models > returns null for unknown provider [0.020ms]
lookupModelMeta — unknown models > returns null for known provider but unknown model [0.020ms]
lookupModelMeta — unknown models > returns null for empty strings [0.030ms]
lookupModelMeta — capabilities > Claude Sonnet reports reasoning and vision capabilities [0.040ms]
lookupModelMeta — capabilities > GPT-4o reports vision and function calling [0.030ms]
lookupModelMeta — capabilities > xai grok-3-mini reports reasoning support [0.030ms]
lookupModelMeta — capabilities > mode is always 'chat' for catalog entries [0.030ms]
lookupModelMeta — pricing > extracts input and output cost per token [0.030ms]
lookupModelMeta — pricing > extracts cache pricing when available [0.030ms]
lookupModelMeta — pricing > cache pricing is null when model lacks it [0.030ms]
lookupModelMeta — pricing > extracts token limits [0.030ms]
calculateCost — with metadata > computes cost from usage and pricing [0.150ms]
calculateCost — with metadata > reasoning tokens are subtracted from output tokens for costing [0.060ms]
calculateCost — with metadata > zero usage produces zero cost [0.050ms]
calculateCost — with metadata > output tokens never go negative when reasoningTokens exceeds output [0.050ms]
calculateCost — null meta > returns all-zero breakdown when meta is null [0.050ms]
calculateCost — reasoning fallback > uses output rate when reasoning rate is null [0.050ms]
resolveModelMeta — with LiteLLM defaults > no overrides returns default meta with all sources as 'default' [0.190ms]
resolveModelMeta — with LiteLLM defaults > scalar override replaces default and marks source as 'override' [0.130ms]
resolveModelMeta — with LiteLLM defaults > nested pricing override with dot notation [0.090ms]
resolveModelMeta — with LiteLLM defaults > capability boolean override [0.060ms]
resolveModelMeta — with LiteLLM defaults > multiple overrides apply simultaneously [0.080ms]
resolveModelMeta — with LiteLLM defaults > does not mutate the original LiteLLM default [0.060ms]
resolveModelMeta — model not in LiteLLM > builds synthetic meta from overrides only [0.130ms]
resolveModelMeta — model not in LiteLLM > missing fields default to null/false [0.060ms]
resolveModelMeta — model not in LiteLLM > all sources are 'default' with no overrides [0.050ms]
resolveModelMeta — null value handling > override with null value sets field to null [0.050ms]
resolveModelMeta — null value handling > override with null pricing field [0.050ms]
humanizeModelId > replaces hyphens with spaces and title-cases [0.100ms]
humanizeModelId > replaces underscores with spaces and title-cases [0.020ms]
humanizeModelId > handles mixed separators [0.030ms]
humanizeModelId > collapses consecutive separators [0.020ms]
humanizeModelId > returns single word title-cased [0.020ms]
humanizeModelId > preserves already-capitalized letters [0.020ms]
humanizeModelId > handles empty string [0.020ms]
humanizeModelId > preserves dots and version numbers [0.020ms]
humanizeModelId > handles real model IDs [0.030ms]
parsePartialJson > valid complete JSON object [0.090ms]
parsePartialJson > empty string returns {} [0.020ms]
parsePartialJson > whitespace-only returns {} [0.020ms]
parsePartialJson > non-object (array) returns {} [0.020ms]
parsePartialJson > non-object (null) returns {} [0.010ms]
parsePartialJson > non-object (number) returns {} [0.020ms]
parsePartialJson > non-object (string) returns {} [0.020ms]
parsePartialJson > closes unclosed object [0.220ms]
parsePartialJson > closes unclosed string value [0.050ms]
parsePartialJson > closes nested objects [0.030ms]
parsePartialJson > closes nested array in object [0.040ms]
parsePartialJson > handles trailing comma [0.030ms]
parsePartialJson > handles incomplete key-value after comma [0.030ms]
parsePartialJson > handles incomplete first key-value (colon only) [0.020ms]
parsePartialJson > complex nested partial JSON [0.050ms]
parsePartialJson > escaped quotes in string preserved [0.030ms]
parsePartialJson > escaped backslash not treated as escape [0.020ms]
parsePartialJson > unparseable garbage returns {} [0.030ms]
parsePartialJson > deeply nested partial [0.030ms]
parsePartialJson > boolean values [0.030ms]
parsePartialJson > null value in object [0.040ms]
parseSseStream > parses basic event + data frame [0.380ms]
parseSseStream > null event when no event field [0.200ms]
parseSseStream > joins multi-line data with newline [0.150ms]
parseSseStream > ignores comment lines [0.140ms]
parseSseStream > comment-only frame yields nothing [0.130ms]
parseSseStream > [DONE] sentinel terminates stream [0.200ms]
parseSseStream > multiple frames in single chunk [0.230ms]
parseSseStream > frame split across chunks [0.210ms]
parseSseStream > strips optional space after colon [0.160ms]
parseSseStream > preserves space when colon has space [0.140ms]
parseSseStream > flushes trailing data without final double-newline [0.130ms]
parseSseStream > empty body yields no frames [0.100ms]
parseSseStream > parses JSON in data field [0.200ms]
parseSseStream > handles empty data lines [0.170ms]
parseSseStream > [DONE] with trailing whitespace before double-newline [0.120ms]
LlmEventStream > yields pushed events via for-await [1.09ms]
LlmEventStream > result() resolves with done message [2.21ms]
LlmEventStream > result() resolves with error AssistantMessage on error event [0.230ms]
LlmEventStream > abort() rejects result() promise [0.160ms]
LlmEventStream > push after done is ignored [0.180ms]
LlmEventStream > async push wakes waiting consumer [5.60ms]
LlmEventStream > abort() ends iteration and rejects result() [5.31ms]
LlmEventStream > multiple text deltas accumulate in order [5.45ms]
LlmEventStream > error event stops iteration [0.160ms]
device-code OAuth > startDeviceCodeFlow returns user code payload [0.530ms]
device-code OAuth > pollDeviceCodeToken handles authorization_pending and slow_down before success [855.70ms]
device-code OAuth > loginDeviceCode stores tokens after background polling completes [311.52ms]
streamAnthropic: text streaming > emits start → text_start → text_deltas → text_end → done [1.80ms]
streamAnthropic: text streaming > accumulates text in partial message [1.23ms]
streamAnthropic: text streaming > extracts usage from message_start and message_delta [1.20ms]
streamAnthropic: text streaming > stopReason is 'stop' for end_turn [1.20ms]
streamAnthropic: tool use > emits toolcall_start → toolcall_delta → toolcall_end [1.04ms]
streamAnthropic: tool use > stopReason is 'toolUse' [1.26ms]
streamAnthropic: tool use > tool call has correct name and id [1.04ms]
streamAnthropic: tool-use streaming argument reconstruction > key split across deltas — mid-key boundary preserves number arg [1.41ms]
streamAnthropic: tool-use streaming argument reconstruction > value split across deltas — mid-string-value preserves both args [0.880ms]
streamAnthropic: tool-use streaming argument reconstruction > numeric value split across deltas — multi-digit number reconstructs [0.810ms]
streamAnthropic: tool-use streaming argument reconstruction > escape sequence split across deltas — escaped quote preserved [0.760ms]
streamAnthropic: tool-use streaming argument reconstruction > many small deltas — single-character chunks reconstruct full args [2.69ms]
streamAnthropic: tool-use streaming argument reconstruction > empty args (no deltas) — tool call still emits empty object [0.780ms]
streamAnthropic: tool-use streaming argument reconstruction > toolcall_delta events carry the raw partial_json chunk that was received [1.23ms]
streamAnthropic: tool-use streaming argument reconstruction > parallel tool calls with interleaved deltas — each buffer is isolated by index [1.71ms]
streamAnthropic: thinking > emits thinking events before text [0.820ms]
streamAnthropic: thinking > thinking content captured in message [0.920ms]
streamAnthropic: error handling > HTTP error returns error event [0.760ms]
streamAnthropic: error handling > null body returns error event [0.510ms]
streamAnthropic: error handling > fetch network error returns error event [1.80ms]
streamAnthropic: request format > sends X-Api-Key for direct API [0.730ms]
streamAnthropic: request format > sends Bearer token for proxied endpoint [0.690ms]
streamAnthropic: request format > sends model in request body [0.730ms]
streamAnthropic: request format > includes system prompt when provided [0.670ms]
streamAnthropic: request format > stream is always true [0.560ms]
streamAnthropic: request format > consecutive toolResult messages merge into one user message with multiple tool_result blocks [1.59ms]
streamAnthropic: request format > empty text blocks are stripped from user content arrays [1.07ms]
streamAnthropic: request format > empty text blocks are stripped from assistant content arrays [1.02ms]
streamAnthropic: request format > empty text blocks are stripped from toolResult content arrays [0.990ms]
streamAnthropic: request format > single toolResult still produces one user message with one tool_result block [0.960ms]
streamAntigravity: text streaming > emits start → text_start → text_deltas → text_end → done [7.72ms]
streamAntigravity: text streaming > accumulates text content [1.04ms]
streamAntigravity: text streaming > stopReason is 'stop' for STOP finishReason [0.720ms]
streamAntigravity: usage tracking > extracts usage from usageMetadata [0.670ms]
streamAntigravity: usage tracking > updates usage from every SSE frame, not just last [0.750ms]
streamAntigravity: usage tracking > handles usageMetadata-only frames gracefully [1.06ms]
streamAntigravity: usage tracking > captures reasoning tokens in usage [0.910ms]
streamAntigravity: tool calls > emits toolcall_start + toolcall_end for functionCall [0.750ms]
streamAntigravity: tool calls > tool call has correct name and arguments [0.720ms]
streamAntigravity: tool calls > stopReason is 'toolUse' when function calls present [0.880ms]
streamAntigravity: thinking > emits thinking events for thought parts [0.680ms]
streamAntigravity: thinking > captures thinking content and text [0.940ms]
streamAntigravity: thinking > preserves thinking and text from split and coalesced SSE frames [0.860ms]
streamAntigravity: thinking > interleaved thought parts stay positioned in order, not merged [1.05ms]
streamAntigravity: thinking > interleaved thought parts emit paired start/end events per burst [0.820ms]
streamAntigravity: thinking > parses newline-only terminated data lines (real Antigravity wire format) [0.800ms]
streamAntigravity: error handling > HTTP error returns error event [0.400ms]
streamAntigravity: error handling > null body returns error event [0.350ms]
streamAntigravity: error handling > 429 extracts rate-limit info from structured body [1.26ms]
streamAntigravity: error handling > 429 falls back to parsing duration from message text [0.530ms]
streamAntigravity: error handling > 429 uses retry-after-ms header when present [0.410ms]
streamAntigravity: error handling > SAFETY finishReason produces error stop reason [0.630ms]
streamAntigravity: request format > sends to v1internal:streamGenerateContent URL (model NOT in path) [0.500ms]
streamAntigravity: request format > sends Bearer token in Authorization header [0.680ms]
streamAntigravity: request format > sends Antigravity User-Agent header [0.540ms]
streamAntigravity: request format > does NOT send X-Goog-Api-Client header (Antigravity mode) [0.530ms]
streamAntigravity: request format > wraps body in Antigravity envelope {project, model, request, requestType, userAgent, requestId} [0.710ms]
streamAntigravity: request format > does not include model in URL path (unlike Google) [0.520ms]
streamAntigravity: request format > includes systemInstruction when system prompt provided [0.580ms]
streamAntigravity: request format > Gemini model uses includeThoughts in thinkingConfig [0.800ms]
streamAntigravity: request format > Claude model uses snake_case thinking config (include_thoughts, thinking_budget) [0.580ms]
streamAntigravity: request format > Claude model high effort uses thinking_budget 32768 [0.590ms]
streamAntigravity: request format > Gemini model high effort uses -1 (unlimited) budget [0.590ms]
streamAntigravity: request format > does not send generationConfig when no options set [0.560ms]
streamAntigravity: request format > uses projectId from route.defaultOptions when available [0.670ms]
streamAntigravity: request format > generates synthetic project ID when none provided [0.580ms]
streamAntigravity: request format > systemInstruction has role 'user' and includes Antigravity prefix [0.590ms]
streamAntigravity: request format > Claude model sets toolConfig functionCallingConfig mode VALIDATED [2.72ms]
streamAntigravity: request format > Gemini model does NOT set toolConfig mode [1.09ms]
streamAntigravity: request format > strips unsupported schema keywords from tool parameters [0.690ms]
streamAntigravity: request format > Claude tool with empty schema gets placeholder property [0.620ms]
streamAntigravity: request format > Gemini-3 model uses thinkingLevel string in thinkingConfig [0.630ms]
streamAntigravity: request format > requestId is unique per request [0.680ms]
streamAntigravity: request format > strips thinking blocks from assistant messages in contents [0.590ms]
streamAntigravity: request format > propagates tool call id into functionCall and functionResponse in contents [0.890ms]
streamAntigravity: request format > drops empty user and assistant history turns after Antigravity sanitation [0.780ms]
streamAntigravity: Claude thinking model specifics > sends anthropic-beta header for Claude thinking model [0.430ms]
streamAntigravity: Claude thinking model specifics > does NOT send anthropic-beta for non-thinking Claude model [0.390ms]
streamAntigravity: Claude thinking model specifics > does NOT send anthropic-beta for Gemini model [0.420ms]
streamAntigravity: Claude thinking model specifics > defaults maxOutputTokens to 64000 when thinking budget is set [0.410ms]
streamAntigravity: Claude thinking model specifics > keeps higher maxOutputTokens if explicitly set above budget [0.480ms]
streamAntigravity: Claude thinking model specifics > appends interleaved thinking hint to system instruction when tools present [0.650ms]
streamAntigravity: Claude thinking model specifics > does NOT append interleaved thinking hint when no tools [0.510ms]
streamAntigravity: Claude thinking model specifics > Claude thinking uses snake_case thinking config with include_thoughts [0.480ms]
streamAntigravity: Claude schema sanitization > uses Antigravity schema hints instead of Gemini nullable fields [0.920ms]
streamAntigravity: Claude schema sanitization > flattens unions and removes unsupported draft keywords for Claude [1.05ms]
streamAntigravity: Claude schema sanitization > adds the reference placeholder for empty Claude object schemas [0.550ms]
streamAntigravity: protobuf schema sanitization > converts nullable union type [T, null] to single type + nullable [0.660ms]
streamAntigravity: protobuf schema sanitization > converts multi-type union [string, number] to anyOf branches [0.530ms]
streamAntigravity: protobuf schema sanitization > resolves $ref against $defs and inlines the definition [0.710ms]
streamAntigravity: protobuf schema sanitization > resolves $ref against definitions (draft-07 keyword) [0.500ms]
streamAntigravity: protobuf schema sanitization > unresolvable $ref falls back to a valid string schema (not empty) [0.450ms]
streamAntigravity: protobuf schema sanitization > converts oneOf to anyOf [0.630ms]
streamAntigravity: protobuf schema sanitization > merges allOf object branches into one object schema [0.870ms]
streamAntigravity: protobuf schema sanitization > recurses into array items that are objects (nested object arrays) [0.490ms]
streamAntigravity: protobuf schema sanitization > strips enum on non-string types but keeps the base type [0.470ms]
streamAntigravity: protobuf schema sanitization > keeps enum on string types [0.420ms]
streamAntigravity: protobuf schema sanitization > removes null branch from anyOf and sets nullable [0.430ms]
streamAntigravity: protobuf schema sanitization > sanitizes deeply nested union types at every depth [0.460ms]
streamGoogle: text streaming > emits start → text_start → text_deltas → text_end → done [0.490ms]
streamGoogle: text streaming > accumulates text content [0.340ms]
streamGoogle: text streaming > extracts usage from usageMetadata [0.300ms]
streamGoogle: text streaming > stopReason is 'stop' for STOP finishReason [0.290ms]
streamGoogle: tool calls > emits toolcall_start + toolcall_end for functionCall [0.310ms]
streamGoogle: tool calls > tool call has correct name and arguments [0.300ms]
streamGoogle: tool calls > stopReason is 'toolUse' when function calls present [0.260ms]
streamGoogle: thinking > emits thinking events for thought parts [0.300ms]
streamGoogle: thinking > captures thinking content and text [0.350ms]
streamGoogle: thinking > captures reasoning tokens in usage [0.300ms]
streamGoogle: thinking > interleaved thought parts stay positioned in order, not merged [0.410ms]
streamGoogle: thinking > interleaved thought parts emit paired start/end events per burst [0.410ms]
streamGoogle: error handling > HTTP error returns error event [0.230ms]
streamGoogle: error handling > null body returns error event [0.170ms]
streamGoogle: request format > sends to :streamGenerateContent URL with model [0.300ms]
streamGoogle: request format > sends x-goog-api-key header [0.230ms]
streamGoogle: request format > sends contents (not messages) in body [0.230ms]
streamGoogle: request format > includes systemInstruction when system prompt provided [0.250ms]
streamGoogle: request format > handles usageMetadata-only frames gracefully [0.240ms]
streamGoogle: request format > sends thinkingConfig when reasoning option is set [0.390ms]
streamGoogle: request format > does not send thinkingConfig when reasoning is absent [0.260ms]
streamGoogle: request format > thinkingBudget is -1 for high effort [0.260ms]
streamGoogle: request format > thinkingBudget is 128 for minimal effort [0.270ms]
streamGoogle: protobuf schema sanitization > converts nullable union type [T, null] to single type + nullable [0.500ms]
streamGoogle: protobuf schema sanitization > resolves $ref against $defs and strips unsupported keywords [0.330ms]
streamGoogle: protobuf schema sanitization > recurses into nested object arrays and converts oneOf to anyOf [0.380ms]
streamOpenAICompletions: text streaming > emits start → text_start → text_deltas → text_end → done [0.470ms]
streamOpenAICompletions: text streaming > accumulates text content [0.380ms]
streamOpenAICompletions: text streaming > extracts usage from final chunk [0.360ms]
streamOpenAICompletions: text streaming > stopReason is 'stop' [0.350ms]
streamOpenAICompletions: tool calls > emits toolcall events [0.580ms]
streamOpenAICompletions: tool calls > tool call has correct name and id [0.440ms]
streamOpenAICompletions: tool calls > stopReason is 'toolUse' [0.390ms]
streamOpenAICompletions: reasoning > emits thinking events for reasoning_content [0.400ms]
streamOpenAICompletions: reasoning > thinking content captured [0.320ms]
streamOpenAICompletions: reasoning > interleaved reasoning keeps each burst as a positioned block in order [0.380ms]
streamOpenAICompletions: reasoning > interleaved reasoning emits paired start/end events for each burst [0.340ms]
streamOpenAICompletions: error handling > HTTP error returns error event [0.190ms]
streamOpenAICompletions: error handling > null body returns error event [0.120ms]
streamOpenAICompletions: request format > sends Authorization Bearer header [0.240ms]
streamOpenAICompletions: request format > sends model and stream in body [0.250ms]
streamOpenAICompletions: request format > includes stream_options with include_usage [0.230ms]
streamOpenAIResponses: text streaming > emits start → text_start → text_deltas → text_end → done [2.06ms]
streamOpenAIResponses: text streaming > accumulates text in partial message [0.340ms]
streamOpenAIResponses: text streaming > extracts usage from response.completed [0.280ms]
streamOpenAIResponses: text streaming > stopReason is 'stop' for completed response [0.300ms]
streamOpenAIResponses: tool calls > emits toolcall events [0.360ms]
streamOpenAIResponses: tool calls > tool call has correct name and arguments [0.340ms]
streamOpenAIResponses: tool calls > stopReason is 'toolUse' when tool calls present [0.360ms]
streamOpenAIResponses: error handling > HTTP error returns error event [0.220ms]
streamOpenAIResponses: error handling > null body returns error event [0.130ms]
streamOpenAIResponses: request format > sends to /v1/responses [0.320ms]
streamOpenAIResponses: request format > sends Bearer auth header [0.280ms]
streamOpenAIResponses: request format > uses input (not messages) in body [0.260ms]
fetchFireworksUsage: returns null for missing API key > no apiKey returns null [0.480ms]
fetchFireworksUsage: normal usage report > returns report with monthly-spend-usd as primary limit [1.53ms]
fetchFireworksUsage: normal usage report > includes secondary limits [0.240ms]
fetchFireworksUsage: budget exhausted > status is exhausted when spend equals budget [0.250ms]
fetchFireworksUsage: budget warning at 90%+ > status is warning when usage >= 90% of budget [0.170ms]
fetchFireworksUsage: error handling > returns null when accounts endpoint fails [0.130ms]
fetchFireworksUsage: error handling > returns null when accounts list is empty [0.100ms]
fetchFireworksUsage: error handling > returns null when account has no name [0.090ms]
fetchFireworksUsage: error handling > returns null when quotas endpoint fails [0.100ms]
fetchFireworksUsage: error handling > returns null when quotas list is empty [0.090ms]
fetchFireworksUsage: metadata > includes accountId and displayName in metadata [0.160ms]
fetchFireworksUsage: metadata > includes accountId in scope for each limit [0.310ms]
fetchFireworksUsage: custom baseUrl > uses custom baseUrl for API calls [0.400ms]
codex constants > extractCodexAccountId returns claim from valid JWT [0.440ms]
codex constants > extractCodexAccountId returns undefined when claim missing [0.050ms]
codex constants > extractCodexAccountId returns undefined for invalid input [0.100ms]
codex constants > extractCodexProfile extracts accountId and normalized email [0.130ms]
codex constants > constants match expected Codex values [0.080ms]
transformCodexRequest > transforms basic user and assistant messages into input array [0.450ms]
transformCodexRequest > prepends system prompts as developer messages [0.070ms]
transformCodexRequest > maps tool calls and tool results [0.120ms]
transformCodexRequest > includes reasoning config when options.reasoning is set [0.040ms]
transformCodexRequest > always sets store false and stream true [0.040ms]
transformCodexRequest > disables parallel tool calls when tools are present [0.090ms]
parseCodexError > parses rate limit headers [0.820ms]
parseCodexError > maps usage_limit_reached to friendly message [0.150ms]
parseCodexError > maps rate_limit_exceeded to friendly message with reset time [0.180ms]
parseCodexError > handles non-JSON error bodies [0.180ms]
parseCodexError > returns normal error when no rate limit headers present [0.120ms]
copilot constants > getCopilotBaseUrl resolves public and enterprise hosts [0.150ms]
copilot constants > parseCopilotApiKey handles raw token and enterprise JSON payloads [0.120ms]
copilot constants > enterprise URL normalization rejects public GitHub hosts [0.110ms]
copilot constants > constants match expected Copilot values [0.060ms]
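
The parseSseStream cases listed above (event and data fields, comment lines, multi-line data joined with a newline, the [DONE] sentinel, and a trailing frame without a final blank line) can be illustrated with a small sketch. This is not the package's implementation: the name parseSseTextSketch and the SseFrame shape are hypothetical, and the sketch works on a complete string rather than a chunked response body, so it does not cover the "frame split across chunks" case.

```typescript
// Hypothetical frame shape; the real type in @kaged/llm may differ.
interface SseFrame {
  event: string | null; // null when the frame has no `event:` field
  data: string;         // multiple `data:` lines joined with "\n"
}

// Illustrative only: parses an already-buffered SSE payload into frames.
function parseSseTextSketch(text: string): SseFrame[] {
  const frames: SseFrame[] = [];
  let event: string | null = null;
  let dataLines: string[] = [];

  // Emits the pending frame, if any. Returns true when the frame is the
  // [DONE] sentinel, which ends parsing.
  const flush = (): boolean => {
    const pendingEvent = event;
    const pendingData = dataLines;
    event = null;
    dataLines = [];
    if (pendingData.length === 0) return false; // comment-only or empty frame
    const data = pendingData.join("\n");
    if (data.trim() === "[DONE]") return true;
    frames.push({ event: pendingEvent, data });
    return false;
  };

  for (const line of text.split(/\r?\n/)) {
    if (line === "") {
      // A blank line closes the current frame.
      if (flush()) return frames;
      continue;
    }
    if (line.startsWith(":")) continue; // comment line
    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.startsWith(" ")) value = value.slice(1); // strip one optional space
    if (field === "event") event = value;
    else if (field === "data") dataLines.push(value);
    // Other fields (id, retry) are ignored in this sketch.
  }

  flush(); // trailing data without a final blank line
  return frames;
}
```

Calling parseSseTextSketch on a payload that contains a data: [DONE] frame returns only the frames that precede it; anything after the sentinel is discarded.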

Mentioned in

Type  Document
adr   ADR-0013: Observability substrate is Langfuse, self-hosted, optional
adr   ADR-0014: All LLM providers route through @kaged/llm; Mastra integrates via a LanguageModelV2 shim
adr   ADR-0024: Context compaction is kaged-owned, layered, observable, and operator-tunable
adr   ADR-0026: Cost management, model metadata overrides, and provider usage tracking
adr   ADR-0028: 3rd-party OAuth provider auth — token lifecycle and credential management
spec  Spec: Agent Harness
spec  Spec: HTTP + WebSocket API
spec  Spec: LLM Provider Interface
spec  Spec: Local config