Spec: Operational Logging

  • Status: Draft
  • Last amended: 2026-05-31
  • Constrained by: ADR-0004, ADR-0005, ADR-0008, ADR-0013, ADR-0029
  • Implements: packages/utils/src/logger.ts (file sink), packages/storage/src/schema.ts (SQLite sink), packages/daemon/src/ (logger adoption + HTTP endpoints), packages/ui/src/components/log/ (log drawer)

Purpose

This spec defines the structured operational logging system for the kaged daemon. It covers the dual-sink write pipeline (SQLite + rotating files), the log schema, levels, sources, retention, the HTTP endpoints that feed the UI log drawer, the plugin logging protocol, and the local.toml configuration surface.

It is not normative for:

  • Audit logging (/api/v1/audit and the audit_events table) — that is a separate policy-level concern.
  • Langfuse tracing (ADR-0013) — that is LLM observability, handled via Mastra's native @mastra/observability + @mastra/langfuse pipeline, not operational logging.
  • The UI log drawer's visual design or component structure — that's ui/. This spec defines the data pipeline that feeds it.

Constraints (from ADRs)

Constraint | Source
-----------|-------
Runtime is Bun; no Node-isms in the daemon | ADR-0004
Storage default is SQLite; portable SQL | ADR-0005
Plugins are subprocesses over JSON-RPC on stdio | ADR-0008
Operational logs are the fallback when Langfuse is absent | ADR-0013 (Langfuse tracing via Mastra's native @mastra/observability pipeline)

Log entry schema

Every log entry has this shape, written to both sinks:

interface OperationalLogEntry {
  /** ULID, client-generated. Unique across the table. */
  id: string;
  /** Epoch milliseconds. */
  ts: number;
  /** Log level. */
  level: "debug" | "info" | "warn" | "error";
  /** Source category — maps to UI filter chips. */
  source: "daemon" | "plugin" | "session" | "subagent";
  /** Human-readable message. Single line. */
  message: string;
  /** Project scope. null for daemon-level logs (startup, shutdown, config). */
  projectId: string | null;
  /** Session scope. null for non-session logs. */
  sessionId: string | null;
  /** Plugin name. Set when source is "plugin". null otherwise. */
  pluginName: string | null;
  /** Arbitrary structured context. JSON-serialised to TEXT in SQLite. */
  context: Record<string, unknown> | null;
}

SQLite table

CREATE TABLE IF NOT EXISTS logs (
  id          TEXT PRIMARY KEY,
  ts          INTEGER NOT NULL,
  level       TEXT NOT NULL,
  source      TEXT NOT NULL,
  message     TEXT NOT NULL,
  project_id  TEXT,
  session_id  TEXT,
  plugin_name TEXT,
  context     TEXT
);

CREATE INDEX IF NOT EXISTS idx_logs_level_ts ON logs(level, ts);
CREATE INDEX IF NOT EXISTS idx_logs_project_ts ON logs(project_id, ts);
CREATE INDEX IF NOT EXISTS idx_logs_session_ts ON logs(session_id, ts);
CREATE INDEX IF NOT EXISTS idx_logs_source_ts ON logs(source, ts);
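
The mapping between the camelCase entry and the snake_case columns can be sketched as follows. This is a minimal illustration, not the storage layer's actual code: LogRow, toRow, and fromRow are hypothetical names, and the only behaviour taken from the spec is that context is JSON-serialised to TEXT and that nullable fields map to NULL.

```typescript
type Level = "debug" | "info" | "warn" | "error";
type Source = "daemon" | "plugin" | "session" | "subagent";

interface OperationalLogEntry {
  id: string;
  ts: number;
  level: Level;
  source: Source;
  message: string;
  projectId: string | null;
  sessionId: string | null;
  pluginName: string | null;
  context: Record<string, unknown> | null;
}

// Hypothetical row type: one field per column of the logs table.
interface LogRow {
  id: string;
  ts: number;
  level: string;
  source: string;
  message: string;
  project_id: string | null;
  session_id: string | null;
  plugin_name: string | null;
  context: string | null;
}

// Entry -> row: rename fields and serialise context to JSON text (or NULL).
function toRow(entry: OperationalLogEntry): LogRow {
  return {
    id: entry.id,
    ts: entry.ts,
    level: entry.level,
    source: entry.source,
    message: entry.message,
    project_id: entry.projectId,
    session_id: entry.sessionId,
    plugin_name: entry.pluginName,
    context: entry.context === null ? null : JSON.stringify(entry.context),
  };
}

// Row -> entry. The casts assume the row was produced by toRow.
function fromRow(row: LogRow): OperationalLogEntry {
  return {
    id: row.id,
    ts: row.ts,
    level: row.level as Level,
    source: row.source as Source,
    message: row.message,
    projectId: row.project_id,
    sessionId: row.session_id,
    pluginName: row.plugin_name,
    context:
      row.context === null
        ? null
        : (JSON.parse(row.context) as Record<string, unknown>),
  };
}
```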

File format

The file sink writes one JSON object per line (NDJSON). Each line is an OperationalLogEntry serialised to JSON with an additional pid field. Example:

{"id":"01JX...","ts":1748700000000,"level":"error","source":"plugin","message":"Failed to preserve messages during compaction","projectId":"my-project","sessionId":"ses_abc","pluginName":"memory-markdown","context":{"compaction_id":"01JX...","error":"PluginCallContext validation failed"},"pid":12345}
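
A minimal sketch of producing and reading such lines, assuming nothing about the real file sink beyond the format described above. toNdjsonLine and parseNdjson are hypothetical helpers; the pid is passed in rather than read from the process so the sketch stays self-contained.

```typescript
// Serialise one entry as a single NDJSON line with the extra pid field.
// JSON.stringify escapes newlines inside strings, so the result is always
// exactly one line even if a message contains "\n".
function toNdjsonLine(entry: Record<string, unknown>, pid: number): string {
  return JSON.stringify({ ...entry, pid }) + "\n";
}

// Parse a chunk of NDJSON text back into objects, skipping blank lines.
function parseNdjson(text: string): Record<string, unknown>[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as Record<string, unknown>);
}
```

Because each line is independent, a reader can process the file with line-oriented tools or a streaming parser without loading the whole file.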

Levels

Level | Numeric priority | When to use | Production default
------|------------------|-------------|-------------------
debug | 0 | Detailed internals: hook firing, tool registration, context resolution, plugin method calls | Off
info | 1 | Normal operational events: startup complete, plugin loaded, session created, compaction completed | Off (the production default level is warn)
warn | 2 | Unexpected but recovered: plugin restart, fallback path taken, config key deprecated | On
error | 3 | Something failed: compaction error, plugin crash, storage write failure, gate failure | On

Minimum level is configurable per-environment (see Configuration).
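
The numeric priorities make the minimum-level check a single comparison. The sketch below is illustrative only; LEVEL_PRIORITY and isEnabled are hypothetical names, not identifiers from @kaged/utils.

```typescript
type Level = "debug" | "info" | "warn" | "error";

// Numeric priorities as listed in the levels table.
const LEVEL_PRIORITY: Record<Level, number> = {
  debug: 0,
  info: 1,
  warn: 2,
  error: 3,
};

// An entry is emitted when its level is at or above the configured minimum.
function isEnabled(level: Level, minLevel: Level): boolean {
  return LEVEL_PRIORITY[level] >= LEVEL_PRIORITY[minLevel];
}
```

With a minimum level of warn, isEnabled("info", "warn") is false and isEnabled("error", "warn") is true.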

Sources

Source | What emits | Has projectId | Has sessionId | Has pluginName
-------|------------|---------------|---------------|---------------
daemon | Daemon core: startup, shutdown, gate failures, config loading, internal errors | Sometimes | Sometimes | Never
plugin | Plugin lifecycle: load, hook fired, tool registered, error. Plugin stderr captures. | Yes (project plugins) | Yes (during hook) | Always
session | Session lifecycle: create, state transitions, compaction, idle, close | Always | Always | Never
subagent | Subagent invocations: spawn, cage setup, exit, errors | Always | Always | Never

The source field directly maps to the UI's LogFilterKind type. The existing audit kind continues to be served by the /api/v1/audit endpoint — audit events are not stored in the logs table.

Dual-sink pipeline

Emitter (any daemon module)
    │
    ▼
 logger.write(entry)          ← single entry point in @kaged/utils
    │
    ├─► SQLite logs table      ← paginated queries from UI
    │
    └─► Rotating flat file     ← survival copy, grep, external shippers

Write semantics

  • SQLite write is synchronous within the write call. If the DB write fails, the error is swallowed (logging failures must never crash the daemon) and the file sink still receives the entry.
  • File write uses the existing @kaged/utils/logger.ts append semantics (O_WRONLY | O_APPEND | O_CREAT).
  • Ordering: SQLite write first, then file write. If SQLite fails, the file still gets the entry. This ensures the survival copy is always written even if the queryable copy fails.
  • Both sinks receive the exact same entry data. No sink-specific filtering — the level filter is applied before the dual write.

Configuration

local.toml gains a [logging] section:

[logging]
# Minimum log level. One of: debug, info, warn, error.
# Default: "warn" in production, "debug" in development.
level = "warn"

# Prune logs older than this many days. Default: 7.
retention_days = 7

# Maximum number of rows in the logs table. Prune oldest when exceeded.
# Default: 10000.
max_entries = 10000

# Override the file log directory. Default: platform-specific
# (Linux: ~/.local/state/kaged/logs, macOS: ~/Library/Logs/kaged).
dir = "/var/log/kaged"

# Mirror log entries to stderr. Default: false in production, true in development.
console = true

All fields are optional. When the [logging] section is absent entirely, environment-based defaults apply:

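Setting | Production (KAGED_ENV != "development") | Development (KAGED_ENV=development, or unset with NODE_ENV=development)
--------|------------------------------------------|-----------------------------------------------------------------------
level | warn | debug
retention_days | 7 | 7
max_entries | 10000 | 50000
console | false | true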

The daemon reads this config on boot and passes it to @kaged/utils/logger.ts's configure() function. Config changes require a daemon restart — no hot-reload of logging config in v0.
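
One way to resolve the effective settings is sketched below. It is not the daemon's actual code: resolveLogging and isDevelopment are hypothetical names, the environment is passed in as a plain object, and the per-field fallback (a partially filled [logging] section takes defaults for the missing keys) is an assumption, since the spec only states the defaults for a fully absent section.

```typescript
type Level = "debug" | "info" | "warn" | "error";

interface LoggingConfig {
  level?: Level;
  retention_days?: number;
  max_entries?: number;
  dir?: string;
  console?: boolean;
}

interface ResolvedLogging {
  level: Level;
  retention_days: number;
  max_entries: number;
  dir: string | null; // null: use the platform-specific default directory
  console: boolean;
}

type Env = Record<string, string | undefined>;

// Development means KAGED_ENV=development, or KAGED_ENV unset with
// NODE_ENV=development. Anything else is treated as production.
function isDevelopment(env: Env): boolean {
  const kagedEnv = env["KAGED_ENV"];
  if (kagedEnv !== undefined) return kagedEnv === "development";
  return env["NODE_ENV"] === "development";
}

// Explicit config values win; missing ones fall back to environment defaults.
function resolveLogging(cfg: LoggingConfig | undefined, env: Env): ResolvedLogging {
  const dev = isDevelopment(env);
  return {
    level: cfg?.level ?? (dev ? "debug" : "warn"),
    retention_days: cfg?.retention_days ?? 7,
    max_entries: cfg?.max_entries ?? (dev ? 50_000 : 10_000),
    dir: cfg?.dir ?? null,
    console: cfg?.console ?? dev,
  };
}
```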

local-config schema addition

LocalConfigSchema in packages/local-config/src/schema.ts gains:

export const LoggingSchema = z.object({
  level: z.enum(["debug", "info", "warn", "error"]).optional(),
  retention_days: z.number().int().min(1).max(365).optional(),
  max_entries: z.number().int().min(100).max(1_000_000).optional(),
  dir: z.string().optional(),
  console: z.boolean().optional(),
});

// Added to LocalConfigSchema:
logging: LoggingSchema.optional(),

HTTP API

GET /api/v1/logs

Global daemon logs — unscoped. Returns the most recent entries across all projects and sessions.

GET /api/v1/projects/:id/logs

Project-scoped logs. Filters to entries where project_id = :id.

GET /api/v1/sessions/:id/logs

Session-scoped logs. Filters to entries where session_id = :id.

Query parameters (all endpoints)

Parameter | Type | Default | Description
----------|------|---------|------------
level | string | (none) | Filter to this level and above. One of: debug, info, warn, error.
source | string | (none) | Filter to a single source. One of: daemon, plugin, session, subagent.
since | integer | (none) | Only entries with ts >= since (epoch ms).
until | integer | (none) | Only entries with ts <= until (epoch ms).
q | string | (none) | Case-insensitive substring search on message. Server-side LIKE query.
limit | integer | 100 | Maximum entries to return. Range: 1–500.
cursor | string | (none) | ULID-based pagination cursor. Entries with ts < cursor_ts (or same ts but id < cursor_id).

Response shape

{
  "entries": [
    {
      "id": "01JX...",
      "ts": 1748700000000,
      "level": "error",
      "source": "plugin",
      "message": "Failed to preserve messages during compaction",
      "projectId": "my-project",
      "sessionId": "ses_abc",
      "pluginName": "memory-markdown",
      "context": { "compaction_id": "01JX...", "error": "..." }
    }
  ],
  "cursor": "01JX...",
  "hasMore": true
}
  • entries: most recent first (descending ts, then descending id).
  • cursor: the id of the last entry returned. Pass as ?cursor=<value> to get the next page.
  • hasMore: true if there are more entries beyond this page.

Cursor semantics

The cursor is the id (ULID) of the last entry in the response. The next page queries WHERE ts < cursor_ts OR (ts = cursor_ts AND id < cursor_id), where cursor_ts and cursor_id are the ts and id of that entry. Ordering on ts first and breaking ties on the time-sortable ULID avoids gaps and duplicates when several entries share a timestamp.
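
The keyset predicate can be illustrated with an in-memory model. This is a sketch of the semantics, not the SQL the daemon runs: nextPage is a hypothetical function, the cursor is passed as an already-resolved (ts, id) pair, and returning a null cursor for an empty page is an assumption.

```typescript
interface Keyed {
  id: string;
  ts: number;
}

interface Page<T> {
  entries: T[];
  cursor: string | null;
  hasMore: boolean;
}

// Newest first: descending ts, then descending id.
function byNewest(a: Keyed, b: Keyed): number {
  if (a.ts !== b.ts) return b.ts - a.ts;
  return a.id < b.id ? 1 : a.id > b.id ? -1 : 0;
}

function nextPage<T extends Keyed>(all: T[], cursor: Keyed | null, limit: number): Page<T> {
  const sorted = [...all].sort(byNewest);
  // Same predicate as the SQL: ts < cursor_ts OR (ts = cursor_ts AND id < cursor_id).
  const eligible =
    cursor === null
      ? sorted
      : sorted.filter(
          (e) => e.ts < cursor.ts || (e.ts === cursor.ts && e.id < cursor.id),
        );
  const entries = eligible.slice(0, limit);
  const last = entries.length > 0 ? entries[entries.length - 1] : undefined;
  return {
    entries,
    cursor: last === undefined ? null : last.id,
    hasMore: eligible.length > entries.length,
  };
}
```

Entries that share a timestamp are split cleanly across pages because the id tiebreak gives every row a unique position in the ordering.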

Error responses

Status | Code | When
-------|------|-----
400 | invalid_parameter | level or source is not a valid value; limit out of range
404 | not_found | Project or session ID does not exist
500 | internal | Unexpected storage failure
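
A possible shape for validating the query parameters and producing the 400 case is sketched below. parseLogQuery, LogQuery, and ParseResult are hypothetical names. Rejecting a malformed since or until value is an assumption: the spec only lists level, source, and limit as causes of invalid_parameter.

```typescript
const LEVELS = ["debug", "info", "warn", "error"] as const;
const SOURCES = ["daemon", "plugin", "session", "subagent"] as const;
type Level = (typeof LEVELS)[number];
type Source = (typeof SOURCES)[number];

interface LogQuery {
  level?: Level;
  source?: Source;
  since?: number;
  until?: number;
  q?: string;
  limit: number;
  cursor?: string;
}

type ParseResult =
  | { ok: true; query: LogQuery }
  | { ok: false; status: 400; code: "invalid_parameter"; param: string };

// Accept only plain non-negative decimal integers.
function parseInteger(raw: string): number | null {
  return /^\d+$/.test(raw) ? Number(raw) : null;
}

function parseLogQuery(params: Record<string, string | undefined>): ParseResult {
  const fail = (param: string): ParseResult => ({
    ok: false, status: 400, code: "invalid_parameter", param,
  });
  const query: LogQuery = { limit: 100 }; // default limit from the table

  const rawLevel = params["level"];
  if (rawLevel !== undefined) {
    const level = LEVELS.find((l) => l === rawLevel);
    if (level === undefined) return fail("level");
    query.level = level;
  }
  const rawSource = params["source"];
  if (rawSource !== undefined) {
    const source = SOURCES.find((s) => s === rawSource);
    if (source === undefined) return fail("source");
    query.source = source;
  }
  const rawLimit = params["limit"];
  if (rawLimit !== undefined) {
    const limit = parseInteger(rawLimit);
    if (limit === null || limit < 1 || limit > 500) return fail("limit");
    query.limit = limit;
  }
  // Assumption: malformed since/until is also rejected with invalid_parameter.
  for (const key of ["since", "until"] as const) {
    const raw = params[key];
    if (raw === undefined) continue;
    const value = parseInteger(raw);
    if (value === null) return fail(key);
    query[key] = value;
  }
  const q = params["q"];
  if (q !== undefined && q !== "") query.q = q;
  const cursor = params["cursor"];
  if (cursor !== undefined && cursor !== "") query.cursor = cursor;
  return { ok: true, query };
}
```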

Plugin logging

Project plugins (subprocess JSON-RPC)

Project plugins may emit structured log entries via a JSON-RPC notification (no id field, no response expected):

{
  "jsonrpc": "2.0",
  "method": "log",
  "params": {
    "level": "error",
    "message": "Failed to preserve messages during compaction",
    "context": {
      "compaction_id": "01JX...",
      "retained_count": 0
    }
  }
}
  • level: one of debug, info, warn, error. If absent or invalid, treated as info.
  • message: required, non-empty string. If absent, the notification is silently dropped.
  • context: optional, JSON object. Arbitrary structured fields.

The daemon writes these to both sinks with:

  • source: "plugin"
  • pluginName: <package name from manifest>
  • projectId and sessionId set from the current hook call context (if the log arrives during a hook invocation) or null (if the plugin logs outside a hook)
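
The validation rules for the notification params can be sketched as a small normaliser. normalizePluginLog is a hypothetical name. Dropping a non-string or empty message, and ignoring a context that is not a plain object, are assumptions that extend the "absent" cases the spec states explicitly.

```typescript
const LEVELS = ["debug", "info", "warn", "error"] as const;
type Level = (typeof LEVELS)[number];

interface NormalizedPluginLog {
  level: Level;
  message: string;
  context: Record<string, unknown> | null;
}

// Returns null when the notification should be silently dropped.
function normalizePluginLog(params: unknown): NormalizedPluginLog | null {
  if (typeof params !== "object" || params === null) return null;
  const p = params as Record<string, unknown>;

  // message: required, non-empty string.
  const message = p["message"];
  if (typeof message !== "string" || message.length === 0) return null;

  // level: absent or invalid values are treated as info.
  const level: Level = LEVELS.find((l) => l === p["level"]) ?? "info";

  // context: optional JSON object; anything else is ignored (assumption).
  const rawContext = p["context"];
  const context =
    typeof rawContext === "object" && rawContext !== null && !Array.isArray(rawContext)
      ? (rawContext as Record<string, unknown>)
      : null;

  return { level, message, context };
}
```

The daemon would then attach source, pluginName, projectId, and sessionId as listed above before handing the entry to the dual-sink write.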

Plugin stderr capture

Per ADR-0008, plugin stderr is captured line-by-line by the daemon. Each line is written as a log entry with:

  • source: "plugin"
  • pluginName: <package name>
  • level: "error" (stderr lines are always treated as errors)
  • message: <the stderr line>
  • context: { capture: "stderr" }
  • projectId and sessionId: set from the plugin's last hook context, or null

Rate-limited: if a plugin emits more than 100 stderr lines in 10 seconds, subsequent lines are dropped and a single warn entry is written: "Plugin <name> stderr rate limit exceeded, N lines dropped".
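
The rate limit can be sketched with a fixed 10-second window per plugin. This is one possible interpretation, not the daemon's implementation: createStderrLimiter is a hypothetical name, the fixed (rather than sliding) window is an assumption, and the warning is emitted when the next window starts because only then is the dropped count N known.

```typescript
interface CapturedLine {
  level: "error" | "warn";
  message: string;
}

// Returns a function that maps each stderr line to the log entries to write.
// The clock is injected (now, in epoch ms) so the behaviour is easy to test.
function createStderrLimiter(pluginName: string, maxLines = 100, windowMs = 10_000) {
  let windowStart = Number.NEGATIVE_INFINITY;
  let count = 0;
  let dropped = 0;

  return function accept(line: string, now: number): CapturedLine[] {
    const out: CapturedLine[] = [];

    // Start a new window; report what the previous window dropped, once.
    if (now - windowStart >= windowMs) {
      if (dropped > 0) {
        out.push({
          level: "warn",
          message: `Plugin ${pluginName} stderr rate limit exceeded, ${dropped} lines dropped`,
        });
      }
      windowStart = now;
      count = 0;
      dropped = 0;
    }

    count += 1;
    if (count > maxLines) {
      dropped += 1; // over the limit: drop silently within this window
    } else {
      out.push({ level: "error", message: line });
    }
    return out;
  };
}
```

A limitation of this sketch is that drops from the final window are only reported if the plugin writes to stderr again; a real implementation might flush the pending count on a timer or at plugin exit.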

System plugins (in-process)

System plugins already receive a PluginLogger in their SystemPluginContext. Its implementation forwards each call to the dual-sink pipeline with:

  • source: "plugin"
  • pluginName: <system plugin name>
  • projectId: null (system plugins are not project-scoped)
  • sessionId: null

Retention enforcement

Retention is enforced at two points:

  1. On daemon boot — a background task (not blocking startup) prunes entries older than retention_days and/or exceeding max_entries. Runs after the HTTP server is listening.
  2. Periodically — every 6 hours, the daemon runs the same prune check. Interval is not configurable in v0.

Prune query (SQLite):

-- Time-based prune
DELETE FROM logs WHERE ts < (strftime('%s','now') * 1000 - :retention_ms);

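-- Count-based prune (if still over max after time prune).
-- MAX(0, ...) guards the under-limit case: SQLite treats a negative LIMIT
-- as "no limit", which would otherwise delete every row.
DELETE FROM logs WHERE id IN (
  SELECT id FROM logs ORDER BY ts ASC, id ASC
  LIMIT MAX(0, (SELECT COUNT(*) FROM logs) - :max_entries)
);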

File sink retention is handled by @kaged/utils/logger.ts's existing pruneOldFiles() — no changes needed.
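
The two-step prune can be modelled in memory to make the intended outcome concrete. This is an illustrative model of the SQL above, not code that runs against the database; prune is a hypothetical function and the current time is passed in explicitly.

```typescript
interface Keyed {
  id: string;
  ts: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns the rows that survive: first drop rows older than the retention
// window, then drop the oldest rows until at most maxEntries remain.
function prune<T extends Keyed>(
  rows: T[],
  now: number,
  retentionDays: number,
  maxEntries: number,
): T[] {
  const cutoff = now - retentionDays * DAY_MS; // corresponds to :retention_ms
  const fresh = rows.filter((r) => r.ts >= cutoff);

  // Oldest first: ascending ts, then ascending id (same order as the SQL).
  const oldestFirst = [...fresh].sort((a, b) => {
    if (a.ts !== b.ts) return a.ts - b.ts;
    return a.id < b.id ? -1 : a.id > b.id ? 1 : 0;
  });

  // Never negative: when under the limit, nothing extra is removed.
  const excess = Math.max(0, oldestFirst.length - maxEntries);
  return oldestFirst.slice(excess);
}
```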

Daemon integration

Logger adoption

The daemon's startup sequence (packages/daemon/src/main.ts) must:

  1. Parse the local.toml logging config (or apply environment defaults).
  2. Call configure() from @kaged/utils/logger.ts with the resolved config.
  3. Use structured logger calls in place of every console.error call in the daemon.

Logger API (daemon-internal)

A thin wrapper in the daemon provides the dual-sink write:

// packages/daemon/src/runtime/logger.ts (new file)
import * as fileLogger from "@kaged/utils/logger";
import { storage } from "./storage-ref";

type Level = "debug" | "info" | "warn" | "error";
type Source = "daemon" | "plugin" | "session" | "subagent";

interface LogOptions {
  source: Source;
  projectId?: string | null;
  sessionId?: string | null;
  pluginName?: string | null;
  context?: Record<string, unknown> | null;
}

function write(level: Level, message: string, opts: LogOptions): void {
  const entry = {
    id: generateUlid(),
    ts: Date.now(),
    level,
    source: opts.source,
    message,
    projectId: opts.projectId ?? null,
    sessionId: opts.sessionId ?? null,
    pluginName: opts.pluginName ?? null,
    context: opts.context ?? null,
  };

  // Sink 1: SQLite (best-effort, never throw)
  try {
    storage.writeLog(entry);
  } catch {}

  // Sink 2: File (via existing logger, best-effort)
  try {
    fileLogger[level](message, { ...entry, pid: process.pid });
  } catch {}
}

The daemon's modules import and call write() instead of console.error(). Convenience shorthands (logDaemon.error(), logSession.info(), etc.) are defined to reduce boilerplate.
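
The convenience shorthands could be built by binding a source to the write function. The sketch below is one possible shape, not the daemon's actual module: forSource and WriteFn are hypothetical names, and the write function is injected so the example stands alone.

```typescript
type Level = "debug" | "info" | "warn" | "error";
type Source = "daemon" | "plugin" | "session" | "subagent";

interface LogOptions {
  source: Source;
  projectId?: string | null;
  sessionId?: string | null;
  pluginName?: string | null;
  context?: Record<string, unknown> | null;
}

type WriteFn = (level: Level, message: string, opts: LogOptions) => void;
type BoundOptions = Omit<LogOptions, "source">;

// Builds { debug, info, warn, error } helpers with the source pre-filled.
function forSource(write: WriteFn, source: Source) {
  const at =
    (level: Level) =>
    (message: string, opts: BoundOptions = {}): void =>
      write(level, message, { ...opts, source });

  return {
    debug: at("debug"),
    info: at("info"),
    warn: at("warn"),
    error: at("error"),
  };
}
```

With this helper, logDaemon could be defined as forSource(write, "daemon") and logSession as forSource(write, "session"), so call sites read logSession.info("Session created", { sessionId }).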

Migration: console.error → structured logger

All 42 console.error calls in packages/daemon/ must be migrated to structured logger calls. Each call becomes:

// Before:
console.error(`Fatal: gate ${failedGate} failed: ${message}`);

// After:
write("error", `Gate ${failedGate} failed: ${message}`, {
  source: "daemon",
  context: { gate: failedGate, gateMessage: message },
});

This is mechanical. The message field is human-readable. Structured data goes into context. The source is set based on which module emits the log.

UI integration

SSE log streaming

Per ADR-0030, live log entries are pushed to the UI via Server-Sent Events when the log drawer is open.

Log drawer data flow

UI LogDrawer opens
    │
    ├─► GET /api/v1/projects/:id/logs?limit=100
    │       │
    │       ▼
    │   Response: { entries, cursor, hasMore }  ← historical backlog
    │
    ├─► EventSource: /api/v1/projects/:id/logs/stream
    │       │
    │       ▼
    │   event: log → { id, ts, level, ... }     ← live entries
    │
    ▼
  Merge: backlog + live, dedup by id
    │
    ▼
  LogStream renders entries (most recent at top)
    │
    ▼ User scrolls to bottom
    │
    ▼
  GET /api/v1/projects/:id/logs?limit=100&cursor=<cursor>
    │
    ▼ Append to entries, update cursor

SSE subscription lifecycle

  1. Drawer opens: fetch backlog via HTTP, open EventSource.
  2. Drawer open: accumulate live entries from SSE, merge with backlog, dedup by id.
  3. Drawer closes: call eventSource.close(). No further SSE traffic.
  4. Reconnect: EventSource auto-reconnects. On reconnect, the stream resumes from the current time (no replay gap-filling — the client fetches missing entries via HTTP if needed).
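
The merge step in the lifecycle above can be sketched as a pure function. mergeEntries is a hypothetical name, and letting the live copy win when both lists contain the same id is an assumption (the two copies should be identical in practice).

```typescript
interface Keyed {
  id: string;
  ts: number;
}

// Merge backlog and live entries, keep one copy per id, newest first.
function mergeEntries<T extends Keyed>(backlog: T[], live: T[]): T[] {
  const byId = new Map<string, T>();
  for (const entry of backlog) byId.set(entry.id, entry);
  for (const entry of live) byId.set(entry.id, entry); // live copy wins on duplicate id

  return Array.from(byId.values()).sort((a, b) => {
    if (a.ts !== b.ts) return b.ts - a.ts; // descending ts
    return a.id < b.id ? 1 : a.id > b.id ? -1 : 0; // then descending id
  });
}
```

Deduplicating by id matters because an entry written between the backlog fetch and the EventSource connection can arrive through both channels.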

Subscriber registry (daemon-internal)

The daemon's logger module maintains an in-memory subscriber list:

interface LogSubscriber {
  projectId: string;
  level?: string;
  source?: string;
  callback: (entry: OperationalLogEntry) => void;
}

The write() function fans out to matching subscribers after writing to both sinks. Filtering is in-memory (level + source match). The SSE handler creates a subscriber on connection and removes it on disconnect.
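
A minimal sketch of such a registry follows. createSubscriberRegistry is a hypothetical name, the entry type is trimmed to the fields needed for matching, and two points are assumptions: the level filter means "this level and above" (mirroring the HTTP level parameter), and a throwing subscriber callback is swallowed so it cannot break logging.

```typescript
type Level = "debug" | "info" | "warn" | "error";
type Source = "daemon" | "plugin" | "session" | "subagent";

const PRIORITY: Record<Level, number> = { debug: 0, info: 1, warn: 2, error: 3 };

interface Entry {
  id: string;
  level: Level;
  source: Source;
  projectId: string | null;
}

interface LogSubscriber {
  projectId: string;
  level?: Level;
  source?: Source;
  callback: (entry: Entry) => void;
}

function createSubscriberRegistry() {
  const subscribers = new Set<LogSubscriber>();

  function matches(sub: LogSubscriber, entry: Entry): boolean {
    if (entry.projectId !== sub.projectId) return false;
    if (sub.source !== undefined && entry.source !== sub.source) return false;
    if (sub.level !== undefined && PRIORITY[entry.level] < PRIORITY[sub.level]) return false;
    return true;
  }

  return {
    // Called by the SSE handler on connect; the returned function unsubscribes.
    subscribe(sub: LogSubscriber): () => void {
      subscribers.add(sub);
      return () => {
        subscribers.delete(sub);
      };
    },
    // Called by write() after both sinks have been attempted.
    publish(entry: Entry): void {
      subscribers.forEach((sub) => {
        if (!matches(sub, entry)) return;
        try {
          sub.callback(entry);
        } catch {
          // A failing subscriber must not affect logging or other subscribers.
        }
      });
    },
  };
}
```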

LogEntry type alignment

The existing UI LogEntry type gains fields to match the API response:

// Updated packages/ui/src/components/log/types.ts
export type LogFilterKind = "daemon" | "plugin" | "session" | "subagent" | "audit";

export interface LogEntry {
  id: string;
  ts: number;
  level: "debug" | "info" | "warn" | "error";
  kind: LogFilterKind;       // renamed from source for UI compat
  message: string;
  projectId: string | null;
  sessionId: string | null;
  pluginName: string | null;
  context: Record<string, unknown> | null;
}

LogFilterKind gains "plugin" as a first-class source (previously plugin logs were lumped under "daemon").

Filter chips

Updated filter set: daemon, plugin, session, subagent, audit. The audit chip queries /api/v1/audit (existing endpoint) and merges results into the same timeline by timestamp. The other chips query the /logs endpoints.

String search

The search input sends q=<query> to the server. The daemon performs a case-insensitive LIKE '%query%' on the message column. No client-side filtering for search — all filtering is server-side to keep large log sets efficient.
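
Because % and _ are wildcards in LIKE, a literal substring search needs the user's query escaped before it is bound as a parameter. The sketch below assumes the daemon would pair it with an ESCAPE clause; likePattern is a hypothetical helper and the choice of backslash as the escape character is an assumption. Note also that SQLite's built-in LIKE is case-insensitive only for ASCII characters by default.

```typescript
// Escape LIKE wildcards (and the escape character itself), then wrap the
// query in % so it matches as a substring.
// Intended for use with: WHERE message LIKE :pattern ESCAPE '\'
function likePattern(query: string): string {
  const escaped = query.replace(/[\\%_]/g, (ch) => "\\" + ch);
  return `%${escaped}%`;
}
```

Binding the result as a parameter, rather than interpolating it into the SQL string, keeps the search safe from injection as well as from accidental wildcard matches.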

Testing notes

  • Unit tests: the write() function is testable by mocking the storage layer and file logger. Verify correct sink routing, level filtering, and entry shape.
  • Integration tests: boot daemon → emit log → query /api/v1/logs → verify entry appears with correct fields. Test pagination, filtering, cursor behavior.
  • Retention tests: insert N entries, run prune, verify count constraint and age constraint.
  • Plugin log tests: send a log JSON-RPC notification → verify it appears in both sinks.
  • Plugin stderr tests: write to plugin stderr → verify captured as source: "plugin", context.capture: "stderr".
  • Rate limit tests: exceed 100 stderr lines in 10s → verify rate-limit warning entry.

Open questions

  • FTS5 for search: SQLite FTS5 would enable faster full-text search on message and context. For v0, LIKE is sufficient. If log volumes grow, FTS5 can be added as an index migration.
  • Log export endpoint: should there be a GET /api/v1/logs/export that dumps logs as NDJSON for external consumption? Useful but not required for v0.

  • Real-time log streaming (resolved): the original question was whether the log drawer should auto-update via WebSocket. ADR-0030 resolves it: log streaming uses Server-Sent Events, not WebSocket. See the SSE endpoint in http-api.md.

References