Wire Format and Codec System
The wire package (internal/wire/) provides a pluggable codec registry that encodes and decodes the graph payloads produced by context packing, MCP tools, and the export CLI. Core GCF types and encoding are provided by the standalone gcf-go library; the wire package re-exports them and adds knowing-specific codecs (binary, JSON) via the registry. Three built-in codecs serve different layers of the system; additional codecs can be registered at runtime.
Codec Registry
The registry is a thread-safe map of named codecs. Each codec implements an Encoder (Payload to string) and a Decoder (string to Payload). The public API:
| Function | Purpose |
|---|---|
wire.Register(codec) |
Add a codec to the registry (panics on duplicate name) |
wire.EncodeWith(name, payload) |
Encode a payload using the named codec |
wire.DecodeWith(name, input) |
Decode a string back into a payload using the named codec |
wire.Get(name) |
Retrieve a codec by name |
wire.List() |
Return all registered codecs (sorted) |
wire.ListNames() |
Return comma-separated list of registered codec names |
Built-In Codecs
| Codec | Format | Use Case | Savings |
|---|---|---|---|
| gcf (Graph Compact Format) | Text, graph-native line protocol | Agent/LLM consumption. Token-optimized with structured delimiters. | 84.0% token savings vs JSON (median) |
| gcb (Graph Compact Binary) | Varint + length-prefixed binary | Daemon IPC, caching, transport between services. Magic header GCB1, version byte, packed symbols and edges. |
74.1% byte savings vs JSON (median) |
| json | Standard JSON | Human/debug use, compatibility baseline. Maximum readability, verbose. | (baseline) |
Layered Architecture
The three codecs map to distinct system layers:
┌──────────────────────────────────────────────────────┐
│ Agent / LLM Context Window │
│ Format: GCF (text, token-efficient, 84% savings) │
├──────────────────────────────────────────────────────┤
│ Daemon IPC / Computation Cache / Storage │
│ Format: GCB (compact binary, fast parse, 74%) │
├──────────────────────────────────────────────────────┤
│ Human Debugging / Export CLI / Tests │
│ Format: JSON (readable, compatible) │
└──────────────────────────────────────────────────────┘
- GCF is the default for MCP tool responses and context packing output. It minimizes token consumption inside LLM context windows while remaining plain-text parseable.
- GCB is used for daemon-to-daemon communication and the content-addressed computation cache. Its varint+length-prefixed layout avoids parsing overhead and produces compact byte streams.
- JSON serves as the compatibility baseline for
knowing export, debugging, and integration with external systems that expect standard serialization.
GCF Session Deduplication
The MCP server maintains a per-connection gcf.Session (from gcf-go) that tracks which symbols have already been transmitted to the client. On subsequent GCF responses within the same connection, previously-sent nodes are emitted as bare references (@N # previously transmitted) rather than complete symbol records. gcf.EncodeWithSession partitions symbols into new (full declaration) and known (bare reference) before encoding.
GCF Delta Encoding
When the agent sends a pack_root from a prior call and the current result differs, the server computes a structural diff (internal/context/delta.go) and returns only what changed via gcf.EncodeDelta (from gcf-go). The delta format uses ## removed, ## added, ## edges_removed, ## edges_added sections. A 60% threshold ensures delta is only used when it saves meaningfully over full retransmission.
Benchmark (session 27, bench/delta-packing/): 81.2% token savings at 96.6% symbol overlap on re-query scenarios. See docs/architecture/context-packing.md for full protocol.
Three-Level Token Savings Stack
| Level | What it does | Savings | When |
|---|---|---|---|
| GCF baseline | Compact line protocol vs JSON | 84% | Every response |
| Session dedup | Bare references for previously-sent symbols | Additional ~47% on repeats | Multi-turn conversations |
| Delta encoding | Only added/removed symbols transmitted | 81% on re-queries | Same task, pack changed |
Binary Wire Layout
[magic:4][version:1][header][symbols...][edges...]
Header: tool(str) tokens_used(varint) token_budget(varint) num_symbols(varint) num_edges(varint)
Symbol: qname(str) kind(uint8) score(float32) provenance(uint8) distance(uint8) signature(str) components(4xfloat32)
Edge: source_idx(varint) target_idx(varint) edge_type(uint8) status(uint8)
Symbols are indexed by position; edges reference symbols by their zero-based index, avoiding repeated string encoding.
Core Types
Payload
The Payload struct (defined in gcf-go, re-exported by internal/wire/gcf.go) is the universal input/output for all codecs:
type Payload struct {
Tool string // MCP tool name (e.g., "context_for_task")
TokensUsed int // actual tokens consumed
TokenBudget int // requested budget
PackRoot string // content-addressed identity (64-char hex hash)
Symbols []Symbol
Edges []Edge
}
Symbol
Each symbol carries its qualified name, kind, relevance score, provenance tier, graph distance from seeds, optional signature, and score component breakdown:
type Symbol struct {
QualifiedName string
Kind string // function, type, method, interface, etc.
Score float64
Provenance string // lsp_resolved, ast_inferred, etc.
Distance int // 0=target, 1=related, 2+=extended
Signature string
Components Components // BlastRadius, Confidence, Recency, Distance
}
Edge
Edges reference symbols by qualified name. The Status field supports diff responses:
type Edge struct {
Source string // qualified name of source symbol
Target string // qualified name of target symbol
EdgeType string // calls, implements, imports, etc.
Status string // "added", "removed", "unchanged" (for diff responses)
}
DeltaPayload
Used by EncodeDelta for incremental context delivery:
type DeltaPayload struct {
Tool string
BaseRoot string // pack_root the agent has
NewRoot string // pack_root of the current result
Removed []Symbol
Added []Symbol
RemovedEdges []Edge
AddedEdges []Edge
DeltaTokens int
FullTokens int
}
Bridge: ContextBlock to Payload
FromContextBlock (internal/wire/bridge.go) converts the internal ContextBlock (from the context engine) into a wire Payload. If the block already has edges, those are used directly. Otherwise, edges between included symbols are discovered from the store via EdgesFrom queries. This bridge is the boundary between the retrieval layer and the wire layer.
Benchmark Harness
The bench/wire-format/ directory contains a benchmark suite that measures encoding size, token count, and round-trip fidelity across six fixture cases in cases/:
| Fixture | Scenario |
|---|---|
cases/01_context_for_task_small.yaml |
Small task context (few symbols) |
cases/02_context_for_task_medium.yaml |
Medium task context (typical agent query) |
cases/03_context_for_files.yaml |
File-based blast radius expansion |
cases/04_blast_radius.yaml |
Full blast radius output |
cases/05_semantic_diff.yaml |
PR semantic diff payload |
cases/06_graph_query.yaml |
Raw graph query result |
Run benchmarks with GOWORK=off go test -bench=. ./bench/wire-format/.
Results tracked in:
- bench/wire-format/scorecard.md: savings ratios against JSON baseline
- bench/wire-format/FINDINGS.md: detailed per-case analysis with interpretation
Latest results: GCF 84.0% median token savings, GCB 74.1% median byte savings.
Source Files
| File | Purpose |
|---|---|
gcf-go |
Standalone GCF library: Payload/Symbol/Edge/Components types, Encode, Decode, Session, EncodeWithSession, DeltaPayload, EncodeDelta |
gcf spec |
GCF specification v1.0: grammar, encoding rules, session statefulness, delta extension |
internal/wire/gcf.go |
Type aliases and delegating wrappers re-exporting gcf-go for backward compatibility |
internal/wire/binary.go |
GCB binary encoder/decoder, varint layout, kind/provenance/edge-type ID maps |
internal/wire/json.go |
JSON encoder/decoder (compatibility baseline) |
internal/wire/registry.go |
Codec registry (Register, Get, List, EncodeWith, DecodeWith) |
internal/wire/bridge.go |
FromContextBlock: converts ContextBlock to wire Payload with edge discovery |
bench/wire-format/bench_test.go |
Encoding size, token count, and round-trip benchmarks |
bench/wire-format/scorecard.md |
Auto-generated savings scorecard |
bench/wire-format/FINDINGS.md |
Detailed benchmark results and interpretation |