Pillar guide
Continuous learning agents
Continuous learning agents improve on the harness layer (prompts, tools, workflows, memory, evals) using production signal. The learnings live outside the model weights and carry across any model.
In one paragraph
The shortest accurate definition
A continuous learning agent is an AI system whose harness layer (prompts, tools, workflows, retrieval, memory, evals) is systematically updated using production signal. What it learns is stored outside the model weights, so it carries over to whichever model you swap in.
Most teams reach for fine-tuning when their agent regresses. They shouldn't, at least not first. The model weights are usually not the bottleneck; the harness is. Prompts drift out of alignment with how users actually phrase intents. Tool schemas decay as downstream APIs evolve. Retrieval indices stop covering newly emerging intent clusters. The agent's behavior in production has more to do with the harness than with which model runs underneath it. Continuous learning, done right, is the operating discipline that keeps the harness layer current: model-agnostic, inspectable, and reversible.
Pattern
Why the harness, not the weights
Updating the model weights is a heavy, opinionated, hard-to-reverse change. Updating the harness is the opposite: fast, inspectable, and reversible. A week of harness edits often improves an agent more than a month of fine-tuning, and the improvement is portable across models.
- Portable: harness-layer learnings apply when you swap the underlying model.
- Inspectable: changes are prompts, tools, schemas, retrieval indices — human-readable artifacts, not gradient updates.
- Reversible: a regression is undone by reverting a harness state change, not by rolling back a model checkpoint.
- Per-user: the same harness can adapt to each user without retraining a shared model.
- Continuous: ship in minutes, not in quarters.
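As a rough illustration of "inspectable" and "reversible", the sketch below treats the harness as a list of plain, human-readable snapshots where a rollback is just another state change. The `HarnessState` and `HarnessHistory` names and their fields are hypothetical, chosen for this example only; a real harness would track more (retrieval indices, workflows, evals).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class HarnessState:
    """One human-readable snapshot of the harness (hypothetical schema)."""
    system_prompt: str
    tool_names: tuple = ()
    note: str = ""


@dataclass
class HarnessHistory:
    """Append-only list of snapshots; the last one is live."""
    versions: list = field(default_factory=list)

    def ship(self, state: HarnessState) -> int:
        """Make `state` live and return its version number."""
        self.versions.append(state)
        return len(self.versions) - 1

    @property
    def live(self) -> HarnessState:
        return self.versions[-1]

    def rollback(self, version: int) -> HarnessState:
        """Re-ship an earlier snapshot so the revert itself is recorded."""
        self.ship(self.versions[version])
        return self.live


history = HarnessHistory()
v0 = history.ship(HarnessState("You are a support agent.", ("search_orders",), "baseline"))
v1 = history.ship(HarnessState("You are a terse support agent.", ("search_orders",), "shorter replies"))

# Suppose v1 regresses a cluster: reverting is a state change, not a retraining job.
history.rollback(v0)
print(history.live.note)  # prints "baseline"
```

Because the history is append-only, the rollback shows up as a third version rather than erasing the second, which keeps the audit trail intact.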
The three signals every harness-layer loop needs
A learning loop that lacks any of these signals collapses into anecdote. The signal triplet is what makes production data actionable on the harness layer.
- Intent: what users are actually trying to do, captured as a live hierarchical taxonomy across the full conversation population.
- Behavior: where the agent breaks down at the trajectory level — tool call failures, schema drift, context loss, agent laziness, hallucinations, reasoning loops, goal drift.
- Frustration with attribution: trigger turn, trajectory, affected goal, and an agent counterfactual, routed to the layer that needs to change: a harness layer (prompt, tool, workflow, context, memory, or eval) or, as a last resort, the model itself.
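One way to picture the signal triplet is as a single record that carries intent, behavior, and attributed frustration together. The sketch below is a minimal, assumed schema: `FrustrationEvent`, `Layer`, `route`, and the example labels are illustrative names, not a documented format.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    PROMPT = "prompt"
    TOOL = "tool"
    WORKFLOW = "workflow"
    CONTEXT = "context"
    MEMORY = "memory"
    EVAL = "eval"
    MODEL = "model"


@dataclass(frozen=True)
class FrustrationEvent:
    conversation_id: str
    intent: str          # leaf of the intent taxonomy, e.g. "billing/refund"
    behavior: str        # failure label, e.g. "tool_call_failure"
    trigger_turn: int    # turn index where frustration was detected
    affected_goal: str
    counterfactual: str  # what the agent should have done instead
    layer: Layer         # where the fix belongs


def route(events):
    """Count events per (layer, intent), busiest queue first."""
    return Counter((e.layer, e.intent) for e in events).most_common()


events = [
    FrustrationEvent("c1", "billing/refund", "tool_call_failure", 3,
                     "get refund", "retry with the order id", Layer.TOOL),
    FrustrationEvent("c2", "billing/refund", "schema_drift", 2,
                     "get refund", "use the current refund schema", Layer.TOOL),
    FrustrationEvent("c3", "account/login", "goal_drift", 5,
                     "reset password", "stay on the reset flow", Layer.PROMPT),
]

queue = route(events)
print(queue[0])  # the (layer, intent) pair with the most events and its count
```

Grouping by layer and intent together is a design choice for this sketch: it points at both what to change and which users the change serves.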
Common harness-layer update patterns
Continuous learning rarely starts with model fine-tuning. The fastest loops use the cheapest harness edit that moves the metric.
- Prompt edits seeded from clustered failure exemplars — the highest-frequency loop in most production teams.
- Tool schema and routing changes driven by detected tool call failures and schema drift.
- Retrieval index expansion to cover under-served intent clusters.
- Workflow restructuring when agent path analysis reveals loops or premature handoffs.
- Eval-set regeneration from intent clusters so eval coverage tracks production reality.
- Per-user context state in the harness — learnings stored outside model weights, applied at request time.
- Model fine-tunes reserved for stable, well-evaluated patterns where lighter harness edits have been exhausted.
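The "cheapest edit that moves the metric" idea can be sketched as a ladder walked from the cheapest rung upward. The ordering in `LADDER`, the `min_lift` threshold, and the `measure_lift` callback are all assumptions for illustration; in practice the lift would come from an offline eval run, not a lookup table.

```python
from typing import Callable, Optional

# Candidate edits ordered from cheapest to most expensive (assumed ordering).
LADDER = [
    "prompt_edit",
    "tool_schema_change",
    "retrieval_expansion",
    "workflow_restructure",
    "fine_tune",
]


def cheapest_effective_edit(
    measure_lift: Callable[[str], float],
    min_lift: float = 0.02,
) -> Optional[str]:
    """Return the first edit on the ladder whose measured lift clears the bar."""
    for edit in LADDER:
        if measure_lift(edit) >= min_lift:
            return edit
    return None  # nothing moved the metric; revisit the diagnosis


# Hypothetical lifts measured on an eval set for one failure cluster.
offline_lift = {
    "prompt_edit": 0.005,
    "tool_schema_change": 0.04,
    "retrieval_expansion": 0.03,
    "workflow_restructure": 0.05,
    "fine_tune": 0.06,
}

choice = cheapest_effective_edit(lambda edit: offline_lift[edit])
print(choice)  # prints "tool_schema_change"
```

Note that the function stops at the first rung that works even though later rungs show a larger lift, which mirrors the preference for light edits before fine-tuning.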
Metrics that actually move
Vanity metrics — average sentiment, session length, generic CSAT — do not reflect the health of a harness-layer learning loop. The metrics below do.
- Intent coverage: share of production conversations matched by a known intent cluster.
- Emergent intent rate: how quickly new user intents appear and how quickly they get covered.
- Behavioral failure rate: incidents per 100 conversations across the failure taxonomy.
- Frustration share: share of conversations with detected frustration events.
- Time-to-fix: median time from failure detection to a shipped harness edit.
- Regression rate: share of harness edits that improved one cluster at the cost of another.
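Most of the metrics above reduce to simple ratios over conversation records. The sketch below assumes a minimal `Conversation` record and hand-supplied fix times and edit counts, all hypothetical. Emergent intent rate is left out because it needs timestamps per intent cluster.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional


@dataclass
class Conversation:
    intent: Optional[str]  # None when no known intent cluster matched
    failures: int          # behavioral failure incidents detected
    frustrated: bool       # at least one frustration event detected


def loop_metrics(conversations, fix_hours, edits_total, edits_regressed):
    """Compute loop health metrics from labeled conversations and edit logs."""
    n = len(conversations)
    return {
        "intent_coverage": sum(c.intent is not None for c in conversations) / n,
        "failures_per_100": 100 * sum(c.failures for c in conversations) / n,
        "frustration_share": sum(c.frustrated for c in conversations) / n,
        "time_to_fix_hours": median(fix_hours),
        "regression_rate": edits_regressed / edits_total,
    }


sample = [
    Conversation("billing/refund", 0, False),
    Conversation("billing/refund", 2, True),
    Conversation(None, 1, True),
    Conversation("account/login", 0, False),
]

metrics = loop_metrics(sample, fix_hours=[4, 30, 12], edits_total=10, edits_regressed=1)
print(metrics["intent_coverage"])   # prints 0.75
print(metrics["failures_per_100"])  # prints 75.0
```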
Failure modes to plan for
Every learning loop has failure modes. Knowing the canonical ones up front buys back months of debugging.
- Reaching for fine-tuning first when a harness edit would have worked.
- Operational catastrophic forgetting: a prompt edit that improves intent A regresses intent B.
- Reward hacking: optimizing for thumbs-up ratings or session length instead of true task completion.
- Eval drift: the eval set stops resembling production traffic.
- Closed-loop bias: only learning from users who complain, ignoring the silent majority that abandons.
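Operational catastrophic forgetting is the easiest of these to guard against mechanically: compare per-cluster eval pass rates before and after a candidate edit, and block the edit if any cluster drops. The sketch below assumes pass rates are already computed per intent cluster. The `regressed_clusters` helper and the 0.02 tolerance are illustrative choices.

```python
def regressed_clusters(before, after, tolerance=0.02):
    """Return intent clusters whose pass rate dropped by more than `tolerance`.

    A cluster missing from `after` is treated as a pass rate of 0.0,
    so lost eval coverage also counts as a regression.
    """
    return sorted(
        cluster
        for cluster, base in before.items()
        if base - after.get(cluster, 0.0) > tolerance
    )


# Hypothetical pass rates per intent cluster, before and after a prompt edit.
before = {"billing/refund": 0.70, "account/login": 0.90, "shipping/status": 0.85}
after = {"billing/refund": 0.82, "account/login": 0.81, "shipping/status": 0.85}

blocked = regressed_clusters(before, after)
print(blocked)  # prints ['account/login']
```

Here the edit helps refunds but hurts logins, so it would be held back even though the average pass rate barely moved.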
Frequently asked questions
What is the harness layer?
The harness layer is everything around the model call that shapes agent behavior — system prompts, tool definitions and routing, workflow orchestration, retrieval indices, memory state, evals, and guardrails. Most agent failures live here, not in the model itself. Improving the harness is fast, inspectable, and reversible; updating the model weights is none of those things.
What's the difference between continuous learning and fine-tuning?
Fine-tuning updates the model's weights. Continuous learning on the harness updates everything around the model: prompts, tools, workflows, context, memory, evals. A production agent can learn continuously this way for months or quarters before any weights move. The reason: harness edits are model-agnostic and adaptable per user; weight updates are neither.
Why keep learnings outside the model weights?
Portability and reversibility. Learnings outside the weights apply when you swap models, adapt per user without retraining, are human-readable, and revert cleanly when a regression appears. Learnings baked into weights are the opposite — model-specific, shared across users, opaque, and hard to roll back.
How do I measure whether continuous learning is working?
Track intent coverage, emergent intent rate, behavioral failure rate, and frustration share over time. If those numbers improve while time-to-fix shortens, the loop is real. If they oscillate while sentiment scores improve, you are probably reward-hacking.
Where does Moda fit?
Moda provides self-improvement for AI agents on the harness layer. We turn production conversations into signal (intent map, emergent intents, behavioral failures, and frustration root cause with agent counterfactual) and attribute each event to a specific layer of the harness so your team knows what to change next. The improvements live outside the model weights, which is why they apply to any model you run underneath.
See continuous learning agents on your traffic.
Moda turns production conversations into the production signal these loops need: intent clusters, behavioral failure exemplars, frustration root causes.