reviewed quarterly · last review 2026-07

The Agentic AI Radar

This hub publishes evergreen guides, not news; the radar is how we track a fast-moving field anyway. Each quarter, every item is re-argued, then kept, moved, or retired, and new items are added. Not a leaderboard; an opinionated map with reasons.

Adopt: Proven. Use it for real work today.

MCP for tool connectivity · protocols

The de-facto connector standard; broad client support across vendors.

Golden-task evals in CI · practices

The single highest-leverage practice for agent reliability.

LLM gateways (routing, budgets, keys) · infrastructure

One choke point for cost, access, and fallback across providers.

OTel GenAI semantic conventions · infrastructure

Vendor-neutral agent telemetry; stabilized and widely implemented.

Trajectory logging · practices

If you cannot replay what the agent did, you cannot debug or defend it.

Structured outputs / schema-constrained calls · practices

Eliminates a whole class of parse-and-pray failures between steps.
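
To make the structured-outputs item concrete, here is a minimal sketch of validating a model's output at the boundary between two agent steps. It is boundary validation in plain Python, not provider-side constrained decoding, and every name in it (`ToolCall`, `SchemaError`, `parse_tool_call`, the `tool`/`arguments` fields) is a hypothetical chosen for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical shape that one agent step hands to the next."""
    tool: str
    arguments: dict


class SchemaError(ValueError):
    """Raised when model output does not match the expected shape."""


def parse_tool_call(raw: str, allowed_tools: set[str]) -> ToolCall:
    """Parse and validate raw model output; fail loudly instead of guessing."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise SchemaError(f"not valid JSON: {exc}") from exc

    if not isinstance(data, dict):
        raise SchemaError("top-level value must be an object")

    # Reject fields the schema does not know about rather than ignoring them.
    extra = set(data) - {"tool", "arguments"}
    if extra:
        raise SchemaError(f"unexpected fields: {sorted(extra)}")

    tool = data.get("tool")
    if not isinstance(tool, str) or tool not in allowed_tools:
        raise SchemaError(f"unknown tool: {tool!r}")

    arguments = data.get("arguments")
    if not isinstance(arguments, dict):
        raise SchemaError("arguments must be an object")

    return ToolCall(tool=tool, arguments=arguments)
```

The point of the sketch is that the next step receives either a typed `ToolCall` or an explicit `SchemaError`, never a half-parsed string. In practice a schema library or the provider's own structured-output mode would likely replace the hand-written checks.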

Trial: Promising. Use it on projects that can absorb change.

A2A for cross-org agent delegation · protocols

Real momentum, still-consolidating semantics; keep adapters thin.

AG-UI / agent-native frontends · protocols

Streaming agent state to UIs beats chat-only UX; standards still settling.

Durable-execution agent frameworks · frameworks

Resumable long-running trajectories are the strongest reason to adopt a framework.

Local serving for agent workloads · models

llama.cpp/vLLM-class serving is production-grade; model capability is the constraint.

Computer-use agents for narrow workflows · practices

Works for constrained, verifiable tasks; supervise anything open-ended.

LLM-as-judge with human calibration · practices

Scales fuzzy evaluation, provided you routinely audit the judge against humans.
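
One way to audit a judge against humans, sketched under assumptions: label the same sample with both, then compute raw agreement and Cohen's kappa (agreement corrected for chance). The function names and the 0.6 threshold below are illustrative assumptions, not a recommendation from this radar.

```python
from collections import Counter


def cohen_kappa(judge: list[str], human: list[str]) -> float:
    """Chance-corrected agreement between judge labels and human labels."""
    if len(judge) != len(human) or not judge:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(judge)

    # Observed agreement: fraction of items where both raters match.
    observed = sum(j == h for j, h in zip(judge, human)) / n

    # Expected agreement if each rater labeled independently at their own
    # marginal rates.
    judge_freq = Counter(judge)
    human_freq = Counter(human)
    expected = sum(judge_freq[label] * human_freq[label]
                   for label in judge_freq) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout; treat as agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def needs_recalibration(judge: list[str], human: list[str],
                        threshold: float = 0.6) -> bool:
    """Hypothetical gate: flag the judge when kappa drops below a threshold."""
    return cohen_kappa(judge, human) < threshold
```

For example, with judge labels `["pass", "pass", "fail", "fail"]` and human labels `["pass", "fail", "fail", "fail"]`, raw agreement is 0.75 but kappa is 0.5, which this sketch's default threshold would flag for review.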

Assess: Watch it. Prototype if it matches a real need.

Small models as tool-routers · models

Cheap local routing in front of frontier planners; promising cost profile.

Agent memory interchange standards · protocols

Everyone rebuilds memory; portable standards are early but worth watching.

Agentic browsers · frameworks

Powerful demos; security model against injection still being proven.

MCP beyond tools (apps, UI, elicitation) · protocols

The protocol is growing surface area fast; adopt the core, assess the edges.

Hold: Proceed with caution. Known traps at current maturity.

Fine-tuning to inject knowledge · models

Wrong tool for facts: no provenance, no erasure, slow updates. Retrieve instead.

Prompt-only security · practices

"We told it not to" is not a control. Enforce at the tool layer.

Unsupervised writes to systems of record · practices

Irreversible + autonomous + unaudited is how agents make the news.

One mega-agent with every tool · frameworks

Context bloat, contaminated trajectories, unreviewable capability surface.
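
As a rough illustration of "enforce at the tool layer" and of keeping writes supervised and audited, the sketch below puts a gate between the agent and its tools. Everything here is hypothetical (`ToolGate`, `PolicyViolation`, the tool names in the test and usage note); it assumes tools are plain Python callables and that approval is signalled by passing an approver's name.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


class PolicyViolation(PermissionError):
    """Raised when a tool call is denied or still awaits approval."""


@dataclass
class ToolGate:
    """Hypothetical gate sitting between the agent and its tools."""
    tools: dict[str, Callable[..., object]]
    allowed: set[str]                      # small, reviewable capability surface
    needs_approval: set[str] = field(default_factory=set)  # e.g. writes
    audit_log: list[dict] = field(default_factory=list)

    def call(self, name: str, approved_by: Optional[str] = None, **kwargs):
        # Decide in code, independent of whatever the prompt said.
        if name not in self.allowed or name not in self.tools:
            decision = "denied"
        elif name in self.needs_approval and approved_by is None:
            decision = "pending_approval"
        else:
            decision = "allowed"

        # Record every attempt, including refused ones.
        self.audit_log.append({
            "tool": name,
            "args": kwargs,
            "decision": decision,
            "approved_by": approved_by,
        })

        if decision != "allowed":
            raise PolicyViolation(f"{name}: {decision}")
        return self.tools[name](**kwargs)
```

A gate configured with a read-only tool in `allowed` and a write tool in both `allowed` and `needs_approval` would let the agent read freely, refuse the write until a human name is supplied, and reject any tool it was never given, with each attempt recorded in `audit_log`.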