Agent frameworks vs. rolling your own harness: build or buy the loop?
LangGraph, CrewAI, the OpenAI and Claude SDKs — or 200 lines of your own code? A decision framework based on what agent frameworks actually provide.
The agent loop is about 40 lines of code. Agent frameworks are tens of thousands. That gap is either valuable machinery or expensive indirection, and which one depends on your system — not on the framework’s README.
What frameworks actually sell
Strip the marketing and every agent framework bundles five things:
| Capability | Build-it-yourself cost | Notes |
|---|---|---|
| The loop itself | An afternoon | Genuinely trivial — never adopt a framework for this |
| Provider abstraction | Low | A gateway (LiteLLM et al.) does this better anyway |
| State & resumability | High | Durable, replayable multi-step state is real engineering |
| Parallel / graph orchestration | High | Fan-out, joins, retries across agents |
| Observability hooks | Medium | OTel GenAI conventions are closing this gap |
The honest summary: frameworks are cheap substitutes for the two rows that are expensive, and expensive substitutes for the three rows that are cheap.
When rolling your own wins
- One agent, < ~10 tools, request-scoped state. The loop + your language’s stdlib is less code than the framework’s config.
- You need to understand failures deeply — every abstraction layer between you and the model call is a layer you’ll debug through at 2 a.m.
- Tight latency or cost budgets. Frameworks add turns, tokens, and serialization you didn’t choose.
When the framework wins
- Long-running, resumable work — jobs that survive a process restart mid-trajectory. Durable execution is the single strongest reason to buy.
- Genuine multi-agent graphs — parallel branches with joins and per-branch retry policies get ugly fast in hand-rolled code.
- Team scale — five teams hand-rolling five harnesses is five sets of conventions to review; a shared framework is a shared vocabulary.
The trap to avoid either way
Coupling your business logic to the harness. Tools, prompts, and evals should be plain functions and data that would survive a framework swap — the eval harness you build should run against a LangGraph agent and a hand-rolled one identically. Teams that keep that boundary clean can defer this whole decision; teams that don’t have already decided, permanently, by accident.
Was this guide useful?
Thanks — noted. It shapes what gets written next.
newsletter
One practical agentic-AI guide in your inbox. No news, no hype.
Tutorials and decision frameworks as they ship. Unsubscribe anytime.