Forge for engineers.
Production-grade agents in 5 minutes, not 2 months.
Autonomous by default. Conversational by nature.
The autonomy model
Forge agents are autonomous by default for routine work: reads, drafts, internal writes, scheduling, computations, web search, file ops, self-notes. They execute without asking. For 10 designated high-risk categories, the agent conversationally confirms with you before executing. The inverse of approval-queue platforms: instead of asking permission for everything, the agent asks permission for nothing routine but checks in before doing anything irreversible.
Categories that trigger conversational confirmation
| Category | Example | Why high-risk |
|---|---|---|
| email_external | Send email to a customer | Once sent, can't unsend |
| slack_channel_post | Post to #deals | Team-wide visibility |
| sms_external | SMS a customer | Intrusive, immediate |
| hubspot_customer_write | Update deal stage | Affects team reporting |
| github_pr_create | Open a PR | Notifies reviewers, runs CI |
| github_push | Push to origin | Triggers CI, deploys |
| code_modification_self | Edit own runtime | Can break next boot |
| spend_over_threshold | Sub-agent at $15 | Token spend control |
| strategic_account_action | Touch top-10 account | Critical relationships |
| quiet_hours_external | Email at 2am | Professionalism |
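
The default rule behind this table is a membership check: routine work runs, listed categories pause for a check-in. The sketch below is illustrative only and is not Forge's gating code. The category names are copied from the table, while `HIGH_RISK_CATEGORIES`, `needs_confirmation`, and the routine category `web_search` are hypothetical names chosen for the example.

```python
# Category names are copied from the table above; everything else here is
# a hypothetical sketch, not Forge's actual gating code.
HIGH_RISK_CATEGORIES = frozenset({
    "email_external",
    "slack_channel_post",
    "sms_external",
    "hubspot_customer_write",
    "github_pr_create",
    "github_push",
    "code_modification_self",
    "spend_over_threshold",
    "strategic_account_action",
    "quiet_hours_external",
})


def needs_confirmation(category):
    """Routine categories run immediately; listed ones pause for a check-in."""
    return category in HIGH_RISK_CATEGORIES


# "web_search" is an assumed label for a routine action.
for category in ("web_search", "email_external", "github_push"):
    action = "confirm first" if needs_confirmation(category) else "execute"
    print(f"{category}: {action}")
```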
What conversational confirmation looks like
Inline in the TUI. Slack DM and SMS confirmation are available when configured (Phase 2).
agent › About to send email
to: sarah@acmecorp.com
subject: Re: pricing follow-up
category: email_external
─── draft ───
Hi Sarah, following up on Tuesday's call.
Attached the term sheet you asked for. Let me know
if Friday 2pm works for the next step.
─────────────
[a]pprove [e]dit [s]kip [t]rust this category
you › a
sent. logged to ~/forge/audit/confirmations.jsonl

Standing orders override this. Trust the agent with a category? Add it to STANDING_ORDERS.md. Want extra gates? Add them too. The 10 categories are defaults, not a cage.
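
One way to picture what happens after you press a key in the prompt above: the choice is mapped to a decision, appended to a JSONL audit log, and turned into an execute or don't-execute result. This is a sketch under stated assumptions. The record fields, the `record_confirmation` function, and the behaviour of the trust option (approve now, remember the category) are hypothetical, since the real log schema and STANDING_ORDERS.md format are not shown here.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path

# Keys shown in the prompt: [a]pprove [e]dit [s]kip [t]rust this category
CHOICES = {"a": "approve", "e": "edit", "s": "skip", "t": "trust"}


def record_confirmation(log_path, category, choice, trusted):
    """Append one decision to a JSONL audit log and report whether to execute.

    The record fields are illustrative; the real log schema may differ.
    """
    decision = CHOICES[choice]
    if decision == "trust":
        # Assumed behaviour: trusting a category approves this action and
        # remembers the category so later actions skip the prompt.
        trusted.add(category)
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "category": category,
        "decision": decision,
    }
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    # In this sketch, "edit" and "skip" do not execute the action.
    return decision in {"approve", "trust"}


# Usage: write to a temporary directory instead of ~/forge/audit/.
log_file = Path(tempfile.mkdtemp()) / "audit" / "confirmations.jsonl"
trusted_categories = set()
executed = record_confirmation(log_file, "email_external", "a", trusted_categories)
print(executed, log_file.read_text(encoding="utf-8").strip())
```

An append-only JSONL file keeps each decision as one self-contained line, which makes the log easy to tail or grep.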
The 30-second pitch
- 32 first-class capabilities, all wired by default. No glue code, no per-tool registration boilerplate.
- 96 tools per agent (43 built-in + 53 capability sub-tools), uniformly typed and discoverable.
- 325 passing tests, CI-enforced tool/runtime alignment. test_capability_docs_runtime_audit.py fails the build if a capability declares tools that don't register.
- 3 deployment modes: Solo Local (--tui), Headless Gateway (--gateway), Bare REPL.
- Self-modifying agents: code_create / code_edit / code_rollback with install-dir sandbox and Python-syntax auto-rollback.
- Cross-platform: macOS, Linux, Windows verified end-to-end (Dan ran macOS, tdemm ran Windows, same agent).
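
The tool/runtime alignment check mentioned above can be pictured as a set difference between what each capability declares and what the registry holds. The sketch below is not the contents of test_capability_docs_runtime_audit.py. The data structures, the `find_unregistered` helper, and the HubSpot tool names are hypothetical; only `cron_create` appears elsewhere on this page.

```python
# Hypothetical data: what each capability declares vs. what the runtime
# registry actually registered. Names are illustrative only.
declared_tools = {
    "hubspot": {"hubspot_get_deal", "hubspot_update_deal"},
    "scheduling": {"cron_create"},
}
registered_tools = {"hubspot_get_deal", "cron_create"}


def find_unregistered(declared, registered):
    """Map each capability to the declared tools missing from the registry."""
    missing = {}
    for capability, tools in declared.items():
        gap = sorted(tools - registered)
        if gap:
            missing[capability] = gap
    return missing


print(find_unregistered(declared_tools, registered_tools))
# prints {'hubspot': ['hubspot_update_deal']}
```

A CI test built on this idea would assert that the result is empty, so a declared-but-missing tool fails the build rather than surfacing at runtime.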
What you get vs what you write
Concrete comparison. Both columns are the same agent surface.
| Concern | Building your own | Forge |
|---|---|---|
| Tool registration | LangChain: 100+ lines of glue per tool | 0 lines. Catalog declares it, generator wires it. |
| Auth handling | You maintain retry, error, token refresh per service | Standardized across all 32 integrations |
| Memory | You wire SQLite + embeddings + decay yourself | A-MEM ships in the box (fastembed + graph + decay) |
| Voice / identity | Ad-hoc prompt engineering, drifts over time | SOUL.md / IDENTITY.md / USER.md prepended to every call |
| Scheduling | You wire cron + tzdata + retry semantics | scheduled_commitments + cron_create + heartbeat loop |
| Cross-platform | You debug Windows tzdata, terminal escapes, path sep | Handled (PR #108 ships tzdata; Rich TUI portable) |
| Security | You grep for /etc/passwd-style bugs at 2am | Sandbox tested 10/10 adversarial (PR #113) |
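
To make the Security row concrete, here is a minimal sketch of an install-dir path check of the kind such a sandbox needs. It is an assumption about the general technique, not the code from PR #113, and `resolve_in_sandbox` is a hypothetical name. It relies on `Path.resolve()`, which follows symlinks and collapses `..`, so the comparison is made against the real location on disk.

```python
import tempfile
from pathlib import Path


def resolve_in_sandbox(install_dir, requested):
    """Resolve `requested` against install_dir, rejecting anything outside it.

    Absolute paths and "~" paths replace the root when joined, so they are
    caught by the same containment check as "../" traversal and symlinks.
    """
    root = Path(install_dir).resolve()
    candidate = (root / Path(requested).expanduser()).resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"path escapes sandbox: {requested}")
    return candidate


# Usage: one allowed path, then three escape attempts.
sandbox = tempfile.mkdtemp()
print(resolve_in_sandbox(sandbox, "workspace/notes.md"))
for attempt in ("/etc/passwd", "../outside.txt", "~/.ssh/id_rsa"):
    try:
        resolve_in_sandbox(sandbox, attempt)
    except PermissionError as exc:
        print("blocked:", exc)
```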
Three lines of code that should sell it
From scout-config to running TUI. No build step.
# 1. Generate the agent from your scout-config
curl -X POST https://forge.example/api/generate-agent \
-d @scout-config.json -o agent.zip
# 2. Unzip
unzip agent.zip && cd agent
# 3. Run
python -m my_agent --tui

Architecture in 90 seconds
Three components. Each one does one thing.
| # | Component | What it does | Stack |
|---|---|---|---|
| 01 | Architect | Phase 1.5 context interview. 5 waves, 27+ fields. Captures voice, role, hard rules, escalation triggers into a scout-config. | TypeScript + Anthropic |
| 02 | Generator | Reads scout-config, emits a Python package: workspace files, plugins, capability wiring, three deployment entrypoints. | Python |
| 03 | Runtime | scout_runtime, vendored into every zip. Tool registry, sandbox, A-MEM, heartbeat, gateway, sub-agent dispatch. | Python, no SaaS deps |
Single source of truth. The Python schema defines every capability and tool. JS bindings are auto-generated via sync-python.js. When the schema drifts, CI breaks the build, not production.
What's in the catalog
All 32 capabilities. All wired by default (F34 doctrine, PR #104 enforces).
How it's different from LangChain / CrewAI
| LangChain / CrewAI | Forge |
|---|---|
| Library you assemble | Complete agent generated from your spec |
| 100+ lines of glue per integration | 0 lines. Catalog wires it. |
| You maintain abstractions when LLM APIs change | Handled in the platform. Runtime is versioned, vendored. |
| CrewAI: agent-crew abstractions | 32 capabilities + 96 tools + identity + memory shipped in one zip |
| No standardized voice / persistent identity | SOUL.md / USER.md / IDENTITY.md prepended to every call |
| You wire memory + scheduling + heartbeat yourself | A-MEM + scheduled_commitments + heartbeat all default |
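
The "prepended to every call" row can be sketched as a small prompt builder. The file names come from the table above. The ordering, the section separator, and the `build_system_prompt` function are assumptions made for illustration, not the runtime's actual behaviour.

```python
import tempfile
from pathlib import Path

# File names come from the comparison table; the ordering, separator, and
# function name are assumptions made for this sketch.
IDENTITY_FILES = ("SOUL.md", "IDENTITY.md", "USER.md")


def build_system_prompt(workspace, task_prompt):
    """Prepend whichever identity files exist in the workspace to the prompt."""
    sections = []
    for name in IDENTITY_FILES:
        path = Path(workspace) / name
        if path.is_file():
            text = path.read_text(encoding="utf-8").strip()
            sections.append(f"# {name}\n{text}")
    sections.append(task_prompt)
    return "\n\n".join(sections)


# Usage: a throwaway workspace with two of the three files present.
workspace = Path(tempfile.mkdtemp())
(workspace / "SOUL.md").write_text("Direct, concise, no filler.", encoding="utf-8")
(workspace / "USER.md").write_text("Prefers morning summaries.", encoding="utf-8")
print(build_system_prompt(workspace, "Draft the weekly update."))
```

Reading the files on every call, rather than caching them, means an edit to any identity file takes effect on the next request.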
How it's different from rolling your own
You could build it. It takes 2-3 months to reach an equivalent surface, and along the way you'd rediscover everything we already debugged. Sample from the last week:
- PR #98 / #99: Capability docs / runtime alignment. 17 audit tests fail the build when docs lie about which tools exist.
- PR #101: Workspace freshness check on every boot. Refreshes stale system files from package while preserving MEMORY.md / USER.md.
- PR #102: Memory consistency check. Warns when MEMORY.md references tools the registry no longer ships.
- PR #104: Auto-wire all 32 capabilities. Even bare specs get the full surface.
- PR #105: Server-side auto-wire before validation. Catches architect under-wiring.
- PR #106: HubSpot integration (29th capability) + 7 dedicated tests.
- PR #107: Tolerant scout-config parser. Recovers from truncated JSON emissions across message boundaries.
- PR #108: Generated agents ship tzdata for Windows scheduling.
- PR #109: Three extensibility paths documented: conversational, standing orders, plugins.
- PR #111: code_modification capability. Agents write/edit/rollback their own code, with auto-rollback on Python syntax error.
- PR #112: PowerShell TUI quickstart in docs. Windows users were guessing.
- PR #113: 10/10 adversarial sandbox tests pass: /etc/passwd, ~/.ssh, symlink escape, relative traversal, install-dir walk-up.
53 PRs in one session. Latest commit f582e4d. Every merge had full diff, clear message, critical tests green before --admin.
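
As an illustration of the auto-rollback idea from PR #111, the sketch below writes an edit, parses the result with the standard-library `ast` module, and restores the previous contents if parsing fails. It is a minimal example of the technique, not the code_edit implementation, and `edit_with_rollback` is a hypothetical name. It checks syntax only; runtime errors are out of scope.

```python
import ast
import tempfile
from pathlib import Path


def edit_with_rollback(path, new_source):
    """Write new_source to path; restore the old contents if it does not parse.

    Returns True when the edit is kept, False when it is rolled back.
    """
    path = Path(path)
    backup = path.read_text(encoding="utf-8") if path.exists() else None
    path.write_text(new_source, encoding="utf-8")
    try:
        ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
    except (SyntaxError, ValueError):
        # Roll back: remove a newly created file or restore the backup.
        if backup is None:
            path.unlink()
        else:
            path.write_text(backup, encoding="utf-8")
        return False
    return True


# Usage: a broken edit is rolled back, a valid one is kept.
plugin = Path(tempfile.mkdtemp()) / "plugin.py"
plugin.write_text("def greet():\n    return 'hi'\n", encoding="utf-8")
print(edit_with_rollback(plugin, "def greet(:\n    return 'broken'\n"))
print(edit_with_rollback(plugin, "def greet():\n    return 'hello'\n"))
```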
Get started
1. Visit /create and paste your Anthropic API key.
2. Design through the Phase 1.5 interview. Download the zip.
3. Run python -m <agent> --tui.