April 26, 2026 · 6 min read · #architecture #agents

Four agents, one harness — how the team actually works.

Bills, Invoices, Books, Treasury — separate UIs, separate jobs, but every action flows through the same propose / approve / sign spine.

We talk about Softmax two ways depending on the audience: as a team for ops buyers (“hire your AI finance team”) and as a harness for technical buyers (“the AI harness for crypto-native finance”). Same product, two framings. This post is the second framing.

A harness gives structure to power. Raw AI is messy and unpredictable; a harness makes it useful, safe, and aimed.

The four agents

Each agent is a coherent set of services, prompts, and UI surfaces with a clear job description. They are not chatbots. They don't have human names or avatars or voices — that's a deliberate brand choice. The user hires a team, not a personality.

Bills (AP). Reads your inbox + magic forwarding address, extracts every bill into structured data with a confidence score, queues payments for your approval.
Invoices (AR). Issues USDC invoices with card-or-any-token public payment links, drafts reminders for overdue ones, auto-reconciles incoming payments via webhook.
Books. Classifies every on-chain transaction into one of seven categories with confidence scores, reconciles against AP/AR, generates the monthly close packet, syncs to Xero or QuickBooks.
Treasury. Watches idle balances across wallets and chains, models near-term coverage from upcoming bills + expected inflows, proposes yield deposits + rebalances to audited vaults.

What they share

The agents have separate dashboards, separate inboxes, separate prompts. They share four primitives:

1. The action ledger

Every money-touching event — “Bills extracted invoice INV-0042, 87% confidence”, “Treasury proposed $8,000 deposit to Morpho USDC vault”, “You approved”, “Wallet 0x71f… signed tx 0xabc…” — lands in one append-only table. It's the sacred audit trail accountants live in. Sortable, filterable, immutable. Every agent writes; no agent reads-then-acts on it (no race-condition surface).
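To make "append-only" concrete, here is a minimal in-memory sketch. The field names (seq, at, actor, event, detail) and the ActionLedger class are illustrative assumptions, not our actual schema; a real ledger would be a database table with the same constraint: inserts only, no updates or deletes.

```typescript
// Hypothetical shapes: field names are illustrative, not the product's schema.
type Actor = "bills" | "invoices" | "books" | "treasury" | "user" | "wallet";

interface LedgerEntry {
  readonly seq: number;    // monotonically increasing position in the log
  readonly at: string;     // ISO-8601 timestamp
  readonly actor: Actor;
  readonly event: string;  // e.g. "extracted", "proposed", "approved"
  readonly detail: string;
}

class ActionLedger {
  private readonly entries: LedgerEntry[] = [];

  // The only write path is append. There is no update or delete method.
  append(actor: Actor, event: string, detail: string, at: Date = new Date()): LedgerEntry {
    const entry: LedgerEntry = Object.freeze({
      seq: this.entries.length + 1,
      at: at.toISOString(),
      actor,
      event,
      detail,
    });
    this.entries.push(entry);
    return entry;
  }

  // Readers get a new array of frozen entries, so the log cannot be edited in place.
  filter(predicate: (e: LedgerEntry) => boolean): LedgerEntry[] {
    return this.entries.filter(predicate);
  }

  get size(): number {
    return this.entries.length;
  }
}

const ledger = new ActionLedger();
ledger.append("bills", "extracted", "invoice INV-0042, 87% confidence");
ledger.append("treasury", "proposed", "$8,000 deposit to Morpho USDC vault");
ledger.append("user", "approved", "treasury proposal");
```

Filtering by actor or event is then a plain predicate, for example ledger.filter((e) => e.actor === "treasury").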

2. The propose / approve / sign spine

The universal money-moving flow:

An agent proposes — writes a row to action_proposals.
You approve in the UI.
The harness constructs an unsigned transaction.
Your wallet signs (Safe, EOA, embedded — doesn't matter).
The harness watches the on-chain receipt, parses Safe ExecutionSuccess / ExecutionFailure events, flips the proposal to executed or failed, and logs everything to the ledger.

AP uses it for paying bills. Treasury uses it for yield deposits and rebalances. AR uses it for void / refund flows. Same architecture; different proposing agent. New agents in v2 (Payroll? Tax? Compliance?) plug into the same loop.
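The loop above can be read as a small state machine. The sketch below is one way to model it, under assumptions: only "executed" and "failed" are status names from this post; "proposed", "approved", "built", and "signed", along with the Proposal fields and the advance helper, are hypothetical names chosen for illustration.

```typescript
// Hypothetical status names; the post itself only names "executed" and "failed".
type ProposalStatus = "proposed" | "approved" | "built" | "signed" | "executed" | "failed";

interface Proposal {
  id: string;
  agent: "bills" | "invoices" | "treasury";
  summary: string;
  status: ProposalStatus;
}

// Allowed transitions, one per step of the flow.
const NEXT: Record<ProposalStatus, readonly ProposalStatus[]> = {
  proposed: ["approved"],         // you approve in the UI
  approved: ["built"],            // the harness constructs an unsigned transaction
  built: ["signed"],              // your wallet signs
  signed: ["executed", "failed"], // the harness reads the on-chain receipt
  executed: [],
  failed: [],
};

// Returns a new proposal in the target status, or throws on an illegal jump.
function advance(p: Proposal, to: ProposalStatus): Proposal {
  if (!NEXT[p.status].includes(to)) {
    throw new Error(`illegal transition ${p.status} -> ${to} for ${p.id}`);
  }
  return { ...p, status: to };
}

let p: Proposal = {
  id: "prop_1",
  agent: "treasury",
  summary: "$8,000 deposit to Morpho USDC vault",
  status: "proposed",
};
for (const step of ["approved", "built", "signed", "executed"] as const) {
  p = advance(p, step);
}
console.log(p.status); // "executed"
```

Because the transition table is the only path between statuses, a proposal cannot reach "signed" without passing through "approved". The proposing agent never appears in the table, which is why a new agent can reuse the loop unchanged.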

3. The LLM service abstraction

A single lib/ai-service.ts module exposes a typed defaultModel() for extraction / classification (Claude Sonnet) and a narrativeModel() for the close packet narrative (Claude Opus). One config flag swaps the provider. The agents never talk to Anthropic directly: they request a model, get a model, and run generateObject with their own Zod schema. Per-workspace LLM token budgets are enforced at the same chokepoint.
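A rough sketch of the chokepoint idea follows. It is not the contents of lib/ai-service.ts: the AiService class, the charge method, the "alternate" provider, and the model labels are all placeholders invented for illustration, and the real module hands back SDK model objects rather than plain records. The point is only that model selection and budget enforcement live in one place.

```typescript
// Everything here is a stand-in; the real lib/ai-service.ts is not shown in this post.
type Provider = "anthropic" | "alternate";
type Role = "default" | "narrative";

interface ModelHandle {
  provider: Provider;
  role: Role;
  modelId: string; // placeholder labels, not real model identifiers
}

const MODEL_TABLE: Record<Provider, Record<Role, string>> = {
  anthropic: { default: "sonnet-class", narrative: "opus-class" },
  alternate: { default: "alt-small", narrative: "alt-large" },
};

class AiService {
  private readonly used = new Map<string, number>(); // workspace id -> tokens spent
  private readonly provider: Provider;
  private readonly budget: number;

  constructor(provider: Provider, budgetPerWorkspace: number) {
    this.provider = provider; // the single flag that swaps providers
    this.budget = budgetPerWorkspace;
  }

  defaultModel(): ModelHandle {
    return this.handle("default");
  }

  narrativeModel(): ModelHandle {
    return this.handle("narrative");
  }

  // Charges tokens against a workspace and returns what is left; throws if over budget.
  charge(workspaceId: string, tokens: number): number {
    const next = (this.used.get(workspaceId) ?? 0) + tokens;
    if (next > this.budget) {
      throw new Error(`workspace ${workspaceId} is over its token budget`);
    }
    this.used.set(workspaceId, next);
    return this.budget - next;
  }

  private handle(role: Role): ModelHandle {
    return { provider: this.provider, role, modelId: MODEL_TABLE[this.provider][role] };
  }
}

const ai = new AiService("anthropic", 1000);
console.log(ai.defaultModel().modelId); // "sonnet-class"
console.log(ai.charge("ws_1", 600));    // 400
```

Constructing the service with "alternate" instead of "anthropic" changes every handle the agents receive without touching agent code.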

4. Per-workspace learning

Every correction is data. When you reassign a bill's vendor on the AP detail page, the original AI-extracted name becomes an alias for the chosen counterparty. Future extractions auto-match. When you re-categorize a Books transaction, the correction becomes a confidence-1 anchor that shapes future classifications. The moat isn't the prompts; it's the corrections.
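The vendor-alias half of this can be sketched in a few lines. The VendorAliases class, the normalize rule (trim, lowercase, collapse whitespace), and the counterparty id format are assumptions made for the example; the actual matching logic may be fuzzier.

```typescript
// Hypothetical per-workspace alias store; the normalization rule is an assumption.
const normalize = (vendor: string): string =>
  vendor.trim().toLowerCase().replace(/\s+/g, " ");

class VendorAliases {
  // normalized extracted name -> counterparty id chosen by the user
  private readonly aliases = new Map<string, string>();

  // Called when a user reassigns a bill's vendor on the detail page.
  recordCorrection(extractedName: string, counterpartyId: string): void {
    this.aliases.set(normalize(extractedName), counterpartyId);
  }

  // Called on later extractions; undefined means no learned match yet.
  match(extractedName: string): string | undefined {
    return this.aliases.get(normalize(extractedName));
  }
}

const aliases = new VendorAliases();
aliases.recordCorrection("ACME Cloud Svcs, Inc.", "cp_acme");

console.log(aliases.match("  acme cloud svcs, inc. ")); // "cp_acme"
console.log(aliases.match("Acme Cloud"));               // undefined
```

One store per workspace keeps one customer's corrections from leaking into another's matches.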

Why “harness” and not “agent framework”

Agent frameworks let agents do anything. A harness explicitly narrows the action surface to a small typed set, gates execution on human signature, and audits everything. The bigger the model gets, the more important the harness gets — capability scales with structure, not against it.
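As an illustration of "a small typed set, gated on signature", consider the sketch below. The four action kinds are drawn from flows mentioned in this post, but the Action type, its field names, and the summarize and execute helpers are hypothetical.

```typescript
// Hypothetical action set; field names are illustrative.
type Action =
  | { kind: "pay_bill"; billId: string; amountUsdc: number }
  | { kind: "yield_deposit"; vault: string; amountUsdc: number }
  | { kind: "rebalance"; fromWallet: string; toWallet: string; amountUsdc: number }
  | { kind: "refund_invoice"; invoiceId: string; amountUsdc: number };

// Exhaustive over the union: adding a new kind without handling it fails to compile.
function summarize(a: Action): string {
  switch (a.kind) {
    case "pay_bill":
      return `pay bill ${a.billId} (${a.amountUsdc} USDC)`;
    case "yield_deposit":
      return `deposit ${a.amountUsdc} USDC to ${a.vault}`;
    case "rebalance":
      return `move ${a.amountUsdc} USDC from ${a.fromWallet} to ${a.toWallet}`;
    case "refund_invoice":
      return `refund invoice ${a.invoiceId} (${a.amountUsdc} USDC)`;
    default: {
      const unreachable: never = a;
      return unreachable;
    }
  }
}

// Nothing runs without a signature supplied by the human's wallet.
function execute(a: Action, signature: string | null): string {
  if (!signature) {
    throw new Error("refusing to execute without a human signature");
  }
  return `executed: ${summarize(a)}`;
}

const deposit: Action = { kind: "yield_deposit", vault: "Morpho USDC vault", amountUsdc: 8000 };
console.log(execute(deposit, "0xsigned")); // "executed: deposit 8000 USDC to Morpho USDC vault"
```

An agent can only emit values of type Action, so "do anything" is ruled out by the compiler rather than by a prompt.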

That's why the term shows up in the marketing. We're not selling raw intelligence; we're selling structured, auditable, signed-by-you intelligence.