Four agents, one harness — how the team actually works.
Bills, Invoices, Books, Treasury — separate UIs, separate jobs, but every action flows through the same propose / approve / sign spine.
We talk about Softmax two ways depending on the audience: as a team for ops buyers (“hire your AI finance team”) and as a harness for technical buyers (“the AI harness for crypto-native finance”). Same product, two framings. This post is the second framing.
A harness gives structure to power. Raw AI is messy and unpredictable; a harness makes it useful, safe, and aimed.
The four agents
Each agent is a coherent set of services, prompts, and UI surfaces with a clear job description. They are not chatbots. They don't have human names or avatars or voices — that's a deliberate brand choice. The user hires a team, not a personality.
- Bills (AP). Reads your inbox + magic forwarding address, extracts every bill into structured data with a confidence score, queues payments for your approval.
- Invoices (AR). Issues USDC invoices with card-or-any-token public payment links, drafts reminders for overdue ones, auto-reconciles incoming payments via webhook.
- Books. Classifies every on-chain transaction into one of seven categories with confidence scores, reconciles against AP/AR, generates the monthly close packet, syncs to Xero or QuickBooks.
- Treasury. Watches idle balances across wallets and chains, models near-term coverage from upcoming bills + expected inflows, proposes yield deposits + rebalances to audited vaults.
What they share
The agents have separate dashboards, separate inboxes, separate prompts. They share four primitives:
1. The action ledger
Every money-touching event (“Bills extracted invoice INV-0042, 87% confidence”, “Treasury proposed $8,000 deposit to Morpho USDC vault”, “You approved”, “Wallet 0x71f… signed tx 0xabc…”) lands in one append-only table. It's the sacred audit trail accountants live in. Sortable, filterable, immutable. Every agent writes to it; no agent reads from it to decide what to do next, so there is no read-then-act race condition to worry about.
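To make the append-only idea concrete, here is a minimal in-memory sketch. It is an illustration under assumptions, not the production schema: the `ActionLedger` class, the `LedgerEntry` fields, and the actor names are all hypothetical, and a real ledger would live in a database table rather than an array.

```typescript
// Hypothetical shape of one ledger row; field names are illustrative only.
type LedgerActor = "bills" | "invoices" | "books" | "treasury" | "user" | "wallet";

interface LedgerEntry {
  readonly seq: number; // position in the log, starting at 1
  readonly at: string; // ISO-8601 timestamp
  readonly actor: LedgerActor;
  readonly message: string;
}

class ActionLedger {
  // Private storage: callers can append and read, never update or delete.
  private readonly rows: LedgerEntry[] = [];

  append(actor: LedgerActor, message: string, at: Date = new Date()): LedgerEntry {
    // Freeze each row so it cannot be edited after it is written.
    const entry: LedgerEntry = Object.freeze({
      seq: this.rows.length + 1,
      at: at.toISOString(),
      actor,
      message,
    });
    this.rows.push(entry);
    return entry;
  }

  // filter() always returns a new array, so the internal log is never exposed.
  list(filter?: { actor?: LedgerActor }): readonly LedgerEntry[] {
    return this.rows.filter((r) => !filter?.actor || r.actor === filter.actor);
  }
}

const ledger = new ActionLedger();
ledger.append("bills", "Extracted invoice INV-0042, 87% confidence");
ledger.append("treasury", "Proposed $8,000 deposit to Morpho USDC vault");
ledger.append("user", "Approved");
```

The class deliberately has no update or delete method. In this sketch, immutability comes from the API surface plus frozen rows; a database-backed version would need its own enforcement.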
2. The propose / approve / sign spine
The universal money-moving flow:
- An agent proposes: it writes a row to action_proposals.
- You approve in the UI.
- The harness constructs an unsigned transaction.
- Your wallet signs (Safe, EOA, embedded — doesn't matter).
- The harness watches the on-chain receipt, parses Safe ExecutionSuccess / ExecutionFailure events, flips the proposal to executed/failed, and logs everything to the ledger.
AP uses it for paying bills. Treasury uses it for yield deposits and rebalances. AR uses it for void / refund flows. Same architecture; different proposing agent. New agents in v2 (Payroll? Tax? Compliance?) plug into the same loop.
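One way to picture the spine is as a small state machine over a proposal row. The sketch below is an assumption-heavy illustration: only the executed and failed statuses come from the description above, while the other status names, the `ActionProposal` fields, and the `transition` helper are hypothetical.

```typescript
// Only "executed" and "failed" are named in the post; the rest are assumed.
type ProposalStatus =
  | "proposed"
  | "approved"
  | "awaiting_signature"
  | "executed"
  | "failed";

interface ActionProposal {
  id: string;
  agent: "bills" | "invoices" | "treasury"; // the proposing agent
  summary: string;
  status: ProposalStatus;
}

// Allowed transitions. Anything not listed here is rejected.
const NEXT: Record<ProposalStatus, readonly ProposalStatus[]> = {
  proposed: ["approved"],
  approved: ["awaiting_signature"], // harness has built the unsigned transaction
  awaiting_signature: ["executed", "failed"], // decided by the on-chain receipt
  executed: [],
  failed: [],
};

function transition(p: ActionProposal, to: ProposalStatus): ActionProposal {
  if (!NEXT[p.status].includes(to)) {
    throw new Error(`Illegal transition ${p.status} -> ${to} for proposal ${p.id}`);
  }
  // Return a new object rather than mutating the row in place.
  return { ...p, status: to };
}

// Happy path for a Treasury proposal.
let proposal: ActionProposal = {
  id: "prop_1",
  agent: "treasury",
  summary: "Deposit $8,000 to Morpho USDC vault",
  status: "proposed",
};
proposal = transition(proposal, "approved");
proposal = transition(proposal, "awaiting_signature");
proposal = transition(proposal, "executed");
```

Because the transition table is data, a new proposing agent does not need its own flow. It only needs to write rows that enter at the proposed state.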
3. The LLM service abstraction
A single lib/ai-service.ts exposes a typed defaultModel() for extraction / classification (Claude Sonnet) and a narrativeModel() for the close packet narrative (Claude Opus). One config flag swaps the provider. The agents never talk to Anthropic directly: they request a model, get a model, and run generateObject with their own Zod schema. Per-workspace LLM token budgets are enforced at the same chokepoint.
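The real lib/ai-service.ts is not reproduced in this post, so the following is only a rough sketch of what a single chokepoint could look like. Everything except the defaultModel() and narrativeModel() names is an assumption: the `AiService` class, the config shape, the `chargeTokens` method, the "fallback" provider, and the model id strings are placeholders, not real API identifiers. The sketch also leaves out the actual generateObject call and Zod schemas to stay dependency-free.

```typescript
// Hypothetical provider names; "fallback" stands in for any second provider.
type Provider = "anthropic" | "fallback";

interface ModelHandle {
  provider: Provider;
  modelId: string; // placeholder label, not a real API model id
}

interface AiServiceConfig {
  provider: Provider; // the one flag that swaps providers
  models: Record<Provider, { fast: string; narrative: string }>;
  tokenBudgetPerWorkspace: number;
}

class AiService {
  private readonly config: AiServiceConfig;
  private readonly tokensUsed = new Map<string, number>();

  constructor(config: AiServiceConfig) {
    this.config = config;
  }

  // Model for extraction and classification.
  defaultModel(): ModelHandle {
    const { provider, models } = this.config;
    return { provider, modelId: models[provider].fast };
  }

  // Model for the close packet narrative.
  narrativeModel(): ModelHandle {
    const { provider, models } = this.config;
    return { provider, modelId: models[provider].narrative };
  }

  // Records usage and returns the remaining budget.
  // Throws, without recording anything, if the charge would exceed the budget.
  chargeTokens(workspaceId: string, tokens: number): number {
    const next = (this.tokensUsed.get(workspaceId) ?? 0) + tokens;
    if (next > this.config.tokenBudgetPerWorkspace) {
      throw new Error(`Token budget exceeded for workspace ${workspaceId}`);
    }
    this.tokensUsed.set(workspaceId, next);
    return this.config.tokenBudgetPerWorkspace - next;
  }
}

const ai = new AiService({
  provider: "anthropic",
  models: {
    anthropic: { fast: "claude-sonnet", narrative: "claude-opus" },
    fallback: { fast: "fallback-small", narrative: "fallback-large" },
  },
  tokenBudgetPerWorkspace: 1000,
});
const remaining = ai.chargeTokens("ws_demo", 600);
```

Since agents only ever hold a `ModelHandle`, changing `provider` in the config changes every agent at once, and budget checks cannot be bypassed by calling a provider SDK directly.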
4. Per-workspace learning
Every correction is data. When you reassign a bill's vendor on the AP detail page, the original AI-extracted name becomes an alias for the chosen counterparty. Future extractions auto-match. When you re-categorize a Books transaction, the correction becomes a confidence-1 anchor that shapes future classifications. The moat isn't the prompts; it's the corrections.
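The vendor-alias half of this can be sketched in a few lines. This is an illustration under assumptions rather than the actual matching logic: the `VendorAliases` class, its method names, the normalization rule, and the `cp_acme` identifier are all hypothetical, and a real matcher would likely be fuzzier than an exact lookup on a normalized string.

```typescript
// Hypothetical in-memory alias store; imagine one instance per workspace.
class VendorAliases {
  private readonly aliasToCounterparty = new Map<string, string>();

  // Lowercase, trim, and collapse whitespace so trivial variations still match.
  private static normalize(name: string): string {
    return name.trim().toLowerCase().replace(/\s+/g, " ");
  }

  // Called when a user reassigns a bill's vendor: the AI-extracted name
  // becomes an alias for the counterparty the user chose.
  recordCorrection(extractedName: string, counterpartyId: string): void {
    this.aliasToCounterparty.set(VendorAliases.normalize(extractedName), counterpartyId);
  }

  // Called on future extractions; undefined means no learned alias yet.
  match(extractedName: string): string | undefined {
    return this.aliasToCounterparty.get(VendorAliases.normalize(extractedName));
  }
}

const aliases = new VendorAliases();
aliases.recordCorrection("ACME Cloud Svcs  Inc.", "cp_acme");

const matched = aliases.match("acme cloud svcs inc.");
const unmatched = aliases.match("Globex LLC");
```

The store starts empty and only grows through user corrections, which is why it is specific to each workspace.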
Why “harness” and not “agent framework”
Agent frameworks let agents do anything. A harness explicitly narrows the action surface to a small typed set, gates execution on human signature, and audits everything. The bigger the model gets, the more important the harness gets — capability scales with structure, not against it.
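A "small typed set" can be read quite literally as a closed union type. The sketch below is a guess at the flavor, not the real action schema: the `HarnessAction` variants, their field names, and the helper functions are hypothetical, loosely based on the bill payment, vault deposit, and refund flows mentioned earlier.

```typescript
// Hypothetical closed set of money-moving actions an agent may propose.
type HarnessAction =
  | { kind: "pay_bill"; billId: string; amountUsdc: number }
  | { kind: "vault_deposit"; vault: string; amountUsdc: number }
  | { kind: "refund_invoice"; invoiceId: string; amountUsdc: number };

const KNOWN_KINDS = ["pay_bill", "vault_deposit", "refund_invoice"] as const;

// Runtime guard for untrusted model output: anything outside the set is rejected.
function isKnownKind(kind: unknown): kind is HarnessAction["kind"] {
  return typeof kind === "string" && (KNOWN_KINDS as readonly string[]).includes(kind);
}

// Exhaustive switch: adding a new variant to HarnessAction without handling it
// here becomes a compile-time error via the `never` assignment.
function summarize(action: HarnessAction): string {
  switch (action.kind) {
    case "pay_bill":
      return `Pay bill ${action.billId} for ${action.amountUsdc} USDC`;
    case "vault_deposit":
      return `Deposit ${action.amountUsdc} USDC into ${action.vault}`;
    case "refund_invoice":
      return `Refund ${action.amountUsdc} USDC on invoice ${action.invoiceId}`;
    default: {
      const unreachable: never = action;
      throw new Error(`Unhandled action: ${JSON.stringify(unreachable)}`);
    }
  }
}

const summary = summarize({ kind: "vault_deposit", vault: "Morpho USDC", amountUsdc: 8000 });
const rejected = isKnownKind("drain_wallet"); // false: not in the typed set
```

The point of the union is that a more capable model gains nothing by inventing a new kind of action: if it is not in the set, it cannot be proposed, let alone signed.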
That's why the term shows up in the marketing. We're not selling raw intelligence; we're selling structured, auditable, signed-by-you intelligence.