LiteLLM Standalone
First Principles: Should AI routing become shared infrastructure?
The Core Question
Three questions stacked: (1) Do we extract routing into shared infra? (2) If yes, where does it live? (3) What does BYOK mean for this decision?
What ClawdRouter Actually Is
A cascade LLM router on the Forge VPS (port 8080). It wraps LiteLLM — which itself wraps multiple providers — and adds cascade logic: try Gemini 2.5 Flash first (cheap, fast), fall back to Claude Sonnet, then GPT-4o, then Groq. Cost optimization baked in. Currently Forge-internal only. MasteryOS has zero connection to this chain.
Current State: Forge VPS Internal Only
MasteryOS on AWS EC2 — no connection to Forge routing
The Four Options
A: Open Forge Port to MasteryOS
- Add MasteryOS IP to ClawdRouter allowlist
- MasteryOS calls Forge VPS directly
- Forge down = MasteryOS AI down
- Hard SLA coupling on a revenue product
- Doesn't solve BYOK at all
B: Dedicated Router VPS
- New $10-20/mo VPS for AI routing only
- All products point here
- More infra = more ops burden
- New single point of failure
- Still doesn't solve BYOK
- Right answer at $50K+ MRR
C: Vercel Serverless Proxy
- ClawdRouter logic as Next.js API routes
- Globally distributed, no server
- LiteLLM is a Python server — not serverless-compatible
- Cold starts at 2-5s per invocation
- Loses cascade intelligence
D: OpenRouter + BYOK
- Managed LLM router: 200+ models, one API
- OpenAI-compatible format
- Customer brings their own OpenRouter key
- Zero new infrastructure
- No Forge dependency
- BYOK maps perfectly
- 94%+ margins maintained
- Zero maintenance burden
Why BYOK Changes the Whole Equation
MasteryMade runs on BYOK — customers bring their own LLM API keys, paying their own compute bills. This achieves 94%+ margins. This single fact transforms the routing architecture:
Traditional SaaS
Company has one key set, pays all compute, owns routing layer. This is where shared ClawdRouter makes sense — Jason is the "customer" and cost optimization on his keys matters.
BYOK SaaS
Customer brings their own keys and pays their own compute bill. A shared routing server means the company still manages credentials — defeating BYOK's purpose. The routing must happen under each customer's credentials.
The Correct Model for MasteryMade
Customer provides OpenRouter key (or individual Anthropic/OpenAI/Google keys). MasteryOS stores the key in Credential Vault (encrypted). When a request comes in, MasteryOS fetches the key and calls the provider directly. No centralized routing server. No Jason compute cost.
The Core Insight
ClawdRouter is infrastructure for Jason's own services — where Jason is paying the API bills. For products where customers pay their own bills (BYOK), you don't need a shared router. You need a per-customer key store and direct API calls. Credential Vault IS the routing layer for BYOK products.
Failure Mode Mapping
| Failure | No Coupling (now) | Forge Port Coupling | Standalone Router | OpenRouter BYOK |
|---|---|---|---|---|
| Forge VPS down | MasteryOS Fine | MasteryOS AI Broken | MasteryOS Fine | MasteryOS Fine |
| OpenAI API outage | MasteryOS AI Broken | Cascade may catch it | Cascade may catch it | OpenRouter auto-failover |
| Customer key expired | That customer broken | Same | Same | Same — Vault can alert |
| Routing server outage | N/A | ALL customers broken | ALL products broken | N/A — OpenRouter manages its own uptime |
Adding a shared routing server creates a new single point of failure. BYOK with direct API calls is MORE resilient — failures are isolated to individual customers, not all customers simultaneously.
2nd Order Effects
The Verdict
Recommended Architecture
Two separate routing strategies, each optimized for its context:
Ralph, reconcile, MasteryBook, Content Pipeline
Customer-facing, BYOK product
Wizard, publishing, templates
Align360, Freedom Sherpa, etc.
$50K+ MRR threshold
What Changes Where
Forge Internal — No Change
ClawdRouter stays as-is. MasteryBook continues using Forge's Gemini key (it's a Forge tool, not a customer product). This is working — don't touch it.
MasteryOS — Two Changes Needed
Currently hardcoded to OpenAI Assistants v2. Two paths:
Credential Vault — Promote to Cross-Product Primitive
Now: Embedded in MasteryOS
Fernet encryption + Supabase KMS. Per-user API key storage. Works. Only accessible to MasteryOS code.
Phase 2: Extract as Standalone Service
Credential Vault gets its own API. MasteryOS calls it. Reveal calls it. Align360 calls it. One vault, all products. This is the Forge modular service pattern applied here.
Phase 3: MasteryOS as OAuth Provider
MasteryOS account = OAuth login for all other products. Vault keys travel with the user. Login once, use everywhere. The ecosystem lock-in is the convenience, not the lock.
Why OpenRouter Is the Right External Gateway
| Feature | ClawdRouter (Forge) | OpenRouter |
|---|---|---|
| Models available | ~28 configured | 200+ |
| API format | OpenAI-compatible | OpenAI-compatible |
| Fallback routing | Custom cascade logic | Built-in, configurable |
| BYOK support | No — uses Forge's keys | Yes — customer's own key |
| Uptime SLA | Forge VPS uptime | Managed, HA |
| Infrastructure cost | Included in VPS cost | $0 + per-token passthrough |
| Usage analytics | Custom logging required | Built-in dashboard |
| Jason ops burden | Owned + maintained | Zero |
The Asymmetric Insight
OpenRouter gives you the benefits of a centralized router — single API, model flexibility, fallbacks — without infrastructure cost or coupling. For BYOK products, it's strictly superior to building your own router. ClawdRouter stays valuable for Forge-internal where Jason is the "customer" and his API key cost optimization matters.
Next Actions (When Ready to Build)
Extract Credential Vault
Standalone API service. Required before BYOK can serve multiple products. Next Sumit build target.
MasteryOS OpenRouter Migration
Replace OpenAI Assistants v2 with direct completions via OpenRouter. Agent sandbox does this first.
OpenRouter Key Type in Vault
New key type. Default recommended key for MasteryOS customers. Migration script for existing users.
Cascade Config (Not Code)
Model preferences as JSON per workspace. UI toggle for model tier. No server changes per model choice.
Token Tracking Service
Works across BYOK and direct. Per-user spend tracking regardless of provider. Already on roadmap.
Standalone Router (Later)
Revisit at $50K+ MRR. Then: volume discounts, centralized analytics, A/B model testing. Not before.
One-Line Summary
ClawdRouter stays on Forge for Forge. Everything else uses the customer's own keys via OpenRouter, stored in Credential Vault. The cascade router is not the product — the cascade logic is. And logic is just config.
Published March 2026 · Command Center · Ecosystem Vision