Architecture LiteLLM MasteryOS First Principles March 2026

LiteLLM Standalone

First Principles: Should AI routing become shared infrastructure?

Driving question: if Forge VPS goes down, should MasteryOS still work?

The Core Question

Should ClawdRouter/LiteLLM become a standalone service that MasteryOS and other products call — so if Forge goes down, nothing else goes down with it?

Three questions stacked: (1) Do we extract routing into shared infra? (2) If yes, where does it live? (3) What does BYOK mean for this decision?

What ClawdRouter Actually Is

A cascade LLM router on the Forge VPS (port 8080). It wraps LiteLLM — which itself wraps multiple providers — and adds cascade logic: try Gemini 2.5 Flash first (cheap, fast), fall back to Claude Sonnet, then GPT-4o, then Groq. Cost optimization baked in. Currently Forge-internal only. MasteryOS has zero connection to this chain.
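The cascade idea can be sketched in a few lines. This is a minimal illustration, not ClawdRouter's actual code: the `cascade` function, the `ProviderError` type, and the stub providers are all hypothetical, and in the real service each provider call would go through LiteLLM.

```python
from typing import Callable, Sequence


class ProviderError(Exception):
    """Raised when a provider call fails (hypothetical error type)."""


# A provider is any callable that takes a prompt and returns text.
Provider = Callable[[str], str]


def cascade(prompt: str, providers: Sequence[tuple[str, Provider]]) -> tuple[str, str]:
    """Try providers in order; return (name, reply) from the first that succeeds."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            # Remember why this provider failed, then fall through to the next.
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


def failing(prompt: str) -> str:
    """Stub that simulates a provider outage."""
    raise ProviderError("simulated outage")


def working(prompt: str) -> str:
    """Stub that simulates a healthy provider."""
    return f"echo: {prompt}"


# Order mirrors the chain described above: cheapest and fastest first.
CHAIN = [
    ("gemini-2.5-flash", failing),
    ("claude-sonnet", working),
    ("gpt-4o", working),
    ("groq", working),
]

used, reply = cascade("hello", CHAIN)
```

Here the first provider is forced to fail, so the call lands on the second entry in the chain. The order of the list is the entire routing policy, which is why the cascade can later be treated as configuration.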

Current State: Forge VPS Internal Only

Forge Service → ClawdRouter :8080 → LiteLLM → Gemini / Claude / GPT-4 / Groq

MasteryOS on AWS EC2 — no connection to Forge routing

The Four Options

A: Open Forge Port to MasteryOS (Rejected)

  • Add MasteryOS IP to ClawdRouter allowlist
  • MasteryOS calls Forge VPS directly

  • Forge down = MasteryOS AI down
  • Hard SLA coupling on a revenue product
  • Doesn't solve BYOK at all

B: Dedicated Router VPS (Rejected for now)

  • New $10-20/mo VPS for AI routing only
  • All products point here

  • More infra = more ops burden
  • New single point of failure
  • Still doesn't solve BYOK
  • Right answer at $50K+ MRR

C: Vercel Serverless Proxy (Rejected)

  • ClawdRouter logic as Next.js API routes
  • Globally distributed, no server

  • LiteLLM is a Python server — not serverless-compatible
  • Cold starts at 2-5s per invocation
  • Loses cascade intelligence

Why BYOK Changes the Whole Equation

MasteryMade runs on BYOK — customers bring their own LLM API keys, paying their own compute bills. This achieves 94%+ margins. This single fact transforms the routing architecture:

1. Traditional SaaS

Company has one key set, pays all compute, owns routing layer. This is where shared ClawdRouter makes sense: Jason is the "customer" and cost optimization on his keys matters.

2. BYOK SaaS

Customer brings their own keys and pays their own compute bill. A shared routing server means the company still manages credentials, defeating BYOK's purpose. The routing must happen under each customer's credentials.

3. The Correct Model for MasteryMade

Customer provides OpenRouter key (or individual Anthropic/OpenAI/Google keys). MasteryOS stores the key in Credential Vault (encrypted). When a request comes in, MasteryOS fetches the key and calls the provider directly. No centralized routing server. No Jason compute cost.

The Core Insight

ClawdRouter is infrastructure for Jason's own services — where Jason is paying the API bills. For products where customers pay their own bills (BYOK), you don't need a shared router. You need a per-customer key store and direct API calls. Credential Vault IS the routing layer for BYOK products.
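To make "the vault is the routing layer" concrete, here is a small sketch of resolving a request to an endpoint purely from what a customer has stored. Everything in it is illustrative: `InMemoryVault`, `Route`, and `resolve_route` are hypothetical names, and the stand-in vault keeps keys unencrypted, which the real Credential Vault does not.

```python
from dataclasses import dataclass
from typing import Optional

# OpenAI-compatible base URLs for the two providers used in this sketch.
BASE_URLS = {
    "openrouter": "https://openrouter.ai/api/v1",
    "openai": "https://api.openai.com/v1",
}


@dataclass(frozen=True)
class Route:
    provider: str
    base_url: str
    api_key: str


class InMemoryVault:
    """Stand-in for Credential Vault. Illustration only: no encryption here."""

    def __init__(self) -> None:
        self._keys: dict[tuple[str, str], str] = {}

    def put_key(self, user_id: str, provider: str, api_key: str) -> None:
        self._keys[(user_id, provider)] = api_key

    def get_key(self, user_id: str, provider: str) -> Optional[str]:
        return self._keys.get((user_id, provider))


def resolve_route(vault: InMemoryVault, user_id: str,
                  preferred: tuple[str, ...] = ("openrouter", "openai")) -> Route:
    """Pick the first provider, in preference order, that this user has a key for."""
    for provider in preferred:
        key = vault.get_key(user_id, provider)
        if key is not None:
            return Route(provider, BASE_URLS[provider], key)
    raise LookupError(f"no usable key stored for user {user_id!r}")


vault = InMemoryVault()
vault.put_key("alice", "openrouter", "sk-or-alice")
vault.put_key("bob", "openai", "sk-bob")

alice_route = resolve_route(vault, "alice")
bob_route = resolve_route(vault, "bob")
```

No shared server sits between the customer and the provider. The only shared component is the key lookup, and a missing key fails for that one user only.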

Failure Mode Mapping

Failure | No Coupling (now) | Forge Port Coupling | Standalone Router | OpenRouter BYOK
--- | --- | --- | --- | ---
Forge VPS down | MasteryOS Fine | MasteryOS AI Broken | MasteryOS Fine | MasteryOS Fine
OpenAI API outage | MasteryOS AI Broken | Cascade may catch it | Cascade may catch it | OpenRouter auto-failover
Customer key expired | That customer broken | Same | Same | Same; Vault can alert
Routing server outage | N/A | ALL customers broken | ALL products broken | N/A (OpenRouter manages its own uptime)

Adding a shared routing server creates a new single point of failure. BYOK with direct API calls is MORE resilient — failures are isolated to individual customers, not all customers simultaneously.
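The isolation argument can be stated as a tiny model. This sketch is a simplification under stated assumptions: the function names and customer data are made up, and it covers only the two failure modes it names (shared router down, individual key invalid), not an outage at an upstream gateway.

```python
def broken_with_shared_router(key_valid: dict[str, bool], router_up: bool) -> set[str]:
    """Customers without working AI when every request passes through one shared router."""
    if not router_up:
        # The router is a shared dependency, so its outage takes everyone down.
        return set(key_valid)
    return {customer for customer, ok in key_valid.items() if not ok}


def broken_with_byok(key_valid: dict[str, bool]) -> set[str]:
    """Customers without working AI when each request uses the customer's own key directly."""
    return {customer for customer, ok in key_valid.items() if not ok}


# Hypothetical customers; True means the stored key is still valid.
customers = {"acme": True, "globex": False, "initech": True}

shared_outage = broken_with_shared_router(customers, router_up=False)
byok_same_day = broken_with_byok(customers)
```

With the shared router down, all three customers are affected. Under BYOK on the same day, only the customer with the expired key is.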

2nd Order Effects

Decision | 1st Order | 2nd Order
--- | --- | ---
Shared ClawdRouter for all products | Single deployment manages all routing. | New SPOF. Every product's SLA coupled to Forge ops. One breach exposes all products' API keys.
BYOK + OpenRouter per product | Customer pays their own AI bill. Zero COGS to Jason. | Credential Vault becomes critical cross-product infrastructure. Products can add new models without infra changes. Cascade logic is config, not code: update without deploys.
Keep ClawdRouter Forge-only | Forge internal services use optimized routing. | Forge COGS stay low. Each product's AI cost is transparent and attributable. No cross-contamination between products.
Credential Vault as cross-product primitive | One vault, all products. Customer manages one set of keys. | Vault becomes the network effect: customers with keys in vault are in the ecosystem. Every new product (Align360, Brad Himel) plugs in. SSO path opens.
OpenRouter as customer gateway | Customer: one key, 200+ models, built-in fallbacks. | Jason can offer model switching as a UI feature without infra changes. "Try Claude vs GPT-4 vs Gemini on the same workflow" = toggle. Resilience for free.

The Verdict

Recommended Architecture

Two separate routing strategies, each optimized for its context:

Forge Internal (Keep ClawdRouter)
Ralph, reconcile, MasteryBook, Content Pipeline
Keep ClawdRouter on Forge VPS. Jason pays the API bills. Cascade optimization matters: Forge runs hundreds of AI calls/day. ClawdRouter earns its keep here.

MasteryOS / Athio (BYOK Direct)
Customer-facing, BYOK product
BYOK via Credential Vault. Customer brings OpenRouter key. MasteryOS fetches key from Vault, calls provider directly. No Forge coupling. No Jason compute cost. Cascade logic as JSON config.

Reveal / NowPage (BYOK Direct)
Wizard, publishing, templates
Same as MasteryOS. BYOK. Customer's OpenRouter key stored in Credential Vault. AI features run on customer's billing. Clean margin maintained.

JV Partners (BYOK or Managed)
Align360, Freedom Sherpa, etc.
Partner brings their own key OR subscribes to a managed key tier (Jason negotiates volume pricing, passes through at small margin). This becomes a revenue line, not a cost center.

Future Scale (Standalone Later)
$50K+ MRR threshold
Deploy standalone LiteLLM on dedicated server. Unlocks: usage analytics across all customers, volume discounts with providers, A/B model testing. The economics justify the infra then, not before.
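"Cascade logic as JSON config" could look something like the sketch below. The schema (`tiers`, `default_tier`), the workspace ID, and the `build_plan` helper are assumptions made for illustration; the model IDs are the ones named elsewhere in this document. How the fallback list is handed to the gateway (for example via OpenRouter's model routing options) is left out on purpose and should be checked against the gateway's own documentation.

```python
import json

# Hypothetical per-workspace config. Editing this JSON changes the routing; no deploy needed.
WORKSPACE_CONFIG = """
{
  "workspace_id": "ws_demo",
  "default_tier": "fast",
  "tiers": {
    "fast": ["google/gemini-2.5-flash", "openai/gpt-4o"],
    "quality": ["anthropic/claude-sonnet-4-6", "openai/gpt-4o"]
  }
}
"""


def build_plan(config_json: str, tier: str = "") -> dict:
    """Turn a workspace config into a primary model plus an ordered fallback list."""
    config = json.loads(config_json)
    chosen = tier or config["default_tier"]
    if chosen not in config["tiers"]:
        raise ValueError(f"unknown tier: {chosen!r}")
    models = config["tiers"][chosen]
    if not models:
        raise ValueError(f"tier {chosen!r} lists no models")
    return {"tier": chosen, "model": models[0], "fallbacks": models[1:]}


default_plan = build_plan(WORKSPACE_CONFIG)
quality_plan = build_plan(WORKSPACE_CONFIG, tier="quality")
```

A UI toggle for model tier then reduces to passing a different `tier` value. Adding a model is an edit to the JSON, not to the code.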

What Changes Where

Forge Internal — No Change

ClawdRouter stays as-is. MasteryBook continues using Forge's Gemini key (it's a Forge tool, not a customer product). This is working — don't touch it.

MasteryOS — Two Changes Needed

Currently hardcoded to OpenAI Assistants v2. Two paths:

    from openai import OpenAI

    # credential_vault and user_id are assumed to be provided by MasteryOS.

    # Option 1: Keep Assistants, add BYOK key lookup
    api_key = credential_vault.get_key(user_id, "openai")
    client = OpenAI(api_key=api_key)

    # Option 2 (RECOMMENDED): Migrate to direct completions via OpenRouter
    # More model flexibility, simpler code, BYOK works natively
    api_key = credential_vault.get_key(user_id, "openrouter")
    client = OpenAI(
        api_key=api_key,
        base_url="https://openrouter.ai/api/v1",
    )

    # Customer can now use any model via config:
    # "anthropic/claude-sonnet-4-6", "google/gemini-2.5-flash", "openai/gpt-4o"

Credential Vault — Promote to Cross-Product Primitive

1. Now: Embedded in MasteryOS

Fernet encryption + Supabase KMS. Per-user API key storage. Works. Only accessible to MasteryOS code.

2. Phase 2: Extract as Standalone Service

Credential Vault gets its own API. MasteryOS calls it. Reveal calls it. Align360 calls it. One vault, all products. This is the Forge modular service pattern applied here.

3. Phase 3: MasteryOS as OAuth Provider

MasteryOS account = OAuth login for all other products. Vault keys travel with the user. Login once, use everywhere. The ecosystem lock-in is the convenience, not the lock.

Why OpenRouter Is the Right External Gateway

Feature | ClawdRouter (Forge) | OpenRouter
--- | --- | ---
Models available | ~28 configured | 200+
API format | OpenAI-compatible | OpenAI-compatible
Fallback routing | Custom cascade logic | Built-in, configurable
BYOK support | No (uses Forge's keys) | Yes (customer's own key)
Uptime SLA | Forge VPS uptime | Managed, HA
Infrastructure cost | Included in VPS cost | $0 + per-token passthrough
Usage analytics | Custom logging required | Built-in dashboard
Jason ops burden | Owned + maintained | Zero

The Asymmetric Insight

OpenRouter gives you the benefits of a centralized router — single API, model flexibility, fallbacks — without infrastructure cost or coupling. For BYOK products, it's strictly superior to building your own router. ClawdRouter stays valuable for Forge-internal where Jason is the "customer" and his API key cost optimization matters.

Next Actions (When Ready to Build)

1. Extract Credential Vault

Standalone API service. Required before BYOK can serve multiple products. Next Sumit build target.

2. MasteryOS OpenRouter Migration

Replace OpenAI Assistants v2 with direct completions via OpenRouter. Agent sandbox does this first.

3. OpenRouter Key Type in Vault

New key type. Default recommended key for MasteryOS customers. Migration script for existing users.

4. Cascade Config (Not Code)

Model preferences as JSON per workspace. UI toggle for model tier. No server changes per model choice.

5. Token Tracking Service

Works across BYOK and direct. Per-user spend tracking regardless of provider. Already on roadmap.

6. Standalone Router (Later)

Revisit at $50K+ MRR. Then: volume discounts, centralized analytics, A/B model testing. Not before.
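For the Token Tracking Service item, a provider-agnostic tracker might start as small as the sketch below. It assumes each response reports `prompt_tokens` and `completion_tokens`, as OpenAI-compatible APIs do in their `usage` field. The `Usage` and `TokenTracker` names are hypothetical, and pricing is left out, so this counts tokens rather than dollars.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """One completed request, reduced to the fields needed for tracking."""
    user_id: str
    model: str
    prompt_tokens: int
    completion_tokens: int


class TokenTracker:
    """Accumulates token counts per user and per model, whatever the provider."""

    def __init__(self) -> None:
        # user_id -> model -> total tokens
        self._totals: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))

    def record(self, usage: Usage) -> None:
        self._totals[usage.user_id][usage.model] += (
            usage.prompt_tokens + usage.completion_tokens
        )

    def by_model(self, user_id: str) -> dict[str, int]:
        """Token totals for one user, keyed by model; empty if the user is unknown."""
        return dict(self._totals.get(user_id, {}))

    def total_for_user(self, user_id: str) -> int:
        return sum(self.by_model(user_id).values())


tracker = TokenTracker()
tracker.record(Usage("alice", "openai/gpt-4o", prompt_tokens=100, completion_tokens=50))
tracker.record(Usage("alice", "google/gemini-2.5-flash", prompt_tokens=30, completion_tokens=20))
tracker.record(Usage("bob", "openai/gpt-4o", prompt_tokens=10, completion_tokens=5))

alice_total = tracker.total_for_user("alice")
alice_by_model = tracker.by_model("alice")
```

Because the tracker only sees user, model, and token counts, the same code works whether a request went out on a customer's OpenRouter key or on a direct provider key.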

One-Line Summary

Don't centralize LLM routing. BYOK + OpenRouter + Credential Vault = zero infrastructure coupling, zero compute cost, infinite model flexibility.

ClawdRouter stays on Forge for Forge. Everything else uses the customer's own keys via OpenRouter, stored in Credential Vault. The cascade router is not the product — the cascade logic is. And logic is just config.

Published March 2026 · Command Center · Ecosystem Vision