Security-first on unsecured WiFi. Claude Max stays default. LiteLLM already running — just needs wiring. ~2.5 hrs total.
On unsecured WiFi, even HTTPS leaks metadata — the network can see you're connecting to vercel.app, claude.asapai.net, etc. (DNS is often unencrypted, and the destination IP is visible even with TLS). Tailscale uses WireGuard: all traffic from your phone is encrypted at the packet level before it hits the WiFi network. The café sees only encrypted UDP packets going to your VPS's Tailscale IP. Nothing else.
The fix: make the VPS a Tailscale exit node. Your phone routes ALL internet traffic through the VPS via Tailscale tunnel. You then access the Vercel dashboard, claude.asapai.net, everything — as if you were on the VPS. No local network exposure.
When you access the VPS via its Tailscale IP directly, you hit whatever nginx or Tailscale serve has configured — likely a single-page app or a specific port. The full dashboard needs the exit node approach OR a proper serve config. Check current state first:
tailscale serve status
On the VPS — advertise it as a Tailscale exit node:
tailscale up --advertise-exit-node
This adds the VPS as an available exit node in your Tailnet. It doesn't change anything yet — your devices still use direct routes until you enable it on each device.
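On a Linux VPS, exit-node traffic also needs kernel IP forwarding enabled (standard step from the Tailscale exit-node docs; skip if already set):

```shell
echo 'net.ipv4.ip_forward = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
echo 'net.ipv6.conf.all.forwarding = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
sudo sysctl -p /etc/sysctl.d/99-tailscale.conf
```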
Approve the exit node in the Tailscale admin console (one-time, takes 30 seconds):
Go to login.tailscale.com/admin/machines → find your VPS → click the ⋯ menu → Edit route settings → enable Use as exit node → Save.
On your phone (when on unsecured WiFi): Open Tailscale app → tap the VPS in the peer list → toggle "Use as exit node".
Now ALL your phone's internet traffic routes through the VPS. Open forge-dashboard-kbce.vercel.app in browser — it loads via the VPS, not your phone's WiFi.
Turn off when on trusted WiFi/cell to avoid routing all traffic through VPS unnecessarily.
On your laptop (same approach): Tailscale → select VPS as exit node. All laptop traffic then routes through the VPS as well.
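On a Linux laptop the same toggle is available from the CLI (flag name per the tailscale CLI; `<vps-tailscale-ip>` is your VPS's 100.x Tailscale address):

```shell
tailscale set --exit-node=<vps-tailscale-ip>   # route everything via the VPS
tailscale set --exit-node=                     # clear it on trusted networks
```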
If you want the dashboard to live only inside Tailscale (never accessible on public internet), self-host the Next.js app on the VPS and serve it via Tailscale. More work but fully air-gapped from public internet.
10 minutes. All traffic secured. Full dashboard works unchanged. No app changes.
Tradeoff: all traffic routes through VPS (minor latency). Turn off on trusted networks.
Build Next.js on VPS, configure tailscale serve to proxy port 3000. Dashboard lives entirely within Tailscale network.
~1 hr more work. Dashboard URL becomes your Tailscale machine name. Requires keeping the VPS build current with Vercel.
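A minimal sketch of the self-hosted flow, assuming the app builds into a standard Next.js server on port 3000 (`/opt/forge/dashboard` is a hypothetical path; serve syntax varies by tailscale version — check `tailscale serve --help`):

```shell
cd /opt/forge/dashboard        # hypothetical checkout location
npm run build
npm run start &                # Next.js listens on :3000 by default
tailscale serve --bg 3000      # proxy HTTPS on the tailnet to localhost:3000
tailscale serve status         # confirm the mapping
```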
VPS appears as available exit node in Tailscale admin. Phone can toggle it on. With exit node active, visiting forge-dashboard-kbce.vercel.app shows full dashboard. whatismyip.com shows VPS IP (not phone's WiFi IP) — confirms all traffic routing through VPS.
The model selector defaults to Sonnet 4.6 (Max). The dropdown shows all available models grouped by billing type. You manually switch when Max is unavailable. No auto-switching, no surprises. The forge-claude-model localStorage key persists your last selection — so if you deliberately switch to a fallback, it stays until you switch back.
| Name in Dropdown | Billing | Notes |
|---|---|---|
| Sonnet 4.6 DEFAULT | MAX | Current default. Full tool use. Uses subscription. |
| Opus 4.6 | MAX | Heaviest Max model. Use for complex reasoning. |
| Haiku 4.5 | MAX | Fast + cheap. Good for simple tasks on Max. |
| Sonnet 4.5 (API) | API KEY | Same full tool use as Max Sonnet. Pay-per-token. Best fallback. |
| Haiku 4.5 (API) | API KEY | ~$0.01/session. Full tool use. Use when Max is down + cost matters. |
| Gemini 2.0 Flash | OPENROUTER | Fast. Good for analysis + image gen. Tool use may degrade. |
| Gemini 2.5 Pro | OPENROUTER | Strong reasoning. Good code review. Tool use may degrade. |
| DeepSeek V3 | OPENROUTER | Excellent coder. Very cheap. Tool use may degrade. |
| Qwen 2.5 Coder 32B | OPENROUTER | Strong on code. Cheap via OpenRouter. |
| Qwen 7B (Local) | FREE/LOCAL | Runs on VPS via Ollama. Free. No internet needed. Tool use unreliable. |
| Llama 3.2 3B (Local) | FREE/LOCAL | PII-safe (never leaves VPS). Tiny. Fast. Basic tasks only. |
Sonnet/Haiku via Anthropic API key (not Max) give the same capabilities as Max: full file editing, bash, tool use, agentic tasks. OpenRouter and local models receive translated requests via LiteLLM but may garble tool calls. Best use for non-Claude models: chat, analysis, code review. Don't expect complex multi-file edits. Gemini is the best non-Claude option for tool use.
# Claude Max (default — current behavior)
claude --model sonnet --dangerously-skip-permissions
# Anthropic API (pay-per-token, full tool use, identical behavior)
ANTHROPIC_API_KEY=<key> claude --model claude-sonnet-4-5-20250929 --dangerously-skip-permissions
# LiteLLM → OpenRouter / Local (experimental tool use)
ANTHROPIC_BASE_URL=http://127.0.0.1:4000 \
ANTHROPIC_API_KEY=sk-forge-litellm-local \
claude --model gemini-2-flash --dangerously-skip-permissions
All three run in the same tmux window. The dashboard restart endpoint just sends different env vars + model flag. State impact = same as clicking Restart today.
dashboard.ts — add GET /api/models endpoint returning full model list with billing type metadata.
dashboard.ts — update POST /api/claude-restart to accept all model IDs and launch with correct env vars per billing type (Max / API key / LiteLLM).
web-terminal.tsx — replace 3 hardcoded <option> tags with a dynamic fetch from /api/models (matching the endpoint added in dashboard.ts), rendered as <optgroup> sections by billing type. Default selection = sonnet.
/opt/forge/config/litellm.yaml — add Gemini 2.0 Flash and Gemini 2.5 Pro via OpenRouter. Restart LiteLLM service.
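A sketch of the litellm.yaml additions, using LiteLLM's `model_list` syntax (the exact OpenRouter slugs are assumptions; verify them against the OpenRouter model list):

```yaml
model_list:
  - model_name: gemini-2-flash
    litellm_params:
      model: openrouter/google/gemini-2.0-flash-001   # verify slug on OpenRouter
      api_key: os.environ/OPENROUTER_API_KEY
  - model_name: gemini-2.5-pro
    litellm_params:
      model: openrouter/google/gemini-2.5-pro         # verify slug on OpenRouter
      api_key: os.environ/OPENROUTER_API_KEY
```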
Model selector shows grouped optgroups: Claude Max / Anthropic API / OpenRouter / Local. Default is Sonnet 4.6. Selecting "Haiku 4.5 (API)" and restarting uses Anthropic API key with full tool use. Selecting DeepSeek V3 and restarting connects via LiteLLM. Both work in the same tmux terminal window.
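The optgroup grouping can be sketched with jq against a mock payload (the JSON shape here is an assumption, not the real /api/models response):

```shell
cat > /tmp/models.json <<'EOF'
[
  {"id": "sonnet",         "label": "Sonnet 4.6",       "billing": "MAX"},
  {"id": "haiku-4-5-api",  "label": "Haiku 4.5 (API)",  "billing": "API_KEY"},
  {"id": "gemini-2-flash", "label": "Gemini 2.0 Flash", "billing": "OPENROUTER"}
]
EOF
# One line per billing group, mirroring the <optgroup> sections
jq -r 'group_by(.billing)[] | .[0].billing + ": " + (map(.label) | join(", "))' /tmp/models.json
```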
MasteryBook points at Gemini directly (bad key or Gemini CLI not installed). Fix: redirect to LiteLLM → gemini-2-flash via OpenRouter. Same model, no new key needed, uses OpenRouter key already in place. MasteryBook never breaks on Gemini API changes again — LiteLLM handles fallbacks.
Find MasteryBook + diagnose:
find /opt/forge -name "*mastery*" 2>/dev/null | grep -v node_modules
In MasteryBook config/env, change to LiteLLM endpoint:
OPENAI_BASE_URL=http://localhost:4000/v1
OPENAI_API_KEY=sk-forge-litellm-local
MODEL=gemini-2-flash
If MasteryBook uses a Gemini-specific SDK, replace with openai-compatible client pointed at LiteLLM.
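Before switching MasteryBook over, smoke-test the endpoint it will use (key and port are the values above; /v1/models is LiteLLM's OpenAI-compatible model listing):

```shell
# Lists the models LiteLLM exposes; falls back to a message if the proxy is down
curl -s -H "Authorization: Bearer sk-forge-litellm-local" \
  http://localhost:4000/v1/models \
  || echo "LiteLLM not reachable on :4000"
```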
MasteryBook generates content without errors. Logs show requests hitting LiteLLM → OpenRouter → Gemini successfully.
Once gemini-2-flash is in LiteLLM (Item 2, Step 4), image generation is just a matter of selecting that model and asking for an image. Add a gemini-2-flash-image alias pointing to openrouter/google/gemini-2.0-flash-exp:free (the experimental variant with image gen). Select it in the terminal model dropdown and ask Claude to generate an image; the response includes a base64 PNG.
Bonus: add /image [prompt] slash command to Forge terminals that calls this model directly, saves to /tmp/forge-image.png, and optionally publishes to a NowPage.
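A sketch of that helper, written to a file rather than run here (assumptions: LiteLLM on :4000, a gemini-2-flash-image alias, and that the model returns the image as bare base64 in the message content — verify the actual response shape before relying on this):

```shell
cat > /tmp/forge-image.sh <<'EOF'
#!/bin/sh
# Usage: forge-image.sh "a lighthouse at dusk"
PROMPT="$1"
curl -s http://127.0.0.1:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-forge-litellm-local" \
  -H "Content-Type: application/json" \
  -d "{\"model\": \"gemini-2-flash-image\", \"messages\": [{\"role\": \"user\", \"content\": \"$PROMPT\"}]}" \
  | jq -r '.choices[0].message.content' \
  | base64 -d > /tmp/forge-image.png
echo "wrote /tmp/forge-image.png"
EOF
chmod +x /tmp/forge-image.sh
```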
Selecting Gemini image model in terminal and typing "generate an image of X" returns a usable PNG. Works on vacation via Tailscale exit node.
~2.5 hrs total. Items 1+2 alone = vacation-safe + secure.
Item 1 is 20 minutes and makes everything else work securely from anywhere.