Sterling Replication Guide

Section 1

What Is Sterling?

Sterling is five layers working together. Each layer serves a specific role, and replication depends on which layers you need.

🧠

Claude Code — The Brain

AI reasoning engine, CLI tool running Claude models

Claude Opus / Sonnet, `claude -p` headless mode, Tool use & planning, Multi-step reasoning
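To illustrate headless mode, a scheduler or bridge process could shell out to `claude -p` and capture the reply. This is a minimal sketch, not Sterling's actual code: the helper names `build_headless_command` and `run_headless` are hypothetical, and it assumes the `claude` CLI is installed and already authenticated on the host.

```python
import shutil
import subprocess


def build_headless_command(prompt: str, output_format: str = "text") -> list[str]:
    """Build the argv for a one-shot, non-interactive Claude Code call."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    # -p (print mode) runs the prompt once and exits instead of opening the REPL.
    return ["claude", "-p", prompt, "--output-format", output_format]


def run_headless(prompt: str, timeout_s: int = 300) -> str:
    """Run the prompt and return stdout. Assumes `claude` is on PATH."""
    if shutil.which("claude") is None:
        raise RuntimeError("claude CLI not found on PATH")
    result = subprocess.run(
        build_headless_command(prompt),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

Keeping command construction separate from execution makes the argument list easy to inspect or log before anything runs.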

⚙️

Cyndra — The Body

Docker container, Telegram bridge, IPC messaging, task scheduler

Docker Container, Telegram Bot Bridge, IPC Messaging, Cron Scheduler, Background Task Runner, Host Exec (Mac Mini)

🛠️

MCP Servers — The Hands

Model Context Protocol tools connecting Sterling to external platforms

Google Workspace, Monday.com, HubSpot, Browser Agent, Metricool, VidIQ, LinkedIn, WhatsApp, Stripe, Cloudflare, 40+ more

🗃️

Persistent Storage — The Memory

Files that survive between sessions and after context window compaction

CLAUDE.md (protocols), Skill Files, Session Notes, Filing Cabinet Index, Client Intelligence Files, Google Docs (lockstep)

🤝

Coworker Agents — The Team

Background sub-agents Sterling spawns for parallel work

Background Research, GChat Scans, Content Builds, Client File Enrichment, Parallel Task Execution

Section 2

Billing Paths

Four ways to authenticate and pay for Claude agent usage. Each comes with different capabilities.

Feature	Anthropic Max (Personal)	Anthropic Team Plan	API (Pay-as-you-go)	Google Vertex AI
Cost	$200/mo (Max 20x)	$25-125/seat/mo (5 seat min)	Per-token ($3-15/MTok Sonnet, $5-25/MTok Opus)	Per-token (GCP pricing)
Auth method	OAuth (browser login)	OAuth (team login)	API key (ANTHROPIC_API_KEY)	GCP service account
Claude Code support	✓ Full	✓ Full	✓ (loses some features)	✗ No Claude Code
`claude -p` headless	✓	✓	✓ (--bare mode)	✗
setup-token for agents	✓	✓	N/A	N/A
MCP tools	✓ Native	✓ Native	✓ Via Agent SDK	✗ Must rebuild
Coworker agents	✓	✓	✗	✗
Persistent memory	✓ (CLAUDE.md, files)	✓ (CLAUDE.md, files)	✗ Must build	✗ Must build
Usage pool	Shared with claude.ai (separate headless pool post June 15)	Independent per seat	Pay per token, no limits	Pay per token, no limits
Telegram interface	Via Cyndra	Via Cyndra	Custom build needed	Custom build needed
Best for	Brad's personal Sterling	Dedicated agent seats with isolated pools	High-volume / scale deployments	Google ecosystem integration
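To compare a flat subscription with per-token billing, it can help to estimate monthly API spend from expected token volume. The sketch below uses the rates from the table. Reading the low figure in each range as the input rate and the high figure as the output rate is an assumption, and the function names are illustrative only.

```python
# USD per million tokens, taken from the table's ranges.
# Assumption: low figure = input rate, high figure = output rate.
RATES_PER_MTOK = {
    "sonnet": {"input": 3.00, "output": 15.00},
    "opus": {"input": 5.00, "output": 25.00},
}
MAX_20X_MONTHLY_USD = 200.00  # Max 20x price from the table


def api_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated pay-as-you-go cost for one month's token volume."""
    rate = RATES_PER_MTOK[model]
    total = input_tokens * rate["input"] + output_tokens * rate["output"]
    return total / 1_000_000


def cheaper_path(model: str, input_tokens: int, output_tokens: int) -> str:
    """Return "API" if estimated API spend is below the Max price, else "Max"."""
    cost = api_cost_usd(model, input_tokens, output_tokens)
    return "API" if cost < MAX_20X_MONTHLY_USD else "Max"


# 20M input + 4M output tokens on Sonnet: 20 * 3 + 4 * 15 = 120 USD
print(api_cost_usd("sonnet", 20_000_000, 4_000_000))  # 120.0
# 30M input + 6M output tokens on Opus: 30 * 5 + 6 * 25 = 300 USD
print(cheaper_path("opus", 30_000_000, 6_000_000))    # Max
```

This compares cost only. It ignores the capability differences in the table, such as coworker agents and persistent memory, and the usage limits that apply to subscription plans.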


Section 3

Replication Paths

Four approaches to building an agent like Sterling, each with different trade-offs in capability, cost, and deployment time.

Path A: Cyndra Clone (Recommended)

How: Clone Cyndra container template, new Telegram bot token, fresh CLAUDE.md
Auth: OAuth via Max or Team seat
Billing: Max subscription or Team seat (included in monthly plan)
Deploy time: ~30 minutes
What you get: Full Sterling experience: Telegram interface, MCP tools, persistent memory, scheduled tasks, coworker agents

Trade-off: Shares usage pool (Max) or needs a dedicated seat (Team)

Best for replicating Sterling for Brad or specific OA team members

Path B: Agent SDK + Custom Bot

How: Build with Anthropic Agent SDK (TypeScript/Python) + Telegram bot library
Auth: API key only (ANTHROPIC_API_KEY)
Billing: Pay-as-you-go API credits (cannot use Max/Team subscription)
Deploy time: 2-3 hours (build time when Sterling does the work)
What you get: Custom agent with tool use, flexible deployment

Trade-off: No Claude Code features (MCP, coworkers, persistent memory must be rebuilt). Pay-per-token billing.

Best for productized agents for customers and Maximize My VA subscribers
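Because this path has to rebuild persistent memory, a first step could be a small file-backed store that keeps each chat's history across restarts. The sketch below is an assumption about how that might look, not part of Cyndra or the Agent SDK: `FileMemory` and `handle_message` are hypothetical names, and `ask_model` stands in for whatever function calls the model.

```python
import json
from pathlib import Path
from typing import Callable


class FileMemory:
    """Per-chat history persisted as JSON so it survives process restarts."""

    def __init__(self, root: Path):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, chat_id: int) -> Path:
        return self.root / f"chat_{chat_id}.json"

    def load(self, chat_id: int) -> list[dict]:
        path = self._path(chat_id)
        if not path.exists():
            return []
        return json.loads(path.read_text(encoding="utf-8"))

    def append(self, chat_id: int, role: str, text: str) -> None:
        history = self.load(chat_id)
        history.append({"role": role, "content": text})
        self._path(chat_id).write_text(json.dumps(history, indent=2), encoding="utf-8")


def handle_message(
    memory: FileMemory,
    chat_id: int,
    text: str,
    ask_model: Callable[[list[dict]], str],
) -> str:
    """Record the user message, ask the model with full history, record the reply."""
    memory.append(chat_id, "user", text)
    reply = ask_model(memory.load(chat_id))
    memory.append(chat_id, "assistant", reply)
    return reply
```

A Telegram bot library's message callback would call `handle_message` with the chat ID and text. Rewriting the whole file on every message is fine for a sketch but would need revisiting for long histories or concurrent writers.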

Path C: Claude via Google Vertex AI

How: Use Claude models through Vertex AI API, build agent on Google ADK or Agent SDK
Auth: GCP service account / Vertex AI credentials
Billing: GCP billing (per-token, billed through Google)
Deploy time: 4-6 hours
What you get: Google enterprise security, monitoring, Cloud Run hosting, Google ecosystem integration

Trade-off: No Claude Code at all. You lose MCP, coworkers, persistent memory, and the Telegram bridge, and must rebuild each of them. Gemini Agent Builder is a separate path.

Best for enterprise deployments needing Google security/compliance wrapper

Path D: Gemini Agent Builder (Google ADK)

How: Use Google's Agent Development Kit in Vertex AI Agent Builder
Auth: GCP / Google Workspace Enterprise
Billing: GCP consumption (included in some Enterprise tiers, or per-use)
Deploy time: Varies
What you get: Native Google integration, data store connectors (Gmail, Drive, Calendar already connected for OA), enterprise admin controls

Trade-off: Uses Gemini models (not Claude brain). Different capability profile. Less coding ability, stronger Google integration.

Best for Google-native workflows, internal OA tools needing Google data access

Section 4

Decision Tree

Answer three questions to find the right path. Start with who the agent is for.

Who is this agent for?

Brad personally

Path A — Cyndra clone under Max

OA team member

Path A — Cyndra clone under Team seat

OA client

Path B (Agent SDK) or D (Gemini if Google-native)

Maximize My VA subscriber

Path B — DIY Agent SDK build

Enterprise / compliance

Path C (Vertex) or D (Gemini)

What billing model works?

Stay under existing monthly plans

Path A — Max or Team OAuth

Per-use / scalable / no cap

Path B or C — API credits

Google billing preferred

Path C or D — GCP consumption

Need Sterling's full capabilities?

Yes — MCP, memory, Telegram, coworkers, scheduled tasks

Path A only

Partial — AI + chat interface is enough

Path B, C, or D

Google-native tools primarily

Path D — Gemini Agent Builder
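The three questions above can be sketched as a lookup that intersects the paths each answer allows. Combining the answers by intersection is an assumption, since the guide lists the questions independently. The answer keys and the `recommend` function are illustrative names.

```python
# Paths allowed by each answer, transcribed from the decision tree.
AUDIENCE = {
    "brad": {"A"},
    "team_member": {"A"},
    "client": {"B", "D"},
    "subscriber": {"B"},
    "enterprise": {"C", "D"},
}
BILLING = {
    "monthly_plan": {"A"},
    "per_use": {"B", "C"},
    "google": {"C", "D"},
}
CAPABILITIES = {
    "full": {"A"},
    "partial": {"B", "C", "D"},
    "google_native": {"D"},
}


def recommend(audience: str, billing: str, capabilities: str) -> list[str]:
    """Return the paths consistent with all three answers (empty if they conflict)."""
    paths = AUDIENCE[audience] & BILLING[billing] & CAPABILITIES[capabilities]
    return sorted(paths)


print(recommend("brad", "monthly_plan", "full"))    # ['A']
print(recommend("client", "google", "partial"))     # ['D']
print(recommend("subscriber", "google", "partial")) # []
```

An empty result means the answers pull in different directions, for example a subscriber build (Path B) with Google billing (Path C or D), so one requirement has to give.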
