ogcode prasenjeet-symon
winget install --id=prasenjeet-symon.ogcode -e
ogcode is an agentic coding assistant with a web UI that helps you write, review, and manage code.
The token-efficient agentic coding workbench.
Built for a future where every token counts. Ogcode curates the relevant context for each turn — not the full transcript — so it cuts 70%+ of tokens on long sessions, sharpens accuracy, and lets even lower-end models outperform frontier ones. And because it recalls instead of replays, your conversations run effectively forever — you never hit a model's context limit, on any model, frontier or local. All while planning with you, remembering your codebase, and shipping features in parallel from a single binary that never leaves your machine.

> Every other coding agent resends your entire conversation history on every turn. Ogcode doesn't.
Most coding agents operate on a naive replay loop: each turn, they bundle up the full transcript so far — every prior message, every tool result, every tangent — and ship it back to the model. That has two costs, and only one of them is money.
1. It burns tokens. The prompt grows linearly with the session, so a 200-message task can cost 5× more than a 20-message one even if the new work is trivial. On a fixed monthly budget this caps how much you can actually ship.
2. It hurts accuracy — and this matters more than the money. An LLM can only act on what's in its context window. When you flood that window with stale, unrelated chatter from earlier in the session, the signal gets buried in noise: the model loses sight of the current task, drifts toward half-remembered earlier decisions, and reasons against context that was relevant then but isn't relevant now. The older the conversation, the more the historical turns actively distract from the turn in front of the model.
Ogcode does the opposite. For each turn it extracts only the context that is actually relevant to the task at hand — pulling precise facts from a persistent knowledge graph and call graph via memory_recall, fetching code-structure context on demand, and compacting stale history instead of replaying it verbatim. The model receives a short, sharp, on-point context window. Less history, fewer tokens — and better outcomes, because the model isn't wading through a hundred old messages to find the three facts it needs right now.
Saving tokens isn't only about cost — it's about accuracy. A smaller, more relevant context window lets the model focus, so it produces more correct, more on-target results per turn. The two goals reinforce each other.
No other coding agent on the market does this. Claude Code, Cursor, Copilot, Aider — every one of them replays the full conversation every turn. Token efficiency, in those tools, is an afterthought at best. Ogcode is the only agent engineered, at the agent-loop level, to conserve tokens and to curate context per turn — because it believes the real lever is context engineering: how efficiently and how relevantly you prepare the context for a given task.
This is the deeper payoff. If context engineering only saved money, it would still be worth it, but it does more: it extracts capability even from lower-end models. Keep the context relevant, limited, and short, and a mid-tier model (Claude Sonnet, a local Llama, a smaller GPT) can reason just as clearly as, and sometimes outperform, a frontier model that's been handed a bloated, noisy transcript. The frontier model isn't smarter about your code; it just has more raw capacity to dig itself out of the irrelevant history you buried it under. Give either model a clean, on-point context window and the gap narrows dramatically, often to zero.
In the end, it's all about context engineering. Ogcode is brilliant at this, which is why it simultaneously cuts token cost and increases the accuracy of the task outcome. Cheaper and better — not a tradeoff.
> Every model has a context window. Every other agent eventually slams into it. With Ogcode you never do — chat forever, on any model, no matter how small its window.
This is the part that genuinely changes the game. Every LLM ships with a fixed context limit — 8K, 128K, 200K, a million tokens — and every other coding agent creeps toward that wall as the session grows, until the model starts dropping the start of the conversation or simply refuses to continue. Ogcode removes the ceiling entirely.
The reason is Agentic Session Memory. Because Ogcode recalls the few facts relevant to the current turn from a persistent knowledge graph — instead of replaying the entire transcript — the prompt it sends stays flat no matter how long the conversation runs. A session that's 50 messages deep and one that's 5,000 messages deep hand the model the same compact, on-point context window. The conversation is unbounded; the per-turn context is not. So you can talk to a model forever and never reach its limit — and that holds for any model, whatever the size of its native window.
Never hit a context limit, spend far fewer tokens, and get more accurate results — on whatever model you choose. That is the true beauty of Agentic Session Memory. (For the knowledge-graph internals, see Agentic Session Memory.)
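To make the flat-prompt claim concrete, here is a minimal sketch that compares the per-turn prompt size of a full-replay loop with a recall-based loop. It is not Ogcode's implementation: the 500 tokens per message, the 4,000-token recall budget, and all function names are illustrative assumptions.

```python
from typing import Callable, Optional

# Illustrative assumptions, not measured Ogcode values.
TOKENS_PER_MESSAGE = 500   # average size of one message
RECALL_BUDGET = 4_000      # fixed per-turn budget for recalled facts


def replay_prompt_tokens(turn: int) -> int:
    """Full replay: the prompt carries every message sent so far."""
    return turn * TOKENS_PER_MESSAGE


def recall_prompt_tokens(turn: int) -> int:
    """Recall: the newest message plus a bounded set of recalled facts."""
    return TOKENS_PER_MESSAGE + RECALL_BUDGET


def first_overflow_turn(
    prompt_tokens: Callable[[int], int], window: int, max_turns: int = 100_000
) -> Optional[int]:
    """Return the first turn whose prompt exceeds the window, or None."""
    for turn in range(1, max_turns + 1):
        if prompt_tokens(turn) > window:
            return turn
    return None


if __name__ == "__main__":
    for window in (8_000, 128_000, 200_000):
        print(
            window,
            first_overflow_turn(replay_prompt_tokens, window),
            first_overflow_turn(recall_prompt_tokens, window),
        )
```

Under these assumptions the replay loop overflows an 8K window at turn 17 and a 128K window at turn 257, while the recall loop's prompt stays at 4,500 tokens on every turn and never overflows either window.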
> Other agents suggest code. Ogcode plans the feature, decomposes it, executes the pieces in parallel, and raises the pull requests for you.
Most coding agents stop at "here's the code for the file you asked about." Ogcode's Plan Mode turns a one-line goal into a shipping feature. You describe what you want to build; the planning agent reads your codebase, discusses the approach with you, and — once you lock the plan — it becomes Ogcode's responsibility to break that feature into smaller, implementation-ready tasks, run them, and open the pull requests.
┌──────────────┐   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐
│ 1. Describe  │ → │ 2. Lock      │ → │ 3. Review    │ → │ 4. Execute   │
│ your goal    │   │ the plan     │   │ Kanban board │   │ in parallel  │
└──────────────┘   └──────────────┘   └──────────────┘   └──────┬───────┘
                                                                │
                   ┌──────────────┐   ┌──────────────┐   ┌──────▼───────┐
                   │ 6. Retry     │ ← │ 5. Complete  │ ← │ Task runs    │
                   │ if needed    │   │ auto-PR      │   │ in isolated  │
                   └──────────────┘   └──────────────┘   │ git branch   │
                                                         └──────────────┘
The breakdown is deliberate about parallelism and conflict-free merges — this is the part most agents get wrong and Ogcode gets right:
The payoff is end-to-end parallel feature completion: you plan one feature, and Ogcode ships it as a set of clean, reviewable pull requests that don't fight each other. It does this because git is baked into the Ogcode core, not bolted on as an afterthought:
git worktree: a real, independent working checkout on its own branch. Multiple agents work simultaneously without clobbering each other's files or your main branch.
origin, and opens a pull request via the gh CLI, idempotently (re-running won't create duplicate PRs), and with a generated PR body.
In short: you describe the feature, Ogcode plans it, splits it into parallel-safe tasks, executes them across isolated branches, and opens the PRs directly against your upstream GitHub repo, with no merge conflicts and no manual git wrangling. That's parallel task completion and parallel feature completion, from a single binary that manages the GitHub repo and its git worktrees natively because git is part of the Ogcode core.
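The worktree-per-task and idempotent-PR flow described above can be sketched with standard `git` and `gh` commands. This is an illustration under stated assumptions, not Ogcode's internal code: the `task/<id>` branch naming, the `.worktrees` directory, and the helper names are hypothetical.

```python
import json
import subprocess
from typing import Callable, List

Runner = Callable[[List[str]], str]


def run(cmd: List[str]) -> str:
    """Run a command and return its stdout; raises if the command fails."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def worktree_cmd(task_id: str, root: str = ".worktrees") -> List[str]:
    """Build the command that creates an isolated checkout on a new branch."""
    branch = f"task/{task_id}"  # hypothetical naming scheme
    return ["git", "worktree", "add", "-b", branch, f"{root}/{task_id}"]


def ensure_pr(branch: str, title: str, body: str, runner: Runner = run) -> bool:
    """Open a PR for the branch unless one is already open.

    Returns True if a PR was created, False if one already existed.
    """
    listing = runner(["gh", "pr", "list", "--head", branch, "--json", "number"])
    if json.loads(listing or "[]"):
        return False  # a PR already exists, so re-running is a no-op
    runner(["git", "push", "-u", "origin", branch])
    runner(["gh", "pr", "create", "--head", branch, "--title", title, "--body", body])
    return True
```

Passing the command runner as a parameter keeps the idempotency check testable without touching a real repository: a second call to `ensure_pr` for the same branch sees the existing PR in the `gh pr list` output and does nothing.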
Ogcode is an agentic coding assistant that runs entirely on your machine — a single Go binary with an embedded SolidJS web UI. It doesn't just suggest code. It understands your whole codebase, plans complex features with you, and executes them across parallel git branches — while your code stays local and private.
Unlike IDE-locked assistants (Cursor, Copilot) or cloud-only services (Claude Code), Ogcode is browser-native, self-hosted, and model-agnostic. Use Claude, GPT, OpenRouter, or local Ollama models, and switch anytime from the UI. No subscriptions. No vendor lock-in. Nothing leaves your machine except the prompts you send to your chosen provider.
curl -fsSL http://ogcode.xyz/install.sh | sh && ogcode
| | Ogcode | Cursor | Claude Code | Copilot | Aider |
|---|---|---|---|---|---|
| Interface | Web UI (any editor) | VS Code fork | Terminal | IDE extension | Terminal |
| Self-Hosted | Single binary, zero deps | Cloud-required | Cloud-only | Cloud-only | Open source |
| Parallel Tasks | Git worktrees, conflict-free auto-PRs to upstream | Cloud agents | Subagents | Single agent | Sequential |
| Plan Mode | Kanban + effort estimates | Agents window | Architect mode | Prompt-based | /architect |
| Persistent Memory | Knowledge graph + Call graph | Session-only | CLAUDE.md | None | None |
| Token Efficiency | Loop-level optimization, ~70% saved, higher accuracy | No | No | No | No |
| Model Choice | Claude, GPT, OpenRouter, Ollama | Built-in + custom | Claude only | MS-managed | Any endpoint |
| Cost | BYOK (tokens only) | $20–$40/mo | $20–$100/mo | $19–$39/mo | Free (BYOK) |
| License | MIT | Proprietary | Proprietary | Proprietary | Apache-2.0 |
Ogcode is the only agentic coding assistant that combines a browser-native UI (works with Vim, Emacs, VS Code, JetBrains, or any editor), a formal Plan Mode with a visual Kanban board and parallel execution that raises conflict-free PRs directly against your upstream GitHub repo, git-native parallel execution that gives every task its own isolated worktree branch with auto-commits and auto-PRs, a persistent knowledge graph for long-term memory, loop-level token optimization that keeps long-session token use over 70% lower than naive replay and sharpens per-turn accuracy, context engineering that lets lower-end models match or beat frontier models on clean context, and single-binary self-hosting with zero cloud dependencies.
deep_search tool searches the web, fetches pages, and synthesizes cited research for your agent.
# macOS / Linux: one-line install
curl -fsSL http://ogcode.xyz/install.sh | sh
# Set your API key (or use Ollama for local models)
export ANTHROPIC_API_KEY=sk-ant-...
# Start coding
ogcode
Opens at http://localhost:9595. That's it — no config files, no Docker, no IDE extension.
Cursor is great, but it's a fork of VS Code. If you use Vim, Emacs, or JetBrains, you're out of luck — and its parallel agents run in the cloud, not on your machine.
Claude Code is powerful, but it's terminal-only, Anthropic-only, and cloud-only. No web UI, no plan mode, no persistent knowledge graph.
GitHub Copilot is everywhere, but it's a Microsoft service. Your code analysis happens in the cloud, with no parallel execution and no formal planning.
Aider is excellent, but it's terminal-only and sequential, with no persistent memory graph or visual planning board.
Ogcode gives you what none of the above do:
> Nobody else is thinking about token optimization. They burn tokens like water — Claude Code, OpenCode, GitHub Copilot, every coding agent out there — because none of them are designed, at the core loop level, to actually conserve tokens. And because they resend the whole transcript every turn, they also hand the model a noisier, less accurate context window.
As the cost of using frontier AI climbs with every intelligence leap, tokens are becoming a budgeted resource. In the near future, a team — or a solo developer — will have a fixed monthly token allowance and have to ship software, features, and fixes within it. Ogcode is built for that future: token efficiency is designed into the agent loop itself, not bolted on after the fact — and it doubles as an accuracy win, because the model reasons over a curated, on-point context window instead of a wall of stale chat.
| Mechanism | What it does | Token impact |
|---|---|---|
| Agentic Session Memory | Replaces "send the whole conversation every turn" with a knowledge graph that returns only the facts relevant to the current query. | Largest single saving — grows with session length |
| Call Graph recall | Pulls in code-structure context on demand instead of re-reading source files into the prompt. | Avoids re-sending large file contents |
| Context compaction | Summarizes stale history instead of replaying it verbatim, with truncation as a fallback. | Caps prompt size on long sessions |
| Targeted memory_recall | The agent retrieves precise historical facts (config values, past decisions) rather than re-deriving them by re-reading code. | Fewer exploration turns |
In real session testing, Ogcode saves over 70% of tokens on long-running sessions versus a naive full-replay loop — meaning a fixed monthly budget goes further, a team stays under its limit, and frontier-model cost increases hurt less. And because the model sees only the relevant facts for the current turn rather than the entire transcript, task accuracy improves at the same time: less drift, fewer half-remembered earlier decisions, more on-target results.
| Session Length | Traditional (full replay) | With Ogcode memory | Savings |
|---|---|---|---|
| 50 messages | ~25K tokens | ~8K tokens | 68% |
| 200 messages | ~100K tokens | ~28K tokens | 72% |
| 1000 messages | ~500K tokens | ~120K tokens | 76% |
Enable it with:
export OGCODE_AGENTIC_MEMORY_MODE=true
See Agentic Session Memory for the technical deep dive.
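The Savings column in the session-length table above follows directly from the two token columns. A short sketch of that arithmetic, using only the figures from the table (the function name is an illustrative choice):

```python
# Token counts (in thousands) taken from the session-length table:
# messages -> (full replay, with Ogcode memory)
SESSIONS = {
    50: (25, 8),
    200: (100, 28),
    1000: (500, 120),
}


def savings_percent(full_replay: float, with_memory: float) -> int:
    """Share of tokens saved relative to full replay, rounded to a whole percent."""
    return round(100 * (1 - with_memory / full_replay))


if __name__ == "__main__":
    for messages, (replay, memory) in SESSIONS.items():
        print(f"{messages} messages: {savings_percent(replay, memory)}% saved")
```

The computed values are 68%, 72%, and 76%, matching the table.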
| Requirement | Minimum |
|---|---|
| Operating System | macOS, Linux, or Windows |
| Go | 1.22+ (for go install only) |
| Git | 2.34+ (required for worktree support) |
| CPU | Any modern x86_64 or arm64 processor |
| Memory | 512 MB free RAM |
An LLM API key or a local Ollama installation is required. See Configuration.
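Before installing, it can help to confirm that the local git meets the 2.34 minimum listed in the requirements table. The following is a small sketch of such a check; the helper names are illustrative, and it assumes `git --version` prints its usual `git version X.Y.Z` line.

```python
import re
import subprocess
from typing import Tuple

MIN_GIT = (2, 34)  # minimum from the requirements table


def parse_git_version(output: str) -> Tuple[int, int]:
    """Extract (major, minor) from `git --version` output."""
    match = re.search(r"git version (\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"unrecognized git version output: {output!r}")
    return int(match.group(1)), int(match.group(2))


def meets_git_minimum(output: str) -> bool:
    """True when the reported git version is at least MIN_GIT."""
    return parse_git_version(output) >= MIN_GIT


if __name__ == "__main__":
    out = subprocess.run(
        ["git", "--version"], capture_output=True, text=True, check=True
    ).stdout
    print("ok" if meets_git_minimum(out) else "git is older than 2.34")
```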
Via Homebrew (recommended):
brew tap prasenjeet-symon/ogcode
brew install ogcode
Via curl (one-liner):
curl -fsSL http://ogcode.xyz/install.sh | sh
Auto-detects your platform, downloads the latest release, and installs to /usr/local/bin.
Via PowerShell (one-liner):
irm http://ogcode.xyz/install.ps1 | iex
Downloads the latest release, extracts to %LOCALAPPDATA%\ogcode, and adds it to your PATH.
Via winget:
winget install prasenjeet-symon.ogcode
Via go install:
go install github.com/prasenjeet-symon/ogcode@latest
Via Docker:
docker run -p 9595:9595 -v "$(pwd)":/workspace -w /workspace ghcr.io/prasenjeet-symon/ogcode:latest
Ogcode auto-detects available AI providers from environment variables. No config files required.
Set at least one API key (or use Ollama):
| Variable | Provider |
|---|---|
| ANTHROPIC_API_KEY | Anthropic (Claude) |
| OPENAI_API_KEY | OpenAI (GPT) |
| OPENROUTER_API_KEY | OpenRouter |
| OLLAMA_BASE_URL | Ollama (local / cloud URL) |
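To check which of the variables in the table above are set in your shell before launching, a minimal sketch like the following can help. It only mirrors the table; how Ogcode itself performs detection is not shown here, and the function name is an assumption. It also does not cover the case where Ollama is picked up from a local installation without `OLLAMA_BASE_URL`.

```python
import os
from typing import List, Mapping

# Variable names and provider labels come from the table above.
PROVIDER_ENV_VARS = {
    "ANTHROPIC_API_KEY": "Anthropic (Claude)",
    "OPENAI_API_KEY": "OpenAI (GPT)",
    "OPENROUTER_API_KEY": "OpenRouter",
    "OLLAMA_BASE_URL": "Ollama",
}


def detect_providers(env: Mapping[str, str]) -> List[str]:
    """Return the providers whose variable is set to a non-empty value."""
    return [
        name
        for var, name in PROVIDER_ENV_VARS.items()
        if env.get(var, "").strip()
    ]


if __name__ == "__main__":
    found = detect_providers(os.environ)
    print("Configured providers:", ", ".join(found) if found else "none")
```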
# macOS / Linux — auto-detected if ollama is installed
ollama serve
ogcode
# Or explicit on any OS:
export OLLAMA_BASE_URL=http://localhost:11434/v1
ogcode
Available models: qwen3, codellama, llama3.1, deepseek-coder-v2, mistral, and any model you've pulled.
Enable infinite-context memory across sessions:
export OGCODE_AGENTIC_MEMORY_MODE=true
Give your agent the ability to research current documentation:
export OGCODE_SEARCH_ENABLED=true
ogcode
Chat with the agent — ask it to read files, write code, run commands, or search the codebase.
ogcode plan
Describe what you want to build. The planning agent reads your codebase, discusses the approach, and breaks it into tasks with dependencies and effort estimates. Lock the plan, and the tasks become a Kanban board you can execute in parallel. Completed plans are archived as markdown in .ogcode/archives/.
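One way to picture how tasks with dependencies turn into parallel work is to group them into waves, where every task in a wave has all of its dependencies finished. The sketch below uses Python's standard `graphlib` module (3.9+); the plan contents and the `parallel_waves` name are hypothetical, and Ogcode's own scheduler may work differently.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def parallel_waves(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into waves; each task in a wave has all its dependencies done."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the plan has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks that can run right now
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished
    return waves


if __name__ == "__main__":
    plan = {  # hypothetical plan: task -> tasks it depends on
        "db-schema": set(),
        "api-endpoints": {"db-schema"},
        "ui-form": set(),
        "e2e-tests": {"api-endpoints", "ui-form"},
    }
    for number, wave in enumerate(parallel_waves(plan), start=1):
        print(number, wave)
```

With this hypothetical plan, `db-schema` and `ui-form` run together in the first wave, `api-endpoints` runs in the second, and `e2e-tests` runs last.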
ogcode -p 3000
ogcode plan -p 3000
Ogcode is just an HTTP server — host it on a remote machine and reach it from any browser.
docker run -p 9595:9595 \
  -v ~/.ogcode:/root/.ogcode \
  -v "$(pwd)":/workspace -w /workspace \
  ghcr.io/prasenjeet-symon/ogcode:latest
Then open http://your-server:9595 from any browser.
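# nginx config
server {
    listen 443 ssl;
    server_name ogcode.yourdomain.com;
    # Placeholder certificate paths; point these at your own certificate and key
    ssl_certificate     /etc/nginx/ssl/ogcode.crt;
    ssl_certificate_key /etc/nginx/ssl/ogcode.key;
    location / {
        proxy_pass http://127.0.0.1:9595;
        proxy_http_version 1.1;                  # required for the Upgrade handshake
        proxy_set_header Host $host;
        proxy_set_header Upgrade $http_upgrade;  # WebSocket support
        proxy_set_header Connection "upgrade";
        proxy_buffering off;                     # deliver streamed (SSE) responses immediately
    }
}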
Access via https://ogcode.yourdomain.com — encrypted and clean.
services:
  ogcode:
    image: ghcr.io/prasenjeet-symon/ogcode:latest
    volumes:
      - ~/.ogcode:/root/.ogcode
    ports:
      - "127.0.0.1:9595:9595"  # localhost only; nginx handles public access
    restart: unless-stopped
Ogcode is a coding agent that can execute shell commands, read and write files, and modify your system. Never expose it to the public internet without authentication.
| Risk | Mitigation |
|---|---|
| Anyone can hit port 9595 | Bind to 127.0.0.1 + use a reverse proxy |
| No auth on the web UI | Add HTTP Basic Auth in nginx, or use a VPN |
| Full shell access via the agent | Run in a restricted environment (Docker, VM) |
Recommended approaches:
# On your laptop:
ssh -L 9595:localhost:9595 user@your-server
# Then open http://localhost:9595 in your browser
htpasswd -c /etc/nginx/.htpasswd your_username
Agents can produce rich visual content directly in the chat — not just plain text:
| Format | Syntax | Use For |
|---|---|---|
| Mermaid diagrams | ```mermaid | Flows, architectures, sequences, ER diagrams |
| LaTeX math | $...$ or $$...$$ | Mathematical formulas and equations |
| Plotly charts | ```plotly | Bar, line, scatter, pie, heatmap, and more |
| Rough diagrams | ```rough | Hand-drawn style 2D diagrams |
| HTML/CSS/JS | ```html | Interactive dashboards, styled tables, animations |
HTML blocks render in a sandboxed iframe — scripts run in isolation with no access to the parent page. The agent is given your viewport dimensions so it can design responsive content that fits your screen.

Traditional assistants send the entire conversation history to the LLM every turn — expensive, and quick to hit token limits. Ogcode's Agentic Session Memory extracts, stores, and retrieves only the context relevant to each query.
| Benefit | Impact |
|---|---|
| ~70% token savings | Drastically reduced API costs on long sessions |
| Infinite context | No practical limit on session length or codebase size |
| Higher accuracy | Only relevant memories are retrieved per query |
Ogcode maintains a persistent Topic → Concept → Fact hierarchy with vector embeddings, plus a function-level Call Graph for codebase navigation. This knowledge graph survives across sessions, so your agent remembers your codebase structure, your conventions, and your past decisions.
Topic: "Ogcode Authentication"
└─ Concept: "JWT Middleware"
└─ Fact: "Token validation lives in internal/auth/jwt.go:47"
└─ Fact: "Refresh tokens expire after 7 days (config: AUTH_REFRESH_TTL)"
└─ Concept: "OAuth Flow"
└─ Fact: "GitHub OAuth uses PKCE, implemented in internal/auth/oauth.go"
Enable it with:
export OGCODE_AGENTIC_MEMORY_MODE=true
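The Topic → Concept → Fact hierarchy and per-query recall can be illustrated with a small in-memory sketch. This is not Ogcode's storage format or ranking method: plain word overlap stands in for vector-embedding similarity, and the `Fact`, `tokens`, and `recall` names are assumptions made for the example. The facts themselves are the ones from the tree above.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Fact:
    topic: str
    concept: str
    text: str


def tokens(text: str) -> Set[str]:
    """Lowercase word set; a crude stand-in for a vector embedding."""
    return {word.strip(".,:?()\"'").lower() for word in text.split()} - {""}


def recall(facts: List[Fact], query: str, limit: int = 2) -> List[Fact]:
    """Return up to `limit` facts ranked by word overlap with the query."""
    query_words = tokens(query)
    scored = [
        (len(query_words & tokens(f"{fact.topic} {fact.concept} {fact.text}")), fact)
        for fact in facts
    ]
    # Drop facts with no overlap, then sort by score (highest first).
    ranked = sorted((item for item in scored if item[0] > 0), key=lambda item: -item[0])
    return [fact for _, fact in ranked[:limit]]


FACTS = [
    Fact("Ogcode Authentication", "JWT Middleware",
         "Token validation lives in internal/auth/jwt.go:47"),
    Fact("Ogcode Authentication", "JWT Middleware",
         "Refresh tokens expire after 7 days (config: AUTH_REFRESH_TTL)"),
    Fact("Ogcode Authentication", "OAuth Flow",
         "GitHub OAuth uses PKCE, implemented in internal/auth/oauth.go"),
]

if __name__ == "__main__":
    for fact in recall(FACTS, "How does the OAuth flow use PKCE?", limit=1):
        print(fact.text)
```

Only the matching fact is returned for the query, so the prompt carries one short line instead of the whole stored history. A query with no overlapping words returns an empty list.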
Ogcode is a single Go binary that embeds a SolidJS web UI and runs its own HTTP server.
┌─────────────┐     REST + SSE     ┌──────────────┐
│ Web UI      │ ◄────────────────► │ Go Server    │
│ (SolidJS)   │                    │ (port 9595)  │
└─────────────┘                    └──────┬───────┘
                                          │
                    ┌─────────────────────┼─────────────────────┐
                    ▼                     ▼                     ▼
             ┌─────────────┐       ┌─────────────┐       ┌──────────────┐
             │ Agent Loop  │       │ SQLite DB   │       │ LLM Provider │
             │ (Claude,    │       │ (workspace  │       │ (Anthropic,  │
             │ GPT, etc.)  │       │ + config)   │       │ OpenAI, ...) │
             └─────────────┘       └─────────────┘       └──────────────┘
                    │
                    ▼
             ┌─────────────┐       ┌─────────────┐       ┌──────────────┐
             │ Knowledge   │       │ Call        │       │ Search       │
             │ Graph       │       │ Graph       │       │ Bridge       │
             │ (Memory)    │       │ (Code Rel)  │       │ (Web/JS)     │
             └─────────────┘       └─────────────┘       └──────────────┘
| Component | Responsibility |
|---|---|
| Agent Loop | Streaming LLM chat with tool execution (bash, read, write, edit, glob, grep, memory_recall, callgraph, deep_search) |
| Session Store | SQLite database for conversations, plans, tasks, and permissions |
| Git Worktrees | An isolated branch per task, so multiple agents work in parallel |
| Knowledge Graph | Persistent semantic memory with vector embeddings |
| Call Graph | Function-level code relationship tracking |
| Search Bridge | Playwright-based headless Chrome for web research |
Join the Ogcode community on Discord to ask questions, share feedback and feature ideas, and stay up to date with releases.
If Ogcode is useful to you, starring the repo helps more developers discover it.
Contributions are welcome — bug fixes, features, and documentation alike.
git checkout -b feature/my-improvement
Please ensure your code follows the existing Go style and passes go test ./....
For security concerns, please open an issue or reach out on Discord.
MIT License — see LICENSE for details.
Made with care by the Ogcode team and contributors.