Structural Code Intelligence for AI Agents — MCP Server + Knowledge Graph + Semantic Search
Give your AI assistant a brain for your codebase. Octocode transforms your project into a navigable knowledge graph that Claude, Cursor, and other AI agents can search, understand, and navigate.
🚀 Quick Start • 🤖 MCP Integration • 📖 Documentation • 🌐 Website
<a href="https://glama.ai/mcp/servers/Muvon/octocode"> <img width="300" src="https://glama.ai/mcp/servers/Muvon/octocode/badge" alt="Octocode MCP server" /> </a> </div>
🤖 Built for AI Agents
The Problem: AI assistants are blind to your codebase. They can't search your files, understand dependencies, or remember context across sessions.
The Solution: Octocode's MCP server gives AI agents:
- 🔍 Semantic search — Find code by meaning, not keywords
- 🕸️ Knowledge graph — Navigate imports, calls, and dependencies
- 📝 Code signatures — View structure without reading entire files
- 🧠 Persistent memory — Remember decisions across conversations
Works with: Claude Desktop • Cursor • Windsurf • Any MCP-compatible AI
// Add to your AI assistant config
{
"mcpServers": {
"octocode": {
"command": "octocode",
"args": ["mcp", "--path", "/your/project"]
}
}
}
Now your AI assistant can:
You: "Where is authentication handled?"
AI: *searches your codebase* "Authentication is in src/middleware/auth.rs,
which imports jwt.rs for token validation and calls user_store.rs for lookup."
You: "What files depend on the payment module?"
AI: *queries knowledge graph* "src/api/handlers/payment.rs imports payment/mod.rs,
which is also used by src/workers/refund.rs and src/cron/billing.rs"
You: "Remember this bug fix for future reference"
AI: *stores in memory* "Got it. I'll remember this authentication bypass fix
and apply similar patterns when reviewing security code."
🤔 Why Octocode?
Standard RAG treats your code as flat text chunks. It finds similar-sounding snippets but has no idea that auth_middleware.rs imports jwt.rs, calls user_store.rs, and is wired into router.rs. Octocode understands structure.
# Semantic search finds the right code
octocode search "authentication middleware"
→ src/middleware/auth.rs | Similarity 0.923
# GraphRAG reveals the full dependency chain
octocode graphrag get-relationships --node_id src/middleware/auth.rs
Outgoing:
imports → jwt (src/auth/jwt.rs): token validation logic
calls → user_store (src/db/user_store.rs): user lookup by token
Incoming:
imports ← router (src/router.rs): wires auth into the request pipeline
Octocode uses tree-sitter AST parsing to extract real symbols (functions, imports, dependencies), builds a GraphRAG knowledge graph of relationships between files, and exposes everything via MCP, so AI tools can navigate your project architecture, not just search it.
🔬 How It Works
Source Code → Tree-sitter AST → Symbols & Relationships → Knowledge Graph
↓
Embeddings + Hybrid Search + Reranking → MCP Server
- AST Parsing: tree-sitter extracts real code symbols (functions, classes, imports), not arbitrary text chunks
- Knowledge Graph: GraphRAG maps relationships between files (imports, calls, implements, extends, configures, and 9 more types), each with importance weighting
- Hybrid Search: semantic similarity + BM25 full-text search + reranking, not just vector embeddings
- MCP Server: exposes semantic_search, view_signatures, and graphrag tools to any MCP-compatible client
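The first two stages of the pipeline above (AST parsing, then a relationship graph) can be illustrated with a small sketch. This is not Octocode's implementation: it uses Python's built-in `ast` module as a stand-in for tree-sitter, tracks only `imports` edges, and the four modules in `SOURCES` are hypothetical files modeled on the auth example earlier in this README.

```python
import ast
from collections import defaultdict

# Hypothetical four-file project, keyed by module name (assumption for illustration).
SOURCES = {
    "auth": (
        "import jwt\n"
        "import user_store\n"
        "\n"
        "def authenticate(token):\n"
        "    return user_store.lookup(jwt.decode(token))\n"
    ),
    "jwt": "def decode(token):\n    return token\n",
    "user_store": "def lookup(user_id):\n    return {'id': user_id}\n",
    "router": (
        "import auth\n"
        "\n"
        "def handle(request):\n"
        "    return auth.authenticate(request)\n"
    ),
}


def extract_symbols(source):
    """Return (function names, imported module names) found in one file."""
    tree = ast.parse(source)  # parses only; nothing is imported or executed
    functions, imports = [], []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            functions.append(node.name)
        elif isinstance(node, ast.Import):
            imports.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            imports.append(node.module)
    return functions, imports


def build_graph(sources):
    """Build outgoing and incoming 'imports' edges between known modules."""
    outgoing, incoming = defaultdict(list), defaultdict(list)
    for module, source in sources.items():
        _, imports = extract_symbols(source)
        for target in imports:
            if target in sources:  # skip external libraries
                outgoing[module].append(target)
                incoming[target].append(module)
    return outgoing, incoming


outgoing, incoming = build_graph(SOURCES)
print("auth imports:", outgoing["auth"])        # ['jwt', 'user_store']
print("auth imported by:", incoming["auth"])    # ['router']
```

Because edges come from parsed import statements rather than text similarity, the incoming lookup answers "what depends on this file?" directly, which a flat chunk index cannot do.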
✨ What Makes It Different
| | Standard RAG | Doc Lookup Tools | Octocode |
|---|---|---|---|
| Indexes | Text chunks | External library docs | Your codebase structure (AST) |
| Understands | Similar text | API specs & usage | Functions, imports, dependencies |
| Cross-file | No | No | Yes — navigates the dependency graph |
| Relationships | No | No | imports, calls, implements, extends... |
| AI integration | Varies | MCP | Native MCP server + LSP |
Doc tools give AI the manual for libraries you use. Octocode gives AI the blueprint of how you put them together.
Built with Rust for performance. Local-first for privacy. Open source (Apache 2.0) for transparency.
📊 Retrieval Quality
Octocode ships a reproducible retrieval benchmark (benchmark/): 127 curated code-search queries with line-range ground truth, run against octocode's own source (pinned at b1771ba so annotations never drift). The numbers below use a fully local, no-API-key stack — jina-embeddings-v2-base-code via fastembed, no reranker — so they are a floor, not a ceiling:
| Config | Hit@5 | Hit@10 | MRR | NDCG@10 | Recall@10 |
|---|---|---|---|---|---|
| Dense vector only | 0.598 | 0.717 | 0.485 | 0.528 | 0.671 |
| Hybrid, default RRF weights (0.7/0.3) | 0.598 | 0.717 | 0.485 | 0.528 | 0.671 |
| Hybrid, keyword-tuned (0.3/0.7) | 0.732 | 0.835 | 0.572 | 0.620 | 0.807 |
Tilting RRF fusion toward the BM25/keyword signal, which carries disproportionate weight for code's exact identifiers, lifts Hit@5 by 22% (0.598 → 0.732) and Recall@10 by 20% (0.671 → 0.807) at no added cost.
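To make the effect of the fusion weights concrete, here is a minimal sketch of weighted reciprocal rank fusion (RRF). It is a generic formulation, not Octocode's code: the constant `k=60`, the function name `weighted_rrf`, the toy rankings, and the reading of the weight pairs as vector/keyword are all assumptions.

```python
def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked lists with score(d) = sum_i weight_i / (k + rank_i(d))."""
    scores = {}
    for name, ranked in rankings.items():
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + weights[name] / (k + rank)
    # Highest fused score first; ties broken by name for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Toy rankings for one query (hypothetical): the keyword ranker puts the
# file containing the exact identifier first, the vector ranker does not.
rankings = {
    "vector": ["utils.rs", "auth.rs", "jwt.rs"],
    "keyword": ["jwt.rs", "auth.rs", "router.rs"],
}

default = weighted_rrf(rankings, {"vector": 0.7, "keyword": 0.3})
tuned = weighted_rrf(rankings, {"vector": 0.3, "keyword": 0.7})

print(default)  # ['auth.rs', 'jwt.rs', 'utils.rs', 'router.rs']
print(tuned)    # ['jwt.rs', 'auth.rs', 'router.rs', 'utils.rs']
```

In this toy case, shifting weight toward the keyword list promotes the exact-identifier match to the top without changing either underlying ranker, which is the kind of change the tuned configuration makes.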
The benchmark also flags what doesn't help here (full 6-variant matrix in benchmark/RESULTS.md): a generic local cross-encoder reranker (bge-reranker-base) actually regressed results (Hit@5 0.732 → 0.598) — code retrieval needs a code-aware reranker (e.g. voyage:rerank-2.5), not an off-the-shelf one.
git worktree add /tmp/corpus b1771ba # pin the corpus to the ground-truth commit
CORPUS=/tmp/corpus python3 benchmark/run_matrix.py
See benchmark/README.md for methodology and metric definitions.
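For readers unfamiliar with the column names in the results table, the sketch below shows the textbook definitions of Hit@k, reciprocal rank (averaged over queries to give MRR), Recall@k, and NDCG@k with binary relevance. The benchmark's own definitions in benchmark/README.md are authoritative and match against line ranges; the plain string IDs and function names here are simplifying assumptions.

```python
import math


def hit_at_k(ranked, relevant, k):
    """1.0 if any relevant item appears in the top k, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item; MRR is the mean over queries."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: discounted gain divided by the ideal ordering's gain."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
        if doc in relevant
    )
    ideal = sum(
        1.0 / math.log2(rank + 1)
        for rank in range(1, min(len(relevant), k) + 1)
    )
    return dcg / ideal if ideal else 0.0


# One hypothetical query: two relevant results, found at ranks 2 and 4.
ranked = ["b", "a", "x", "c"]
relevant = {"a", "c"}

print(hit_at_k(ranked, relevant, 1))     # 0.0
print(hit_at_k(ranked, relevant, 5))     # 1.0
print(reciprocal_rank(ranked, relevant)) # 0.5
print(recall_at_k(ranked, relevant, 3))  # 0.5
print(round(ndcg_at_k(ranked, relevant, 10), 3))
```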
🚀 Quick Start
1. Install
# Universal installer (Linux, macOS, Windows)
curl -fsSL https://raw.githubusercontent.com/Muvon/octocode/master/install.sh | sh
# macOS with Homebrew
brew install muvon/tap/octocode
# Cargo (build from source)
cargo install --git https://github.com/Muvon/octocode
# Download binary from releases
# https://github.com/Muvon/octocode/releases
See Installation Guide for platform-specific instructions.
</details>
2. Set Up API Keys
# Required: Embedding provider (Voyage AI has 200M free tokens/month)
export VOYAGE_API_KEY="your-voyage-api-key"
# Optional: LLM for commit messages, code review
export OPENROUTER_API_KEY="your-openrouter-api-key"
Get your Voyage API key: voyageai.com (free tier available)
<details> <summary><strong>Other embedding providers</strong></summary>
Octocode supports multiple embedding providers:
# OpenAI
export OPENAI_API_KEY="your-key"
octocode config --code-embedding-model "openai:text-embedding-3-small"
# Jina AI
export JINA_API_KEY="your-key"
octocode config --code-embedding-model "jina:jina-embeddings-v3"
# Google
export GOOGLE_API_KEY="your-key"
octocode config --code-embedding-model "google:text-embedding-005"
See API Keys guide for all supported providers.
</details>
3. Index Your Codebase
cd /your/project
octocode index
# → Indexed 12,847 blocks across 342 files
4. Search Your Code
# Natural language search
octocode search "authentication middleware"
# Multi-query for broader results
octocode search "auth" "middleware" "session"
# Filter by language
octocode search "database connection pool" --lang rust
# Search commit history
octocode search "authentication refactor" --mode commits
5. Connect Your AI Assistant
Add to your MCP client config (Claude Desktop, Cursor, Windsurf):
{
"mcpServers": {
"octocode": {
"command": "octocode",
"args": ["mcp", "--path", "/your/project"]
}
}
}
Done! Your AI assistant now understands your codebase structure.
🔌 MCP Server Integration
Octocode includes a built-in MCP server that exposes your codebase as tools to AI assistants. This is the primary way to use Octocode — give your AI assistant direct access to search and navigate your code.
Available Tools
| Tool | What It Does |
|---|---|
| semantic_search | Find code by meaning: "authentication flow", "error handling", "database queries" |
| view_signatures | View file structure: function signatures, class definitions, imports |
| graphrag | Query relationships: "what calls this function?", "what does this module import?" |
| structural_search | AST pattern matching: find .unwrap() calls, new instantiations, specific patterns |
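The difference between AST pattern matching and plain text search can be sketched in a few lines. This is only an analogy for what a structural search does: it uses Python's built-in `ast` module on a Python snippet rather than tree-sitter on Rust, and `find_method_calls` and `SAMPLE` are hypothetical names, not part of the structural_search tool.

```python
import ast


def find_method_calls(source, method_name):
    """Return sorted line numbers where `<expr>.<method_name>(...)` is called."""
    tree = ast.parse(source)
    lines = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == method_name
        ):
            lines.append(node.lineno)
    return sorted(lines)


# "unwrap" appears on four lines, but only two of them are real method calls.
SAMPLE = '''\
def load(path):
    config = read(path).unwrap()
    # unwrap() mentioned in a comment is not a call
    name = "unwrap()"
    return config.get("name").unwrap()
'''

print(find_method_calls(SAMPLE, "unwrap"))  # [2, 5]
```

A text search for "unwrap" would also report the comment on line 3 and the string literal on line 4; matching on the syntax tree returns only the two call sites.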
Conversational AI Examples
Once connected, your AI assistant can answer questions about your codebase:
You: "Where is user authentication implemented?"
AI: *uses semantic_search* "Found in src/auth/login.rs. The authenticate() function
validates credentials against the database, generates a JWT token, and stores
the session in Redis."
You: "What files depend on the payment module?"
AI: *uses graphrag* "src/api/handlers/payment.rs imports payment/mod.rs, which is also
used by src/workers/refund.rs and src/cron/billing.rs. The payment module exports
process_payment() and validate_transaction() functions."
You: "Show me all error handling in the API layer"
AI: *uses structural_search* "Found 26 error handling patterns in src/api/:
- 15 use Result<T, ApiError> with explicit error types
- 8 use .unwrap() (potential panics in handlers/user.rs:42, handlers/auth.rs:87)
- 3 use .expect() with custom messages"
Quick Setup
Octomind (Recommended) — Zero setup, Octocode pre-configured:
curl -fsSL https://raw.githubusercontent.com/muvon/octomind/master/install.sh | bash
octomind run developer:rust
Claude Code (CLI), command-line setup:
claude mcp add octocode -- octocode mcp --path /path/to/your/project
Claude Desktop / Cursor / Windsurf, add to config:
{
"mcpServers": {
"octocode": {
"command": "octocode",
"args": ["mcp", "--path", "/path/to/your/project"]
}
}
}
Config locations:
- Claude Desktop: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
- Cursor: ~/.cursor/mcp.json or Settings → MCP Servers
- Windsurf: Settings → MCP
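If one of these config files already lists other MCP servers, the octocode entry has to be merged in rather than pasted over the file. The sketch below shows one way to do that; the helper name `add_octocode_server` is hypothetical, and the demo writes to a temporary file instead of a real client config. Back up your actual config before running anything like this against it.

```python
import json
import tempfile
from pathlib import Path


def add_octocode_server(config_path, project_path):
    """Add or update the 'octocode' entry, leaving other servers untouched."""
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["octocode"] = {
        "command": "octocode",
        "args": ["mcp", "--path", str(project_path)],
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file that already has one (hypothetical) server.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "mcp.json"
    demo.write_text('{"mcpServers": {"other": {"command": "other-tool"}}}')
    result = add_octocode_server(demo, "/your/project")
    print(sorted(result["mcpServers"]))  # ['octocode', 'other']
```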
📖 Complete MCP Client Setup Guide — Detailed instructions for 15+ clients including VS Code (Cline/Continue), Zed, Replit, and more.
🎯 What Can You Do With It?
New developer onboarding:
You: "How does the authentication system work?"
AI: *searches and navigates* "Authentication starts in src/middleware/auth.rs which
validates JWT tokens. It calls src/auth/jwt.rs for token verification, which uses
the public key from config. Failed auth returns 401 via src/errors/auth_error.rs.
Sessions are stored in Redis via src/cache/session.rs."
Code archaeology:
You: "Find all places we handle database errors"
AI: *structural search* "Found 47 error handling patterns:
- 32 use R
…