Semantic Search
AI-powered search using vector embeddings for better relevance
Search tasks and docs by meaning, not just keywords. Uses native ONNX Runtime (via Go bindings) for local embedding inference — no external API calls, no sidecar process required.
In v0.20, the embedding pipeline was rewritten from a Bun sidecar to native ONNX Go bindings, making search faster and simpler to deploy.
Architecture
Tasks/Docs → Chunker → ONNX Runtime (Go) → Vector Index
Query → ONNX Runtime (Go) → Hybrid Search → Results
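The index-then-query flow above can be illustrated with a small, self-contained sketch. This is not Knowns' implementation: the `embed` function below is a toy bag-of-words stand-in for the ONNX model (it captures word overlap, not meaning), and `VectorIndex` and the item IDs are hypothetical names chosen for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for the embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorIndex:
    """Hypothetical in-memory index: stores one vector per item."""

    def __init__(self):
        self.entries = []  # list of (item_id, vector)

    def add(self, item_id, text):
        self.entries.append((item_id, embed(text)))

    def search(self, query, top_k=3):
        query_vec = embed(query)
        scored = [(item_id, cosine(query_vec, vec)) for item_id, vec in self.entries]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


# Indexing side: tasks/docs -> embed -> index
index = VectorIndex()
index.add("task-42", "Implement JWT authentication for API")
index.add("doc-auth", "JWT best practices and token rotation")
index.add("task-7", "Fix pagination in task list view")

# Query side: query -> embed -> nearest neighbours
results = index.search("JWT authentication")
for item_id, score in results:
    print(f"{item_id}: {score:.2f}")
```

A real embedding model replaces `embed` with dense vectors, which is what lets a query match text that shares meaning but no exact words.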
Quick Start
# Enable during init
knowns init my-project
# → "Enable semantic search?" [y/n] → y
# → "Select model:" → gte-small (recommended)
# Or enable on existing project
knowns config set search.semantic.enabled true
knowns model download gte-small
knowns search --reindex
Model Management
ONNX models are stored at ~/.knowns/models/ (shared across projects). The Go binary loads models directly via ONNX Runtime — no Node.js or Python dependencies required.
| Model | Size | Speed | Best For |
|---|---|---|---|
| gte-small | ~50MB | Fast | Most projects (recommended) |
| all-MiniLM-L6-v2 | ~45MB | Fast | Large codebases |
| gte-base | ~110MB | Medium | High accuracy |
| bge-small-en-v1.5 | ~50MB | Fast | English text |
| bge-base-en-v1.5 | ~110MB | Medium | English, high quality |
| e5-small-v2 | ~50MB | Fast | General use |
knowns model list # List available models
knowns model download gte-small # Download model
knowns model set gte-small # Set for project
knowns model status # Check status
knowns model remove gte-small # Remove model
Search Usage
# Semantic search (hybrid mode)
knowns search "how to handle auth errors"
# Force keyword only
knowns search "auth error" --keyword
# Filter by type
knowns search "api design" --type doc
knowns search "login bug" --type task
knowns search "token rotation" --type memory
# Rebuild index
knowns search --reindex
# Check status
knowns search --status-check
Search Output
Results show similarity scores and match reasons:
#42 [in-progress] [high] (97%)
Implement JWT authentication for API
Matched by: semantic, keyword
DOC: guides/auth-patterns (92%)
Section: ## JWT Best Practices
Matched by: semantic
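The "Matched by" line suggests that each result records which retrievers found it. The source does not describe how Knowns combines the two result lists, so the sketch below swaps in reciprocal rank fusion (RRF), a common general-purpose technique, purely as an illustration. The function name `rrf_merge`, the constant `k=60`, and the item IDs are assumptions, and the scores it produces are not the percentages shown above.

```python
def rrf_merge(semantic_ranked, keyword_ranked, k=60):
    """Merge two ranked ID lists with reciprocal rank fusion.

    Each list is ordered best-first. An item's fused score is the sum of
    1 / (k + rank) over every list that contains it.
    """
    scores = {}
    reasons = {}
    for reason, ranked in (("semantic", semantic_ranked), ("keyword", keyword_ranked)):
        for rank, item_id in enumerate(ranked, start=1):
            scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (k + rank)
            reasons.setdefault(item_id, []).append(reason)
    # Sort by fused score (descending), then by ID for a stable order on ties.
    ordered = sorted(scores, key=lambda item_id: (-scores[item_id], item_id))
    return [(item_id, scores[item_id], reasons[item_id]) for item_id in ordered]


merged = rrf_merge(
    semantic_ranked=["task-42", "doc-auth"],
    keyword_ranked=["task-42", "task-9"],
)
for item_id, score, matched_by in merged:
    print(f"{item_id} ({score:.4f}) Matched by: {', '.join(matched_by)}")
```

Items found by both retrievers accumulate two contributions and rise to the top, which matches the intuition behind listing multiple match reasons.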
Configuration
In .knowns/config.json:
{
"settings": {
"semanticSearch": {
"enabled": true,
"model": "gte-small"
}
}
}
Config Options
| Key | Type | Description |
|---|---|---|
| enabled | boolean | Enable semantic search |
| model | string | Model ID (e.g., gte-small) |
| huggingFaceId | string | Custom HuggingFace model |
| dimensions | number | Embedding dimensions |
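For scripts that need to inspect these settings, a minimal sketch of reading them is shown below. It assumes the `settings.semanticSearch` layout from the example above. The helper name `load_semantic_settings` and the choice to treat a missing file or key as "disabled" are assumptions for illustration, not documented Knowns behavior.

```python
import json
import tempfile
from pathlib import Path

# Keys listed in the Config Options table.
KNOWN_KEYS = ("enabled", "model", "huggingFaceId", "dimensions")


def load_semantic_settings(project_root):
    """Read settings.semanticSearch from <project_root>/.knowns/config.json.

    Missing files or keys yield enabled=False and None for the rest
    (an assumed default, chosen for this sketch).
    """
    settings = {key: None for key in KNOWN_KEYS}
    settings["enabled"] = False
    path = Path(project_root) / ".knowns" / "config.json"
    if not path.exists():
        return settings
    config = json.loads(path.read_text(encoding="utf-8"))
    section = config.get("settings", {}).get("semanticSearch", {})
    for key in KNOWN_KEYS:
        if key in section:
            settings[key] = section[key]
    return settings


# Demo against a throwaway project directory.
with tempfile.TemporaryDirectory() as root:
    config_dir = Path(root) / ".knowns"
    config_dir.mkdir()
    example = {"settings": {"semanticSearch": {"enabled": True, "model": "gte-small"}}}
    (config_dir / "config.json").write_text(json.dumps(example), encoding="utf-8")

    settings = load_semantic_settings(root)
    missing = load_semantic_settings(Path(root) / "no-such-project")

print(settings)
print(missing)
```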
Indexing
The index updates automatically whenever an item is created or updated. To rebuild it manually:
knowns search --reindex
What Gets Indexed
- Task title, description, acceptance criteria
- Task implementation plan and notes
- Document content (chunked by sections)
- Imported docs
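To make "chunked by sections" concrete, here is a minimal sketch that splits a Markdown document at its headings so each section can be embedded separately. The actual Knowns chunker is not described in the source. The function name `chunk_by_sections` and the splitting rule (any `#` heading starts a new chunk) are assumptions, and the sketch does not handle headings inside fenced code.

```python
import re

HEADING = re.compile(r"^#{1,6}\s")


def chunk_by_sections(markdown):
    """Split Markdown into (heading, body) chunks, one per heading.

    Text before the first heading becomes a chunk with heading None.
    """
    chunks = []
    heading = None
    body = []

    def flush():
        # Emit the current chunk if it has a heading or any non-blank text.
        if heading is not None or any(line.strip() for line in body):
            chunks.append((heading, "\n".join(body).strip()))

    for line in markdown.splitlines():
        if HEADING.match(line):
            flush()
            heading = line.strip()
            body = []
        else:
            body.append(line)
    flush()
    return chunks


doc = (
    "# Auth Patterns\n"
    "Overview text.\n"
    "\n"
    "## JWT Best Practices\n"
    "Rotate tokens often.\n"
)
chunks = chunk_by_sections(doc)
for section, text in chunks:
    print(f"Section: {section} -> {text!r}")
```

Keeping the heading with each chunk is what would let a result point at a specific section, as in the "Section: ## JWT Best Practices" line of the search output.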
Memory Retrieval
Both project and global memory stores contribute to memory retrieval:
- knowns init and knowns sync prepare semantic memory retrieval state
- knowns search --type memory avoids letting unrelated doc/task chunks crowd out memory results
- Runtime memory hooks can reuse semantic-backed memory candidates when available
Custom Models
Add custom HuggingFace models:
# Add custom model
knowns model add Xenova/bge-large-en-v1.5 --dims 1024 --tokens 512
# Download and use
knowns model download bge-large-en-v1.5
knowns model set bge-large-en-v1.5
knowns search --reindex
MCP Integration
// Search with mode parameter (v0.20 consolidated)
search({
action: "search",
query: "authentication",
mode: "hybrid", // "hybrid" | "semantic" | "keyword"
})
// Retrieve with citations
search({
action: "retrieve",
query: "authentication patterns",
})
Troubleshooting
| Issue | Fix |
|---|---|
| Model not found | knowns model download gte-small |
| Index stale | knowns search --reindex |
| Slow first search | Normal - model loads into memory |
| Search returns nothing | Check knowns search --status-check |
CLI Commands Reference
# Model management
knowns model # Show status
knowns model list # List all models
knowns model download <id> # Download model
knowns model set <id> # Set for project
knowns model status # Detailed status
knowns model add <hf-id> # Add custom model
knowns model remove <id> # Remove custom model
# Search
knowns search "<query>" # Hybrid search
knowns search "<query>" --keyword # Keyword only
knowns search --reindex # Rebuild index
knowns search --status-check # Check status
Semantic search
Semantic search lets Knowns find results by meaning, not just by exact keyword matches.
Main commands
knowns model list
knowns model download multilingual-e5-small
knowns model set multilingual-e5-small
knowns search --status-check
knowns search --reindex
knowns search "how authentication works" --plain
Search modes
- keyword
- semantic
- hybrid
Note
If the semantic components are not ready, search automatically falls back to a safe mode instead of crashing.
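The fallback behavior can be sketched as a small mode-selection function. This is an illustration only: the source does not say what the safe mode is, so the sketch assumes it means keyword-only search, and `choose_mode` and `semantic_ready` are hypothetical names.

```python
VALID_MODES = {"keyword", "semantic", "hybrid"}


def choose_mode(requested, semantic_ready):
    """Return the search mode to run.

    Assumption: when the model or index is unavailable, any mode that needs
    embeddings degrades to keyword search rather than raising an error.
    """
    if requested not in VALID_MODES:
        raise ValueError(f"unknown search mode: {requested!r}")
    if requested in {"semantic", "hybrid"} and not semantic_ready:
        return "keyword"
    return requested


print(choose_mode("hybrid", semantic_ready=True))
print(choose_mode("hybrid", semantic_ready=False))
```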