Semantic Search

AI-powered search using vector embeddings for better relevance

Search tasks and docs by meaning, not just keywords. Embeddings are computed by local AI models, so your content stays on your machine and search works offline.

Architecture

Tasks/Docs → Chunker → Embedding Model → Vector Index
Query → Embedding Model → Hybrid Search → Results
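
The two flows above can be sketched in a few lines of Python. This is an illustration only: `embed` is a toy bag-of-words stand-in for a real embedding model (it matches shared words, not meaning), the blank-line chunking rule is an assumption, and every name here is hypothetical rather than part of knowns. Only the vector-similarity half of the query path is shown.

```python
import math
import re
from collections import Counter


def embed(text: str) -> dict[str, float]:
    """Toy stand-in for an embedding model: L2-normalized word counts."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {token: c / norm for token, c in counts.items()}


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Dot product of two normalized sparse vectors equals cosine similarity."""
    return sum(weight * b.get(token, 0.0) for token, weight in a.items())


def chunk(text: str) -> list[str]:
    """Split on blank lines; a hypothetical chunking rule for this sketch."""
    return [part.strip() for part in re.split(r"\n\s*\n", text) if part.strip()]


def build_index(items: dict[str, str]) -> list[tuple[str, str, dict[str, float]]]:
    """Tasks/Docs -> Chunker -> Embedding Model -> Vector Index."""
    return [(item_id, c, embed(c)) for item_id, text in items.items() for c in chunk(text)]


def search(index, query: str, top_k: int = 3):
    """Query -> Embedding Model -> results ranked by cosine similarity."""
    q = embed(query)
    scored = [(cosine(q, vec), item_id, c) for item_id, c, vec in index]
    return sorted(scored, reverse=True)[:top_k]


index = build_index({
    "task-42": "Implement JWT authentication for API",
    "doc-auth": "JWT best practices\n\nRotate signing keys regularly",
})
best_score, best_id, best_chunk = search(index, "jwt authentication")[0]
print(best_id, round(best_score, 3))
```

A real embedding model replaces `embed` with dense vectors that place related phrases close together even when they share no words, which is what makes the search semantic.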

Quick Start

# Enable during init
knowns init my-project
# → "Enable semantic search?" [y/n] → y
# → "Select model:" → gte-small (recommended)
 
# Or enable on existing project
knowns config set search.semantic.enabled true
knowns model download gte-small
knowns search --reindex

Model Management

Models are stored in ~/.knowns/models/ and shared across projects.

Model              Size     Speed    Best For
gte-small          67 MB    Fast     Most projects (recommended)
all-MiniLM-L6-v2   80 MB    Fast     Alternative
gte-base           220 MB   Medium   Large projects

knowns model list                 # List available models
knowns model download gte-small   # Download model
knowns model set gte-small        # Set for project
knowns model status               # Check status
knowns model remove gte-small     # Remove model

Search Usage

# Semantic search (hybrid mode)
knowns search "how to handle auth errors"
 
# Force keyword only
knowns search "auth error" --keyword
 
# Filter by type
knowns search "api design" --type doc
knowns search "login bug" --type task
 
# Rebuild index
knowns search --reindex
 
# Check status
knowns search --status-check

Search Output

Results show similarity scores and match reasons:

#42 [in-progress] [high] (97%)
  Implement JWT authentication for API
  Matched by: semantic, keyword

DOC: guides/auth-patterns (92%)
  Section: ## JWT Best Practices
  Matched by: semantic
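
The `Matched by` line suggests that each result records which retrieval path found it. The sketch below shows one way a hybrid merge could produce that field, assuming two score maps keyed by item ID and a simple "best score wins" rule. The merge rule and all function names are assumptions for illustration, not knowns' actual ranking algorithm.

```python
def merge_hybrid(semantic: dict[str, float], keyword: dict[str, float]) -> list[dict]:
    """Merge two {item_id: score in 0..1} maps into one ranked result list."""
    results = []
    for item_id in semantic.keys() | keyword.keys():
        matched_by = [
            name
            for name, hits in (("semantic", semantic), ("keyword", keyword))
            if item_id in hits
        ]
        # Assumption: keep the best score from either retrieval path.
        score = max(semantic.get(item_id, 0.0), keyword.get(item_id, 0.0))
        results.append({"id": item_id, "score": score, "matched_by": matched_by})
    # Highest score first; item ID breaks ties so the order is deterministic.
    return sorted(results, key=lambda r: (-r["score"], r["id"]))


def format_result(result: dict) -> str:
    """Render a result roughly in the shape of the sample output."""
    percent = round(result["score"] * 100)
    return f'{result["id"]} ({percent}%)\n  Matched by: {", ".join(result["matched_by"])}'


merged = merge_hybrid(
    semantic={"#42": 0.97, "DOC: guides/auth-patterns": 0.92},
    keyword={"#42": 0.80},
)
for result in merged:
    print(format_result(result))
```

Other merge rules, such as weighted sums or rank fusion, would fit the same interface; the point is only that an item found by both paths carries both labels.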

Configuration

In .knowns/config.json:

{
  "settings": {
    "semanticSearch": {
      "enabled": true,
      "model": "gte-small"
    }
  }
}

Config Options

Key             Type      Description
enabled         boolean   Enable semantic search
model           string    Model ID (e.g., gte-small)
huggingFaceId   string    Custom HuggingFace model
dimensions      number    Embedding dimensions
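
As a quick sanity check, the option types above can be verified before running a search. The following Python sketch assumes the `settings.semanticSearch` layout from the sample config; treating unknown keys as problems and requiring `dimensions` to be an integer are choices made for this example, not documented knowns behavior.

```python
import json

# Expected Python types for each documented key under settings.semanticSearch.
OPTION_TYPES = {
    "enabled": bool,
    "model": str,
    "huggingFaceId": str,
    "dimensions": int,
}


def validate_semantic_config(config: dict) -> list[str]:
    """Return a list of problems in the semanticSearch block (empty if none)."""
    block = config.get("settings", {}).get("semanticSearch", {})
    problems = []
    for key, value in block.items():
        expected = OPTION_TYPES.get(key)
        if expected is None:
            problems.append(f"unknown key: {key}")
        # bool is a subclass of int in Python, so reject it explicitly for numbers.
        elif not isinstance(value, expected) or (expected is int and isinstance(value, bool)):
            problems.append(f"{key}: expected {expected.__name__}")
    return problems


config = json.loads(
    '{"settings": {"semanticSearch": {"enabled": true, "model": "gte-small"}}}'
)
print(validate_semantic_config(config))
```

To check a real project, load the file with `json.load(open(".knowns/config.json"))` and pass the result to `validate_semantic_config`.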

Indexing

The index updates automatically whenever a task or doc is created or updated. To rebuild it manually:

knowns search --reindex

What Gets Indexed

  • Task title, description, acceptance criteria
  • Task implementation plan and notes
  • Document content (chunked by sections)
  • Imported docs
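
To make "chunked by sections" concrete, here is a minimal Python sketch that splits a Markdown document at its headings, so each chunk can be embedded and reported under its own section. The exact chunking rules knowns applies are not described here, so the heading pattern and the decision to drop empty sections are assumptions.

```python
import re

# A Markdown ATX heading: one to six '#' characters followed by a space.
HEADING = re.compile(r"^#{1,6} .+$")


def chunk_by_sections(markdown: str) -> list[dict]:
    """Split a Markdown document into one chunk per heading."""
    chunks = []
    current = {"section": None, "lines": []}  # text before the first heading
    for line in markdown.splitlines():
        if HEADING.match(line):
            chunks.append(current)
            current = {"section": line, "lines": []}
        else:
            current["lines"].append(line)
    chunks.append(current)

    # Join each section's lines and drop sections with no body text.
    result = []
    for c in chunks:
        body = "\n".join(c["lines"]).strip()
        if body:
            result.append({"section": c["section"], "text": body})
    return result


doc = """# Auth Patterns
Overview of auth.

## JWT Best Practices
Use short-lived tokens.

## Sessions
"""
sections = chunk_by_sections(doc)
for s in sections:
    print(s["section"], "->", s["text"])
```

This simple version would also treat a `#` comment line inside a fenced code block as a heading; a production chunker would need to track fences.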

Custom Models

Add custom models from Hugging Face:

# Add custom model
knowns model add Xenova/bge-large-en-v1.5 --dims 1024 --tokens 512
 
# Download and use
knowns model download bge-large-en-v1.5
knowns model set bge-large-en-v1.5
knowns search --reindex
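
The final `--reindex` matters because vectors produced by different models are not comparable: the public gte-small model emits 384-dimensional vectors, while the custom model above is registered with 1024. A hedged Python sketch of a guard for this, assuming an index that remembers which model built it (the `VectorIndex` class and `needs_reindex` helper are hypothetical, not knowns internals):

```python
from dataclasses import dataclass, field


@dataclass
class VectorIndex:
    """Hypothetical in-memory index that remembers which model built it."""
    model: str
    dims: int
    vectors: dict[str, list[float]] = field(default_factory=dict)

    def add(self, item_id: str, vector: list[float]) -> None:
        # Refuse vectors whose length does not match the index.
        if len(vector) != self.dims:
            raise ValueError(f"expected {self.dims} dimensions, got {len(vector)}")
        self.vectors[item_id] = vector


def needs_reindex(index: VectorIndex, model: str, dims: int) -> bool:
    """True when the active model differs from the one the index was built with."""
    return index.model != model or index.dims != dims


index = VectorIndex(model="gte-small", dims=384)
index.add("task-42", [0.0] * 384)
print(needs_reindex(index, "bge-large-en-v1.5", 1024))
```

Even when two models share a dimension count, their vector spaces differ, which is why the check compares the model name as well as the size.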

MCP Integration

// Search with mode parameter
mcp__knowns__search({
  query: "authentication",
  mode: "hybrid"  // "hybrid" | "semantic" | "keyword"
})
 
// Rebuild index
mcp__knowns__reindex_search({})

Troubleshooting

Issue                    Fix
Model not found          Run knowns model download gte-small
Index stale              Run knowns search --reindex
Slow first search        Expected: the model loads into memory on first use
Search returns nothing   Run knowns search --status-check
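
The slow first search follows a common pattern: the model is loaded lazily and then kept in memory. The Python sketch below illustrates that pattern within a single process; `load_model` only simulates the cost with a short sleep and is not how knowns actually loads models.

```python
import time
from functools import lru_cache


@lru_cache(maxsize=1)
def load_model(model_id: str) -> dict:
    """Stand-in for an expensive model load; the body runs once per model_id."""
    time.sleep(0.05)  # simulate reading model weights from disk
    return {"id": model_id}


def timed_load(model_id: str) -> float:
    """Return how long it takes to obtain the model, in seconds."""
    start = time.perf_counter()
    load_model(model_id)
    return time.perf_counter() - start


first = timed_load("gte-small")   # pays the load cost
second = timed_load("gte-small")  # served from the in-memory cache
print(f"first: {first:.3f}s, second: {second:.6f}s")
```

If every search is slow rather than only the first, the cause is likely something other than model loading, and the status check is the better starting point.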

CLI Commands Reference

# Model management
knowns model                    # Show status
knowns model list               # List all models
knowns model download <id>      # Download model
knowns model set <id>           # Set for project
knowns model status             # Detailed status
knowns model add <hf-id>        # Add custom model
knowns model remove <id>        # Remove custom model
 
# Search
knowns search "<query>"         # Hybrid search
knowns search "<query>" --keyword    # Keyword only
knowns search --reindex         # Rebuild index
knowns search --status-check    # Check status