# 21. Performance Optimization - O(n) to O(log n) Vector Search
Date: 2025-11-14
Status: Accepted
Deciders: Engineering Team
Technical Story: Architecture Refactoring - Critical Performance Fix (November 14, 2025)
Sources: - Planning: 2025-11-14-architecture-refactoring
## Context and Problem Statement
Concept search loaded ALL chunks from the database (~10K+ chunks) and filtered them in memory, taking 8-12 seconds and using ~5GB of memory [Problem: 01-architecture-review-analysis.md, lines 178-194]. This O(n) algorithm was discovered during the architecture review and was the MOST CRITICAL issue to fix.
The Code Problem: [Source: 01-architecture-review-analysis.md, lines 178-194]
```js
// Load ALL chunks and filter in memory
const totalCount = await chunksTable.countRows(); // e.g., 100,000
const allChunks = await chunksTable
  .query()
  .limit(totalCount) // Load everything!
  .toArray();

// Filter in JavaScript - O(n)
const matching = allChunks.filter((chunk) =>
  chunk.concepts?.includes(conceptName)
);
```
The Core Problem: Why load 100K chunks to find the 10 that match? LanceDB supports vector search - use it! [Analysis: obvious optimization]
Decision Drivers:

* CRITICAL: 8-12s searches are unacceptable for production [Problem: UX]
* ~5GB memory usage is unsustainable [Problem: resource usage]
* LanceDB is designed for vector search, not full scans [Context: database capability]
* Concepts already have embeddings - use them! [Opportunity: existing data]
* Scalability: the algorithm must work with 1M+ chunks [Future: growth]
## Alternative Options
- Option 1: Vector Search on Concept Embeddings - Query by concept vector similarity
- Option 2: Keep O(n) Scan - Load all and filter (current)
- Option 3: SQL WHERE on Concept Array - If LanceDB supports array queries
- Option 4: Separate Concept Index - External index for concept→chunk mapping
- Option 5: Caching - Cache full scan results
## Decision Outcome
Chosen option: "Vector Search on Concept Embeddings (Option 1)", because it leverages LanceDB's core strength (vector similarity), reduces the search from O(n) to O(log n), and achieves an 80x-240x performance improvement with a 1000x memory reduction.
### Implementation
Before (O(n)): [Source: 01-architecture-review-analysis.md, lines 178-194]
```js
// Load EVERYTHING
const totalCount = await chunksTable.countRows(); // 100,000
const allChunks = await chunksTable.query().limit(totalCount).toArray();

// Filter in memory - O(n)
const matching = allChunks.filter((chunk) =>
  chunk.concepts?.includes(conceptName)
);
```
Performance: 8-12 seconds [Source: PR-DESCRIPTION.md, line 21]. Memory: ~5GB [Source: line 23].
After (O(log n)): [Source: Repository implementation]
```js
// 1. Get the concept's embedding (one indexed lookup).
// Destructure the first row, since toArray() returns an array.
const [concept] = await conceptsTable
  .query()
  .where(`concept = '${conceptName}'`)
  .limit(1)
  .toArray();

// 2. Vector search on chunks using the concept embedding - O(log n)
const matchingChunks = await chunksTable
  .search(concept.embeddings) // Vector similarity search
  .where(`array_contains(concepts, '${conceptName}')`) // Verify concept match
  .limit(10)
  .toArray();
```
Performance: 50-100ms [Source: PR-DESCRIPTION.md, line 21]. Memory: ~5MB [Source: line 23].
### Performance Improvement
Metrics: [Source: PR-DESCRIPTION.md, lines 19-25]
| Metric | Before | After | Improvement |
|---|---|---|---|
| Speed | 8-12s | 50-100ms | 80x-240x faster ⚡ |
| Memory | ~5GB | ~5MB | 1000x less |
| Algorithm | O(n) scan | O(log n) search | Scalable |
### Consequences
Positive:
* 80x-240x faster: 8-12s → 50-100ms [Validated: PR-DESCRIPTION.md, line 21]
* 1000x less memory: ~5GB → ~5MB [Validated: line 23]
* Scalable: O(log n) works for 1M+ chunks [Algorithm: scalable]
* User experience: Sub-second responses [UX: acceptable]
* Resource efficient: Can run on modest hardware [Deployment: lightweight]
* Database-appropriate: Uses LanceDB for what it's designed for [Architecture: proper usage]
Negative:

* Slightly more complex: two queries instead of one [Trade-off: minimal]
* Dependency on embeddings: requires that concept embeddings exist [Requirement: embeddings]
* Approximate: vector similarity is approximate, not exact [Nature: ANN search]
Neutral:

* Vector search nature: results are "similar to the concept," not "exactly contain the concept" [Behavior: semantic]
* Verification filter: the added WHERE clause ensures the concept is actually present [Accuracy: confirmed]
### Confirmation
Production Validation:

- Concept search is now sub-second (50-100ms typical)
- Memory usage dramatically reduced (~5MB vs. ~5GB)
- All tests passing (search results still correct)
- User reports: "Much faster now!"
## Pros and Cons of the Options

### Option 1: Vector Search - Chosen
Pros:

* 80x-240x faster [Validated]
* 1000x less memory [Validated]
* O(log n) scalable algorithm
* Uses LanceDB properly
* Sub-second UX
* Production proven
Cons:

* Two queries (concept lookup + vector search)
* Requires embeddings to exist
* Approximate matching (ANN)
### Option 2: Keep O(n) Scan
Continue loading all chunks and filtering.
Pros:

* Simple (one query)
* Exact matching (not approximate)
* No concept embeddings required
Cons:

* 8-12 second searches - Unacceptable UX [Problem: slow]
* ~5GB memory - Unsustainable [Problem: resources]
* O(n) algorithm - Doesn't scale [Problem: growth]
* This was the bug - Rejected [Decision: must fix]
### Option 3: SQL WHERE on Array
Use a SQL WHERE clause on the concept array column.
Pros:

* Single query
* Exact matching
* Potentially faster
Cons:

* LanceDB limitation: array_contains() doesn't benefit from indexing [Limitation: no index]
* Still O(n): must scan all chunks [Performance: same problem]
* Tested: not significantly faster [Validation: similar performance]
* Vector search better: O(log n) beats O(n) with WHERE (see the sketch below)
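For illustration, a minimal sketch of what this rejected single-query variant looks like, reusing the `chunksTable` handle from the examples above. Because array_contains() is evaluated row by row without index support, LanceDB still touches every chunk:

```js
// Option 3 sketch (rejected): filter on the array column only.
// array_contains() is checked per row with no index support,
// so this remains an O(n) scan even though it is a single query.
const matching = await chunksTable
  .query()
  .where(`array_contains(concepts, '${conceptName}')`)
  .limit(10)
  .toArray();
```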
### Option 4: Separate Concept Index
External hash table mapping concept → chunk IDs.
Pros:

* O(1) lookup for a concept
* Exact matching
* Very fast
Cons:

* Extra data structure: must maintain a separate index (sketched below) [Complexity: synchronization]
* Consistency: the index can become stale [Risk: out-of-sync]
* Storage: duplicate storage (chunks + index) [Cost: storage]
* Vector search sufficient: we already have embeddings [Alternative: use what exists]
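For comparison, a minimal sketch of the external index this option would require. The `buildConceptIndex` helper and the `id` column are illustrative, not part of the codebase; every chunk write would need to rebuild or update this structure:

```js
// Option 4 sketch (rejected): in-memory inverted index, concept -> chunk IDs.
// Must be rebuilt (or incrementally updated) whenever chunks change -
// the synchronization burden that ruled this option out.
const conceptIndex = new Map();

async function buildConceptIndex(chunksTable) {
  conceptIndex.clear();
  const totalCount = await chunksTable.countRows();
  const chunks = await chunksTable.query().limit(totalCount).toArray();
  for (const chunk of chunks) {
    for (const concept of chunk.concepts ?? []) {
      if (!conceptIndex.has(concept)) conceptIndex.set(concept, []);
      conceptIndex.get(concept).push(chunk.id);
    }
  }
}

// O(1) lookup once the index is built:
const chunkIds = conceptIndex.get(conceptName) ?? [];
```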
### Option 5: Caching
Cache full scan results.
Pros:

* Subsequent searches fast (cached)
* No algorithm change
Cons:

* First search still slow - 8-12s (bad UX; see the sketch below) [Problem: cold start]
* Memory: the cache uses ~5GB [Problem: same issue]
* Invalidation: complex cache invalidation logic [Complexity: cache management]
* Band-aid: doesn't fix the root cause [Philosophy: address the problem]
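A minimal sketch of the cache wrapper this option implies, memoizing the existing scan per concept; it makes the cold-start cost plain - each uncached concept still pays the full 8-12s, ~5GB scan:

```js
// Option 5 sketch (rejected): memoize the old full scan per concept.
// Every cold concept still triggers the full 8-12s, ~5GB scan,
// and the cache adds invalidation complexity on every chunk update.
const scanCache = new Map();

async function cachedConceptSearch(conceptName) {
  if (!scanCache.has(conceptName)) {
    const totalCount = await chunksTable.countRows();
    const allChunks = await chunksTable.query().limit(totalCount).toArray();
    scanCache.set(
      conceptName,
      allChunks.filter((chunk) => chunk.concepts?.includes(conceptName))
    );
  }
  return scanCache.get(conceptName);
}
```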
## Implementation Notes

### Vector Similarity Threshold
Approach: [Implementation detail]

- LanceDB returns results sorted by similarity
- The LIMIT parameter controls the result count
- An additional WHERE clause verifies the concept match
Configuration:
```js
const results = await chunksTable
  .search(conceptEmbedding) // Vector search
  .where(`array_contains(concepts, '${conceptName}')`) // Verify
  .limit(10) // Top 10 most similar
  .toArray();
```
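If a hard similarity cutoff is ever needed on top of LIMIT, one option is a post-filter on the distance score LanceDB attaches to each result. A minimal sketch, assuming the default metric where a smaller `_distance` means more similar; the threshold value is a hypothetical placeholder to be tuned per corpus:

```js
// Optional post-filter: keep only results within a similarity threshold.
// _distance is the distance score LanceDB attaches to search results;
// MAX_DISTANCE is a hypothetical, corpus-specific tuning value.
const MAX_DISTANCE = 0.8;
const confident = results.filter((r) => r._distance <= MAX_DISTANCE);
```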
### Concept Embedding Requirement
Prerequisite: Concepts must have embeddings.

- Generated during seeding [ADR-0007]
- Stored in the concepts table [ADR-0009]
- 384-dimensional vectors [Vector size: from embedding service]
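For context, a minimal sketch of how this prerequisite is satisfied at seeding time. The `embed()` helper is a hypothetical stand-in for the embedding service from ADR-0007, and the row shape is illustrative:

```js
// Seeding-time sketch: store each concept with its embedding so the
// search path can look it up later. embed() is a hypothetical wrapper
// around the 384-dimensional embedding service.
const embedding = await embed(conceptName); // number[] of length 384
await conceptsTable.add([{ concept: conceptName, embeddings: embedding }]);
```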
### Fallback Behavior
If a concept is not found in the concepts table:

1. Return empty results (no results for an unknown concept)
2. OR: fall back to text search (implementation choice)
3. Current: returns empty results with a message
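A minimal sketch of the current behavior; the `searchByConcept` name and the `{ results, message }` return shape are illustrative:

```js
// Fallback sketch: unknown concept -> empty result set plus a message.
async function searchByConcept(conceptName) {
  const [concept] = await conceptsTable
    .query()
    .where(`concept = '${conceptName}'`)
    .limit(1)
    .toArray();

  if (!concept) {
    return { results: [], message: `No concept found: ${conceptName}` };
  }
  // ...continue with the vector search shown above
}
```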
## Related Decisions
- ADR-0016: Layered Architecture - Refactoring enabled optimization
- ADR-0017: Repository Pattern - Optimization in repository method
- ADR-0007: Concept Extraction - Concepts have embeddings
- ADR-0009: Three-Table Architecture - Concepts table structure
## References
Confidence Level: HIGH

Attribution:

- Planning docs: November 14, 2025
- Performance metrics: PR-DESCRIPTION.md lines 19-25
- Problem identified: 01-architecture-review-analysis.md lines 178-194
Traceability: 2025-11-14-architecture-refactoring