#vector-search

6 posts

ai deep-dive Jun 4, 2026

Semantic Similarity ≠ Retrieval Relevance: Scenarios, Detection, and Remedies for Systematic Embedding Retrieval Failures

Cosine similarity and relevance diverge systematically across a whole class of scenarios: negation (most IR models score at or below random on NevIR), exact identifiers, numeric thresholds, and logical combinations (SoTA models achieve recall@100 below 20% on LIMIT). Some of these hit the theoretical ceiling of the single-vector paradigm, so switching to a larger model will not help. Recommended remedy order: hybrid BM25 -> reranker (Anthropic measured a 67% drop in retrieval failures) -> upstream metadata routing -> domain fine-tuning / multi-vector.

#retrieval #embedding #rag #vector-search #llm

tech deep-dive Mar 28, 2026

When Vector Search Matches by Name Instead of Grade: Attribute Conflation in RAG Systems

Query: 'I just sent Beauty in the Mirror 5.11b — recommend routes of similar difficulty.' The results came back full of routes with similar-sounding names, not similar grades. Root cause: dense embeddings compress multiple attributes into a single vector, and the rarity of the route name drowns out the grade signal. The fix: three layers of defense — metadata pre-filtering, query rewriting, and score fusion.

#rag #vector-search #embedding #cloudflare-workers #recommendation-system

ai guide Mar 12, 2026

BGE-M3: Why This Embedding Model Works Well for Traditional Chinese RAG

Your choice of embedding model directly determines RAG search quality. BGE-M3's multilingual training, 1024-dimensional vectors, and matching Reranker make it a practical pick for Traditional Chinese RAG.

#rag #embedding #bge-m3 #multilingual #vector-search #cloudflare-workers-ai

ai guide RAG Systems in Practice Mar 12, 2026

Hybrid Search: Using BM25 + Vector Search to Cover Each Other's Blind Spots

Vector search handles semantics; BM25 handles keywords. Combining them with RRF is what lets you handle both fuzzy queries and exact terms at the same time.
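
As a rough illustration of the fusion step, here is a minimal sketch of Reciprocal Rank Fusion (RRF). It assumes each retriever returns an ordered list of document IDs. The function name `rrf_fuse`, the document IDs, and the constant `k=60` (a commonly used default) are illustrative choices, not details taken from the post.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of doc IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by doc ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword retriever and a vector retriever.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = rrf_fuse([bm25_hits, vector_hits])
print(fused)
```

Because RRF uses only rank positions, the BM25 and cosine scores never need to be normalized onto a common scale, and documents found by both retrievers rise above those found by only one.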

#rag #hybrid-search #bm25 #vector-search #rrf #embedding

ai guide Mar 12, 2026

HyDE: Boosting Vector Search Recall with Hypothetical Answers

Have an LLM generate an 'ideal answer' first, then embed that hypothetical document for search — it outperforms searching with the raw query.

#rag #hyde #embedding #vector-search #query-enhancement

ai guide Mar 12, 2026

Semantic Caching: Run the RAG Pipeline Only Once for Semantically Similar Queries

Caching doesn't have to match exact query strings -- semantically similar questions can hit the cache too, skipping the entire RAG pipeline execution.
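
The idea can be sketched as a cache keyed on query embeddings instead of query strings. This is a minimal sketch under stated assumptions: the class name `SemanticCache`, the 0.9 similarity threshold, and the two-dimensional toy vectors are all hypothetical stand-ins for a real embedding model and vector store.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Toy in-memory cache that matches on embedding similarity."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # assumed value; tune per embedding model
        self.entries = []           # list of (embedding, cached answer)

    def get(self, query_vec):
        """Return the answer of the closest cached query, or None on a miss."""
        best_score, best_answer = 0.0, None
        for vec, answer in self.entries:
            score = cosine(query_vec, vec)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None

    def put(self, query_vec, answer):
        self.entries.append((list(query_vec), answer))


cache = SemanticCache(threshold=0.9)
cache.put([1.0, 0.0], "cached answer")  # toy vector standing in for an embedding

hit = cache.get([0.99, 0.05])   # nearly the same direction
miss = cache.get([0.0, 1.0])    # orthogonal, unrelated query
print(hit, miss)
```

On a miss the caller would run the full RAG pipeline and `put` the result. The threshold trades hit rate against the risk of returning an answer to a question that only looks similar.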

#rag #semantic-cache #caching #vector-search #performance