Series
6 posts

RAG Systems in Practice (RAG 系統實戰)

A structured path through production RAG design, from failure modes and ranking to multi-agent orchestration.

ai guide RAG 系統實戰

The Complete Guide to RAG System Patterns: A Ten-Generation Evolution from Naive to Multi-Agent with Practical Navigation

RAG has evolved far beyond simple 'search + generate' into a technology ecosystem spanning ten generations. This article is a systematic navigation guide to those generations, from Naive RAG to Multi-Agent RAG, covering retrieval strategies, chunking, embedding, reranking, evaluation frameworks, observability, and cost optimization. Each topic has a dedicated deep-dive article.

ai debug RAG 系統實戰

RAG Common Failure Modes: 10 Problems and Their Solutions

When a RAG system breaks, 90% of the time it's one of these 10 failure modes. Identify which one first, then apply the matching fix — far more effective than optimizing blindly.

ai guide RAG 系統實戰

Hybrid Search: Using BM25 + Vector Search to Cover Each Other's Blind Spots

Vector search handles semantics; BM25 handles keywords. Combining them with RRF is what lets you handle both fuzzy queries and exact terms at the same time.
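As a rough illustration of the RRF (Reciprocal Rank Fusion) step mentioned above, here is a minimal sketch. It assumes each retriever has already returned a ranked list of document IDs; the function name `rrf_fuse`, the sample IDs, and the constant `k=60` (a commonly used default) are illustrative assumptions, not taken from the post itself.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of doc IDs with Reciprocal Rank Fusion.

    Each document earns 1 / (k + rank) from every list it appears in;
    the contributions are summed and documents are sorted by the total.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from the two retrievers (best hit first).
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = rrf_fuse([bm25_hits, vector_hits])
print(fused)
```

Because RRF only looks at rank positions, it sidesteps the problem that BM25 scores and vector similarities live on different scales.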

ai guide RAG 系統實戰

Multi-Agent RAG: Distributed Retrieval Architecture with Specialized Agent Collaboration

A single RAG Agent handling all queries hits knowledge boundaries and performance bottlenecks. Multi-Agent RAG dispatches retrieval tasks to multiple specialized Agents, each with its own knowledge base and retrieval strategy, coordinated by a central Orchestrator that merges results.
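The dispatch-and-merge idea can be sketched in a few lines. This is a toy sketch under stated assumptions, not the architecture from the post: the `Agent` class, the keyword-based routing, and the stub retrievers are all hypothetical, and it assumes relevance scores from different agents are comparable, which real systems usually have to normalize.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class Agent:
    name: str
    keywords: set[str]  # hypothetical routing hints for this agent's domain
    retrieve: Callable[[str], list[tuple[str, float]]]  # returns (doc_id, score)


def orchestrate(query: str, agents: list[Agent], top_k: int = 3):
    """Route the query to matching agents, then merge their results."""
    words = set(query.lower().split())
    # Dispatch: pick agents whose keywords overlap the query; fall back to all.
    selected = [a for a in agents if a.keywords & words] or agents
    merged: dict[str, tuple[float, str]] = {}
    for agent in selected:
        for doc_id, score in agent.retrieve(query):
            # Deduplicate: keep the highest-scoring copy of each document.
            if doc_id not in merged or score > merged[doc_id][0]:
                merged[doc_id] = (score, agent.name)
    ranked = sorted(merged.items(), key=lambda kv: -kv[1][0])
    return [(doc_id, name, score) for doc_id, (score, name) in ranked[:top_k]]


# Stub retrievers standing in for agents with their own knowledge bases.
legal = Agent("legal", {"contract", "clause"},
              lambda q: [("legal-1", 0.9), ("shared-1", 0.4)])
finance = Agent("finance", {"revenue", "invoice"},
                lambda q: [("fin-1", 0.8), ("shared-1", 0.6)])

results = orchestrate("termination clause and invoice terms", [legal, finance])
print(results)
```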

ai guide RAG 系統實戰

Building a Legal Contract RAG in 36 Hours: Weaviate Query Agent + ColQwen Architecture Breakdown

With Weaviate Query Agent and the ColQwen multi-vector model, a production-grade legal contract search system was built from a single prompt in 36 hours. This post breaks down its architecture, technology choices, and what you actually need to watch out for.

PageIndex: RAG Without Vectors — Turning Long Documents Into a Book With a Table of Contents

PageIndex skips chunking, embedding, and vector storage entirely. Instead it relies on LLM reasoning over a tree-structured table of contents the LLM itself wrote, achieving 98.7% on FinanceBench (GPT-4o reading directly scores only 31%). It solves a different problem than vector RAG — finding the right section in a well-structured long document.
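To make the "table of contents" idea concrete, here is a toy sketch of descending a section tree to locate the relevant pages. It is not PageIndex's actual API or implementation: the `Node` class, the sample report, and `pick_child` are hypothetical, and `pick_child` uses simple word overlap purely as a stand-in for the LLM reasoning step described above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    title: str
    summary: str
    pages: tuple[int, int]  # inclusive page range covered by this section
    children: list[Node] = field(default_factory=list)


def pick_child(query: str, children: list[Node]) -> Node:
    """Stand-in for the LLM: choose the child sharing the most words with the query."""
    q = set(query.lower().split())

    def overlap(node: Node) -> int:
        return len(q & set(f"{node.title} {node.summary}".lower().split()))

    return max(children, key=overlap)


def locate(query: str, root: Node):
    """Walk from the root to a leaf section, recording the path taken."""
    node, path = root, [root.title]
    while node.children:
        node = pick_child(query, node.children)
        path.append(node.title)
    return path, node.pages


# Hypothetical document tree for illustration only.
report = Node("Annual Report", "full document", (1, 80), [
    Node("Business Overview", "products markets strategy", (1, 20)),
    Node("Financial Statements", "income balance cash flow", (21, 80), [
        Node("Income Statement", "revenue expenses net income", (21, 40)),
        Node("Balance Sheet", "assets liabilities equity", (41, 60)),
    ]),
])

path, pages = locate("total revenue and net income", report)
print(path, pages)
```

No chunks or embeddings appear anywhere in this flow; the answer is found by navigating structure, which is why the approach depends on the document being well organized.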