Skip to content
All tags

#cloudflare-workers

12 posts
tech deep-dive

When Vector Search Matches by Name Instead of Grade: Attribute Conflation in RAG Systems

Query: 'I just sent Beauty in the Mirror 5.11b — recommend routes of similar difficulty.' The results came back full of routes with similar-sounding names, not similar grades. Root cause: dense embeddings compress multiple attributes into a single vector, and the rarity of the route name drowns out the grade signal. The fix: three layers of defense — metadata pre-filtering, query rewriting, and score fusion.

tech guide

Cloudflare D1: SQLite Relational Database at the Edge

D1 is Cloudflare's serverless SQLite database that binds directly to Workers, supports full SQL (JOINs, transactions), and handles automatic backups. It's well-suited for small-to-medium relational data needs — NobodyClimb uses it as its primary database.

tech guide

Cloudflare KV: A Global Edge Key-Value Store

KV is Cloudflare's globally distributed key-value store. Reads are served from the nearest edge node with extremely low latency. It's ideal for caching, feature flags, and ephemeral data — but writes are eventually consistent.

tech guide

Cloudflare R2: An S3 Alternative with Zero Egress Fees

R2 is Cloudflare's object storage service — S3-compatible API, zero egress fees, and native Workers binding. Stop worrying about bandwidth bills for media-heavy applications.

tech guide

Cloudflare Workers: Not Lambda, Not Containers — It's V8 Isolates

Cloudflare Workers uses V8 Isolates instead of containers — no cold starts, global edge deployment, and direct access to D1, R2, KV, and AI via Bindings. Great for APIs, SSR, and lightweight backends; not suited for long-running tasks.

tech guide

Hono: The Lightweight Web Framework Built for Edge Runtimes

Hono is a web framework designed specifically for edge runtimes like Cloudflare Workers, Deno, and Bun. It's an order of magnitude lighter than Express, natively supports Web Standard APIs, and is the go-to choice for edge environments.

tech guide

@opennextjs/cloudflare: Running Next.js on Cloudflare Workers

@opennextjs/cloudflare enables Next.js 15 App Router deployments on Cloudflare Workers — dynamic SSR runs in a Worker, static assets are served from Cloudflare Assets. Zero server management, but with clear feature limitations.

tech guide

From Mock to Real AI: Integrating Cloudflare Workers AI into action-maker

Upgraded action-maker from hardcoded mock data to live Cloudflare Workers AI generation. The architecture splits into Worker (AI only), Server (data storage), and Frontend (orchestration). Hit two gotchas along the way: Qwen3's thinking block and the Workers AI response format.

ai deep-dive

Modular RAG Pipeline: Designing RAG as a Composable DAG

RAG doesn't have to be a rigid three-step process. It's a set of steps that can be dynamically enabled, skipped, or reordered. Pipeline as Code lets the system adapt its behavior without redeployment.

ai guide

RAG Streaming: Using SSE to Display LLM Responses as They Generate

LLM generation takes 3-5 seconds, and waiting for the full response before displaying it makes for a terrible experience. SSE pushes tokens as they're generated, reducing time-to-first-character from 5 seconds to under 1 second.

ai guide

RAG Quota System: Controlling LLM Costs with Dual Limits

Limiting request count alone is not enough — a single long query can consume ten times the tokens of a normal one. Dual quotas (request count + token count) are what truly control costs.

tech deep-dive

NobodyClimb: Building a Climbing Community Platform Entirely on Cloudflare

A climbing community platform where the web app, mobile app, and AI Q&A all run on Cloudflare — no dedicated servers.