#cloudflare-workers

12 posts

tech deep-dive Mar 28, 2026

When Vector Search Matches by Name Instead of Grade: Attribute Conflation in RAG Systems

Query: 'I just sent Beauty in the Mirror 5.11b — recommend routes of similar difficulty.' The results came back full of routes with similar-sounding names, not similar grades. Root cause: dense embeddings compress multiple attributes into a single vector, and the rarity of the route name drowns out the grade signal. The fix: three layers of defense — metadata pre-filtering, query rewriting, and score fusion.
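
The three layers can be sketched as follows. This is a minimal illustration under stated assumptions, not the post's actual implementation: the `RouteCandidate` shape, the YDS grade parser, the two-step grade window, and the `alpha` weight are all hypothetical choices made for the example.

```typescript
// Hypothetical candidate shape; in a real system this comes from the vector index.
interface RouteCandidate {
  name: string;
  grade: string;       // YDS grade, e.g. "5.11b"
  vectorScore: number; // similarity from the embedding search, assumed 0..1
}

// Map a YDS grade to a number so grades can be compared.
// Assumption: only 5.x grades with an optional a-d suffix are handled.
function gradeToNumber(grade: string): number | null {
  const match = /^5\.(\d{1,2})([a-d])?$/.exec(grade.trim());
  if (match === null) return null;
  const major = parseInt(match[1] ?? "", 10);
  const letter = match[2] ?? "";
  return major * 4 + (letter === "" ? 0 : "abcd".indexOf(letter));
}

// Layer 2, query rewriting (simplified): pull the grade out of the free text.
function extractGrade(query: string): string | null {
  const match = /5\.\d{1,2}[a-d]?/.exec(query);
  return match ? match[0] : null;
}

function recommend(
  query: string,
  candidates: RouteCandidate[],
  maxGradeGap = 2, // assumed tolerance, in letter-grade steps
  alpha = 0.3      // assumed weight of the embedding score in the fused score
): RouteCandidate[] {
  const grade = extractGrade(query);
  const target = grade === null ? null : gradeToNumber(grade);
  if (target === null) {
    // No grade in the query: fall back to plain vector ranking.
    return [...candidates].sort((a, b) => b.vectorScore - a.vectorScore);
  }
  const scored: { route: RouteCandidate; score: number }[] = [];
  for (const route of candidates) {
    const value = gradeToNumber(route.grade);
    if (value === null) continue;
    const gap = Math.abs(value - target);
    // Layer 1, metadata pre-filter: drop routes outside the grade window.
    if (gap > maxGradeGap) continue;
    // Layer 3, score fusion: blend embedding similarity with grade closeness.
    const gradeScore = 1 - gap / (maxGradeGap + 1);
    scored.push({ route, score: alpha * route.vectorScore + (1 - alpha) * gradeScore });
  }
  return scored.sort((a, b) => b.score - a.score).map((s) => s.route);
}
```

With this weighting, a route with a similar name but a distant grade is filtered out before ranking, no matter how high its embedding score is.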

#rag #vector-search #embedding #cloudflare-workers #recommendation-system

tech guide Mar 27, 2026

Cloudflare D1: SQLite Relational Database at the Edge

D1 is Cloudflare's serverless SQLite database that binds directly to Workers, supports full SQL (JOINs, transactions), and handles automatic backups. It's well-suited for small-to-medium relational data needs — NobodyClimb uses it as its primary database.

#cloudflare-d1 #sqlite #serverless #edge #cloudflare-workers #database

tech guide Mar 27, 2026

Cloudflare KV: A Global Edge Key-Value Store

KV is Cloudflare's globally distributed key-value store. Reads are served from the nearest edge node with extremely low latency. It's ideal for caching, feature flags, and ephemeral data, but writes are eventually consistent: a change can take up to 60 seconds or more to become visible at other locations.

#cloudflare-kv #key-value #cache #edge #cloudflare-workers

tech guide Mar 27, 2026

Cloudflare R2: An S3 Alternative with Zero Egress Fees

R2 is Cloudflare's object storage service — S3-compatible API, zero egress fees, and native Workers binding. Stop worrying about bandwidth bills for media-heavy applications.

#cloudflare-r2 #object-storage #s3 #cloudflare-workers

tech guide Mar 27, 2026

Cloudflare Workers: Not Lambda, Not Containers — It's V8 Isolates

Cloudflare Workers runs code in V8 isolates instead of containers, which gives it negligible cold starts, global edge deployment, and direct access to D1, R2, KV, and AI through bindings. It's great for APIs, SSR, and lightweight backends, but not suited to long-running tasks.

#cloudflare-workers #edge-compute #hono #wrangler #serverless

tech guide Mar 27, 2026

Hono: The Lightweight Web Framework Built for Edge Runtimes

Hono is a web framework designed specifically for edge runtimes like Cloudflare Workers, Deno, and Bun. It's an order of magnitude lighter than Express, natively supports Web Standard APIs, and is the go-to choice for edge environments.

#hono #cloudflare-workers #edge #web-framework

tech guide Mar 27, 2026

@opennextjs/cloudflare: Running Next.js on Cloudflare Workers

@opennextjs/cloudflare enables Next.js 15 App Router deployments on Cloudflare Workers: dynamic SSR runs in a Worker, and static assets are served through Workers Static Assets. There are no servers to manage, but not every Next.js feature is supported.

#opennextjs #cloudflare-workers #nextjs #deployment

tech guide Mar 24, 2026

From Mock to Real AI: Integrating Cloudflare Workers AI into action-maker

Upgraded action-maker from hardcoded mock data to live Cloudflare Workers AI generation. The architecture splits into Worker (AI only), Server (data storage), and Frontend (orchestration). Hit two gotchas along the way: Qwen3's thinking block and the Workers AI response format.
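
A hedged sketch of how the two gotchas might be handled. It assumes the thinking block arrives as a `<think>...</think>` span in the generated text, and that the result is either a plain string, an object with a `response` string, or an OpenAI-style `choices` array. Both helper names are hypothetical, and the shapes should be checked against the model actually in use.

```typescript
// Remove Qwen3-style reasoning so only the final answer remains.
function stripThinking(text: string): string {
  return text
    // Drop every complete <think>...</think> block.
    .replace(/<think>[\s\S]*?<\/think>/g, "")
    // Drop an unterminated block left behind if generation was cut off.
    .replace(/<think>[\s\S]*$/, "")
    .trim();
}

// Normalize the result into a string.
// Assumption: these three shapes cover the responses seen in practice.
function extractText(result: unknown): string {
  if (typeof result === "string") return result;
  if (typeof result === "object" && result !== null) {
    const r = result as { response?: unknown; choices?: unknown };
    if (typeof r.response === "string") return r.response;
    if (Array.isArray(r.choices)) {
      const first = r.choices[0] as { message?: { content?: unknown } } | undefined;
      const content = first?.message?.content;
      if (typeof content === "string") return content;
    }
  }
  throw new Error("Unrecognized AI response shape");
}
```

Calling `stripThinking(extractText(result))` before parsing keeps the reasoning text out of any downstream JSON parsing.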

#cloudflare-workers #hono #workers-ai #qwen3 #langfuse #nextjs #postgresql

ai deep-dive Mar 12, 2026

Modular RAG Pipeline: Designing RAG as a Composable DAG

RAG doesn't have to be a rigid three-step process. It's a set of steps that can be dynamically enabled, skipped, or reordered. Pipeline as Code lets the system adapt its behavior without redeployment.
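
To illustrate steps that can be enabled, skipped, or reordered from configuration, here is a simplified sketch. It uses a linear ordered list rather than a full DAG, and the step names, the `PipelineContext` shape, and the stub step bodies are all assumptions made for the example.

```typescript
interface PipelineContext {
  query: string;
  documents: string[];
  trace: string[]; // names of the steps that actually ran
}

type Step = (ctx: PipelineContext) => PipelineContext;

// Hypothetical step registry; real steps would call a retriever, a reranker, an LLM.
const steps: Record<string, Step> = {
  rewrite: (ctx) => ({
    ...ctx,
    query: ctx.query.trim().toLowerCase(),
    trace: [...ctx.trace, "rewrite"],
  }),
  retrieve: (ctx) => ({
    ...ctx,
    documents: [`doc about ${ctx.query}`],
    trace: [...ctx.trace, "retrieve"],
  }),
  rerank: (ctx) => ({
    ...ctx,
    documents: [...ctx.documents].sort(),
    trace: [...ctx.trace, "rerank"],
  }),
};

interface StepConfig {
  name: string;
  enabled: boolean;
}

// The config is plain data, so it could be loaded at runtime (for example from
// a key-value store) and change pipeline behavior without a redeploy.
function runPipeline(config: StepConfig[], query: string): PipelineContext {
  let ctx: PipelineContext = { query, documents: [], trace: [] };
  for (const { name, enabled } of config) {
    if (!enabled) continue; // skipped steps leave the context untouched
    const step = steps[name];
    if (step === undefined) throw new Error(`Unknown step: ${name}`);
    ctx = step(ctx);
  }
  return ctx;
}
```

Reordering the entries in `config` reorders execution, and flipping `enabled` skips a step, with no change to the step code itself.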

#rag #pipeline #architecture #modular #dag #cloudflare-workers

ai guide Mar 12, 2026

RAG Streaming: Using SSE to Display LLM Responses as They Generate

LLM generation takes 3-5 seconds, and waiting for the full response before displaying anything makes for a terrible experience. SSE pushes tokens as they're generated, cutting time to first character from as long as 5 seconds to under 1 second.
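
A minimal sketch of the server side, using only Web Standard APIs (`ReadableStream`, `TextEncoder`, `Response`). The function names, the `done` event, and its `[DONE]` payload are assumptions for illustration; the token source is any `AsyncIterable<string>`.

```typescript
// Encode one SSE frame. Multi-line payloads need a "data:" prefix on every
// line, and a blank line terminates the event.
function formatSseEvent(data: string, event?: string): string {
  const lines: string[] = [];
  if (event !== undefined) lines.push(`event: ${event}`);
  for (const line of data.split("\n")) lines.push(`data: ${line}`);
  return lines.join("\n") + "\n\n";
}

// Wrap a token stream in a streaming Response so the client can render each
// token as soon as it is produced.
function streamTokens(tokens: AsyncIterable<string>): Response {
  const encoder = new TextEncoder();
  const iterator = tokens[Symbol.asyncIterator]();
  const body = new ReadableStream<Uint8Array>({
    async pull(controller) {
      const result = await iterator.next();
      if (result.done) {
        // Assumed convention: a final "done" event tells the client to stop.
        controller.enqueue(encoder.encode(formatSseEvent("[DONE]", "done")));
        controller.close();
        return;
      }
      controller.enqueue(encoder.encode(formatSseEvent(result.value)));
    },
    async cancel() {
      // Stop generating if the client disconnects.
      await iterator.return?.();
    },
  });
  return new Response(body, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
    },
  });
}
```

Because the stream uses `pull`, a token is only requested from the source when the consumer is ready for more, which keeps backpressure intact.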

#rag #streaming #sse #server-sent-events #cloudflare-workers #ux

ai guide Mar 12, 2026

RAG Quota System: Controlling LLM Costs with Dual Limits

Limiting request count alone is not enough — a single long query can consume ten times the tokens of a normal one. Dual quotas (request count + token count) are what truly control costs.
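
The dual check can be sketched as a pair of pure functions. The type names, the limits, and the idea of estimating tokens before the call and recording the actual count afterward are assumptions for illustration; persistence of the counters is out of scope here.

```typescript
interface QuotaLimits {
  maxRequests: number;
  maxTokens: number;
}

interface QuotaUsage {
  requests: number;
  tokens: number;
}

type QuotaDecision =
  | { allowed: true }
  | { allowed: false; reason: "request_limit" | "token_limit" };

// Reject if EITHER budget would be exceeded by this request.
function checkQuota(
  usage: QuotaUsage,
  limits: QuotaLimits,
  estimatedTokens: number
): QuotaDecision {
  if (usage.requests + 1 > limits.maxRequests) {
    return { allowed: false, reason: "request_limit" };
  }
  if (usage.tokens + estimatedTokens > limits.maxTokens) {
    return { allowed: false, reason: "token_limit" };
  }
  return { allowed: true };
}

// After the LLM call, charge the tokens that were actually consumed.
function recordUsage(usage: QuotaUsage, actualTokens: number): QuotaUsage {
  return {
    requests: usage.requests + 1,
    tokens: usage.tokens + actualTokens,
  };
}
```

A user with only 3 requests on the books can still be rejected if those requests already consumed most of the token budget, which a request counter alone would miss.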

#rag #quota #rate-limiting #token-budget #cost-control #cloudflare-workers

tech deep-dive Mar 12, 2026

NobodyClimb: Building a Climbing Community Platform Entirely on Cloudflare

A climbing community platform where the web app, mobile app, and AI Q&A all run on Cloudflare — no dedicated servers.

#cloudflare-workers #nextjs #hono #rag #react-native #monorepo