#cloudflare-workers-ai

5 posts

ai May 9, 2026

2026 LLM Inference Provider Free Tiers & Pricing: 40+ Services Ranked by Tier

For side projects, toy demos, and RAG prototypes, nobody wants to enter a credit card on day one. This is a verified roundup of 40+ LLM inference providers still operating as of May 2026, tiered by whether free resources auto-replenish or are one-time grants. Each entry notes credit-card requirements, supported models, paid starting prices, and catches. Chinese-origin providers, including Zhipu GLM (permanently free), Doubao (2M tokens/day), Kimi, and DashScope, are covered, as is the local Ollama option.

#llm #inference #pricing #free-tier #cerebras #groq #cloudflare-workers-ai #gemini #openrouter #deepseek #nvidia-nim #modal #ollama #mistral

ai guide Apr 28, 2026

Gemma on Cloudflare Workers AI: A Pragmatic Choice for Traditional Chinese Applications

On Cloudflare Workers AI, gemma-3-12b-it follows Traditional Chinese instructions noticeably better than llama-3.1-8b-instruct. With Gemma 4 arriving in 2026, you get vision, function calling, and 256K context -- upgrade as needed.

#gemma #cloudflare-workers-ai #llm #traditional-chinese

tech guide Apr 17, 2026

The Full Picture of Cloudflare Workers AI Binding: It's More Than Just run()

env.AI is not just run(). It also exposes toMarkdown (document-to-Markdown conversion), autorag (managed RAG), gateway (external provider proxy), and models (metadata lookup). Understanding these four method groups is what unlocks Cloudflare as a full AI platform inside Workers.

#cloudflare-workers-ai #cloudflare #rag #ai-gateway #tomarkdown

ai guide Mar 12, 2026

BGE-M3: Why This Embedding Model Works Well for Traditional Chinese RAG

Your choice of embedding model directly determines RAG search quality. BGE-M3's multilingual training, 1024-dimensional vectors, and matching reranker make it a practical pick for Traditional Chinese RAG.

#rag #embedding #bge-m3 #multilingual #vector-search #cloudflare-workers-ai

tech deep-dive Mar 12, 2026

NobodyClimb AI Architecture: Building a 20-Node RAG Pipeline on Cloudflare Workers

A dynamically composable RAG pipeline built on Cloudflare Workers AI (gemma-3-12b-it + bge-m3): 14 base steps + 6 LangGraph-specific nodes, with three strategy graphs (Baseline / Agentic / Plan-Execute) selected at runtime.

#rag #cloudflare-workers-ai #llm #pipeline #gemma #embedding #hono #langgraph