#gemma

4 articles

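ai guide April 28, 2026

Gemma on Cloudflare Workers AI: A Pragmatic Choice for Traditional Chinese Applications

When running an LLM on Cloudflare Workers AI, gemma-3-12b-it follows Traditional Chinese instructions noticeably better than llama-3.1-8b-instruct. Gemma 4, which launched in 2026, adds Vision, Function calling, and a 256K context; upgrade if your needs call for it.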

#gemma #cloudflare-workers-ai #llm #traditional-chinese

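ai guide March 31, 2026

Small Models That Run on a Phone: Options and Limits in 2026

The main on-device LLMs in 2026 are Gemma 3n, Qwen 3.5 Small, Llama 3.2, Phi-4-mini, Ministral 3, and SmolLM3. Quantized models at 3B parameters or below can reach 30–50 tokens/sec on a phone with 8GB of RAM, but RAM, thermals, and the context window remain hard limits.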

#on-device-ai #small-models #mobile #quantization #llama #gemma #phi #qwen #mistral #smollm #mobilellm

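ai project March 31, 2026

The 2026 Q1 Open-Source LLM Landscape: A Complete Survey from Frontier Models to On-Device

Open-source models surged across the board in 2026 Q1. Among LLMs, GLM-5, Kimi K2.5, and Qwen3.5 caught up with closed-source models; Embedding and Reranker models are led by Qwen3 and BGE; speech has Voxtral TTS and Whisper V3; image generation has FLUX.2; and video has Wan 2.2, which draws level with Sora. This article is a complete map of the field.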

#open-source #llm #glm-5 #kimi #deepseek #qwen #llama #gemma #mistral #minimax #phi #smollm #gpt-oss #moe #on-device-ai #embedding #reranker #tts #stt #image-generation #video-generation #code-model #ollama #vllm

tech deep-dive March 12, 2026

NobodyClimb AI Architecture: Building a 20-Node RAG Pipeline on Cloudflare Workers

A dynamically composable RAG pipeline built on Cloudflare Workers AI (gemma-3-12b-it + bge-m3): 14 base steps plus 6 LangGraph-specific nodes, with dynamic switching among three strategy graphs (Baseline / Agentic / Plan-Execute).

#rag #cloudflare-workers-ai #llm #pipeline #gemma #embedding #hono #langgraph