For side projects, toy demos, and RAG prototypes, nobody wants to swipe a credit card on day one. This is a verified roundup of 40+ LLM inference providers still operating as of 2026/05, tiered by whether free resources auto-replenish or are one-time grants. Each entry notes credit-card requirements, supported models, paid starting prices, and catches. Chinese-origin providers including Zhipu GLM (permanently free), Doubao (2M tokens/day), Kimi, DashScope, and the Ollama local option are all included.
From $20/mo Pro to $200/mo Max 20x, Claude Code's Opus 4.6 delivers the strongest reasoning depth in the industry, and its Max plan's unlimited pricing saves heavy users over 90% compared to API costs.
Cursor CLI brings the IDE Agent into the terminal, supporting interactive TUI and headless modes, Plan/Ask/Agent three modes, Cloud Handoff, CI/CD integration, $20-200/mo.
Gemini CLI will be discontinued on 2026/06/18, with Antigravity CLI as the official successor. Before shutdown: free 60 req/min, 1,000 req/day, including Gemini 2.5 Pro and 1M token context window. Skills, Hooks, and Subagents can all be migrated.
Kiro's free plan includes 50 credits. Auto mode intelligently mixes models to save costs. Spec-Driven development upgrades vibe coding into traceable, structured workflows. Agent Hooks enable local CI/CD automation.
Codex is tied to ChatGPT subscriptions ($20-200/mo). GPT-5.4 + mini automatic routing is the highlight, and the CLI supports dual billing via Plan mode and API Key mode.