BEAM

Features

Rankings

Pro

Docs

GitHub

LAUNCH APP

Use OpenRouter Models in Big-AGI.

Bring your own key: OpenRouter's API rates, no markup. Keys and chats stay in your browser. Run OpenRouter in parallel with other models, then compare and merge the answers.

Gemini 3.6 Flash

Gemini 3.5 Flash Lite

Laguna S 2.1 (free) ·

Launch Big-AGI

OpenRouter gives you the models. Big-AGI adds the workspace.

OpenRouter is the routing layer: one key, hundreds of models, automatic fallback. It is not where you organize the work. Point Big-AGI at that key and the catalog turns into an expert workspace.

Capability	Through your OpenRouter key	Inside Big-AGI
Model catalog	Hundreds, one key	The same catalog, same key
Reasoning and parameter control	Router-flattened	Vendor-accurate ranges
Merging multiple answers	Basic Fusion	Advanced and custom merges
Personas and saved prompts	Not the router's job	Built in, with memory
Chat history and attachments	Bring your own	Local-first, yours
Request transparency	API-level	AI Inspector on every call

Start for $0

You do not need credits to begin. OpenRouter serves 20+ free models, including Llama, Qwen, and GPT-OSS, and Big-AGI runs them the moment your key is linked. Add a Google key for the Gemini free tier and you have a second vendor at no cost. Free usage runs under OpenRouter's own daily and per-minute limits, so it is built for trying things, not production loads. When you outgrow it, the same key unlocks every paid model at OpenRouter's rates.

All supported OpenRouter models

ModelContextInputOutputReleased

Gemini 3.6 Flash

NEW

Gemini 3.6 Flash is a high-efficiency model from Google for coding, agentic workflows, and web and app development. It is designed to produce polished outputs…

$1.5

$7.5

Jul 2026

Gemini 3.5 Flash Lite

NEW

Gemini 3.5 Flash Lite is a high-efficiency model from Google with upgraded agentic capabilities. It is suited for subagents that execute focused tasks within c…

$0.3

$2.5

Jul 2026

Laguna S 2.1 (free) ·

NEW

Laguna S 2.1 is the latest coding agent model from Poolside. Laguna S 2.1 is a 118B total parameter model with 8B active parameters, scoring 70.2% on Terminal-…

262K

Jul 2026

LongCat 2.0

NEW

LongCat 2.0 is a sparse mixture-of-experts language model from Meituan, with 48B active parameters out of 1.6T total. It is suited for coding, repository-level…

$0.3

$1.2

Jul 2026

Inkling

NEW

Inkling is an open-weight multimodal mixture-of-experts model from Thinking Machines Lab, with 41B active parameters out of 975B total. It is designed for gene…

524K

$4.05

Jul 2026

Auto Router (Beta)

NEW

Auto Router (Beta) is a task-aware router from OpenRouter. It classifies each request, then routes it the most popular model for that task based on aggregate s…

Jul 2026

Kimi K3

NEW

Kimi K3 is a 2.8T parameter open-weight multimodal reasoning model from Moonshot AI. It is suited for complex coding, knowledge work, and long-horizon agentic…

$15

Jul 2026

Muse Spark 1.1

NEW

Muse Spark 1.1 is a multimodal reasoning model from Meta, built for agentic tasks. It accepts text, images, video, audio, and PDF documents and returns text, w…

$1.25

$4.25

Jul 2026

KAT-Coder-Air V2.5 (free) ·

NEW

KAT-Coder-Air V2.5 is a flagship-level Agentic Coding model that can directly hand over an entire issue or an entire business workflow to it, allowing it to au…

256K

Jul 2026

KAT-Coder-Pro V2.5

NEW

KAT-Coder-Pro V2.5 is a flagship-level Agentic Coding model that can directly hand over an entire issue or an entire business workflow to it, allowing it to au…

256K

$0.74

$2.96

Jul 2026

GPT-5.6 Luna Pro

NEW

GPT-5.6 Luna Pro is the same underlying model as GPT-5.6 Luna, served with `reasoning.mode` set to `pro` for higher-quality responses on complex tasks. Learn m…

1.1M

Jul 2026

GPT-5.6 Sol Pro

NEW

GPT-5.6 Sol Pro is the same underlying model as GPT-5.6 Sol, served with `reasoning.mode` set to `pro` for higher-quality responses on complex tasks. Learn mor…

1.1M

$30

Jul 2026

GPT-5.6 Terra Pro

NEW

GPT-5.6 Terra Pro is the same underlying model as GPT-5.6 Terra, served with `reasoning.mode` set to `pro` for higher-quality responses on complex tasks. Learn…

1.1M

$2.5

$15

Jul 2026

Grok 4.5

NEW

Grok 4.5 is SpaceXAI's smartest model with frontier performance on coding, knowledge work, and STEM.

500K

Jul 2026

Grok Latest

NEW

This model always redirects to the latest Grok model from xAI.

500K

Jul 2026

Aion-3.0

NEW

Aion-3.0 is a multi-model roleplaying and storytelling system from AionLabs, built on the GLM family of models. It uses a collaborative generation process in w…

131K

Jul 2026

Aion-3.0-Mini

NEW

Aion-3.0 Mini is a multi-model roleplaying and storytelling system from AionLabs, built on the DeepSeek family of models. It uses a collaborative generation pr…

131K

$0.7

$1.4

Jul 2026

Hy3

NEW

Hy3 is a 295B-parameter Mixture-of-Experts model from Tencent (21B active, 192 experts with top-8 routing) built for reasoning, agentic workflows, and real-wor…

262K

$0.14

$0.58

Jul 2026

Laguna XS 2.1 (free) ·

NEW

Laguna XS 2.1 is the latest coding agent model in the 33B-A3B category from Poolside and a step forward from their Laguna XS.2 model (released in April 2026).…

262K

Jul 2026

Nano Banana 2 Lite (Gemini 3.1 Flash Lite Image)

NEW

Nano Banana 2 Lite (Gemini 3.1 Flash Lite Image) is Google's fastest, most cost-efficient Gemini image model, built for high-velocity developer pipelines and r…

66K

$0.25

$1.5

Jun 2026

Claude Sonnet 5

NEW

Sonnet 5 is Anthropic's most capable Sonnet-class model, with frontier performance across coding, agents, and professional work. It supports adaptive thinking…

$10

Jun 2026

GPT-5.6 Sol

NEW

GPT-5.6 Sol is the flagship model in OpenAI's GPT-5.6 series. It is suited for complex reasoning, coding, and agentic workflows, and is particularly strong at…

1.1M

$30

Jun 2026

GPT-5.6 Terra

NEW

GPT-5.6 Terra is a balanced model in OpenAI's GPT-5.6 series, positioned between the flagship Sol tier and the cost-efficient Luna tier. It is suited for every…

1.1M

$2.5

$15

Jun 2026

GPT-5.6 Luna

NEW

GPT-5.6 Luna is a fast, cost-efficient model in OpenAI's GPT-5.6 series. It is suited for high-volume, latency-sensitive tasks such as chat, classification, an…

1.1M

Jun 2026

Fugu Ultra

NEW

Fugu Ultra is the higher-performance model in Sakana AI's Fugu family. Rather than a single monolithic model, Fugu is a learned multi-agent orchestration syste…

$30

Jun 2026

Nex-N2-Mini

NEW

Nex-N2-Mini is an open-source agentic mixture-of-experts model from Nex AGI, the smaller sibling in the Nex-N2 series. It accepts text and image input and is b…

262K

$0.03

$0.1

Jun 2026

North Mini Code (free) ·

NEW

North Mini Code is Cohere's first agentic coding model and the debut of its North family. A sparse mixture-of-experts model with 30B total parameters and 3B ac…

256K

Jun 2026

GLM 5.2

NEW

GLM 5.2 is a large-scale reasoning model from Z.ai. It supports text input and output with a 1M-token context window, and is suited for long-horizon agent work…

$0.8

$2.52

Jun 2026

Fusion

NEW

Fusion turns your prompt into a small multi-model deliberation. A panel of expert models (see below) analyzes your prompt in parallel with web search and web f…

Jun 2026

Claude Fable 5

NEW

Claude Fable 5 is a Mythos-class model from Anthropic, built for autonomous knowledge work and coding. It supports text, image, and file inputs with text outpu…

$10

$50

Jun 2026

Claude Fable Latest

NEW

This model always redirects to the latest model in the Claude Fable family.

$10

$50

Jun 2026

Nex-N2-Pro

NEW

Nex-N2-Pro is an agentic mixture-of-experts model from Nex AGI, with 17B active parameters out of 397B total. Built on the Qwen3.5 architecture, it accepts tex…

262K

$0.25

Jun 2026

Nemotron 3 Ultra

NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybri…

512K

$0.6

$3.6

Jun 2026

Nemotron 3.5 Content Safety (free) ·

NVIDIA Nemotron 3.5 Content Safety is a compact 4B-parameter multimodal guardrail model from NVIDIA, fine-tuned from Google Gemma-3-4B. It moderates both input…

128K

Jun 2026

Qwen3.7 Plus

Qwen3.7-Plus is a cost-effective model in Alibaba's Qwen3.7 series. It supports text and image input with text output, building on the series' text capabilitie…

$0.32

$1.28

Jun 2026

Kimi K2.7 Code

MoonshotAI: Kimi K2.7 Code is a coding-focused model in Moonshot AI's Kimi K2 family, built to complete end-to-end programming tasks reliably over long context…

262K

$0.82

$3.75

Jun 2026

MiniMax M3

MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a 1M-token context window, and is suited…

$0.3

$1.2

May 2026

Claude Opus 4.8

Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family. It supports text, image, and file inputs with text output, with reaso…

$25

May 2026

Nano Banana 2 (Gemini 3.1 Flash Image)

Gemini 3.1 Flash Image, a.k.a. "Nano Banana 2," is Google’s latest state of the art image generation and editing model, delivering Pro-level visual quality at…

131K

$0.5

May 2026

Step 3.7 Flash

Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter language backbone with a vision encoder for n…

262K

$0.2

$1.15

May 2026

Nano Banana Pro (Gemini 3 Pro Image)

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro. It extends the original Nano Banana with significantly imp…

131K

$12

May 2026

Claude Opus 4.8 (Fast)

Fast-mode variant of Opus 4.8 - identical capabilities with higher output speed at 2x pricing relative to regular Opus 4.8. Learn more in Anthropic's docs: htt…

$10

$50

May 2026

Qwen3.7 Max

Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text input and output and is designed for agent-centric workloads, with particular s…

$1.48

$4.43

May 2026

Grok Build 0.1

Grok Build 0.1 is xAI’s fast coding model trained specifically for agentic software engineering workflows. It supports text and image inputs with text output,…

256K

May 2026

Gemini 3.5 Flash

Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed. It is highly optimize…

$1.5

May 2026

Perceptron Mk1

Perceptron Mk1 (Mark One) is Perceptron's highest-quality vision-language model for video and embodied reasoning.** It accepts image and video inputs paired wi…

33K

$0.15

$1.5

May 2026

Claude Opus 4.7 (Fast)

Fast-mode variant of Opus 4.7 - identical capabilities with higher output speed at premium 6x pricing. Learn more in Anthropic's docs: https://platform.claude.…

$30

$150

May 2026

Ring-2.6-1T

Ring-2.6-1T is a 1T-parameter-scale thinking model with 63B active parameters, built for real-world agent workflows that require both strong capability and ope…

262K

$0.08

$0.63

May 2026

Gemini 3.1 Flash Lite

Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized for low-latency, high-volume workloads. It supports text, image, video, audio,…

$0.25

$1.5

May 2026

GPT Chat Latest

400K

$30

May 2026

Mistral Medium 3.5

Mistral Medium 3.5 is a dense 128B instruction-following model from Mistral AI. It supports text and image inputs with text output, and is designed for agentic…

262K

$1.5

$7.5

Apr 2026

Granite 4.1 8B

Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from IBM, part of the Granite 4.1 family. It supports a 131K-token context window an…

131K

$0.05

$0.1

Apr 2026

Laguna M.1

Laguna M.1 is the flagship coding agent model from Poolside, optimized for complex software engineering tasks. Designed for agentic coding workflows, it suppor…

262K

$0.2

$0.4

Apr 2026

Nemotron 3 Nano Omni (free) ·

NVIDIA Nemotron™ 3 Nano Omni is a 30B-A3B open multimodal model designed to function as a perception and context sub-agent in enterprise agent systems. It acce…

256K

Apr 2026

Owl Alpha ·

Owl Alpha is a high-performance foundation model designed for agentic workloads. Natively supports tool use, and long-context tasks, with strong performance in…

Apr 2026

Laguna XS.2

Laguna XS.2 is the second-generation model in the XS size class from Poolside, their efficient coding agent series. It combines tool calling and reasoning capa…

262K

$0.1

$0.2

Apr 2026

Google Gemini Flash Latest

This model always redirects to the latest model in the Google Gemini Flash family.

$1.5

$7.5

Apr 2026

Google Gemini Pro Latest

This model always redirects to the latest model in the Google Gemini Pro family.

$12

Apr 2026

Qwen3.6 27B

Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba, released in April 2026. It features hybrid multimodal capabilities —…

262K

$0.45

$2.7

Apr 2026

Qwen3.6 Flash

Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3.6 series. It supports text, image, and video input with a 1M token context window. Tier…

$0.19

$1.13

Apr 2026

Qwen3.6 35B A3B

Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 billion total parameters and 3 billion active parameters per token. It uses a hyb…

262K

$0.14

Apr 2026

Qwen3.6 Max Preview

Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on a sparse mixture-of-experts architecture with approximately 1 trillion total pa…

262K

$1.04

$6.24

Apr 2026

OpenAI GPT Latest

This model always redirects to the latest model in the OpenAI GPT family.

1.1M

$30

Apr 2026

MoonshotAI Kimi Latest

This model always redirects to the latest model in the MoonshotAI Kimi family.

$15

Apr 2026

OpenAI GPT Mini Latest

This model always redirects to the latest model in the OpenAI GPT Mini family.

400K

$0.75

$4.5

Apr 2026

Anthropic Claude Sonnet Latest

This model always redirects to the latest model in the Anthropic Claude Sonnet family.

$10

Apr 2026

Anthropic Claude Haiku Latest

This model always redirects to the latest model in the Anthropic Claude Haiku family.

200K

Apr 2026

Qwen3.5 Plus 2026-04-20

Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibaba. It accepts text, image, and video input and produces text output, with a 1M…

$0.3

$1.8

Apr 2026

DeepSeek V4 Pro

DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context…

$0.44

$0.87

Apr 2026

DeepSeek V4 Flash

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-to…

$0.1

$0.2

Apr 2026

GPT-5.5

GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher reliability, and improved t…

1.1M

$30

Apr 2026

Ling-2.6-1T

Ling-2.6-1T is an instant (instruct) model from inclusionAI and the company’s trillion-parameter flagship, designed for real-world agents that require fast exe…

262K

$0.08

$0.63

Apr 2026

GPT-5.5 Pro

GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workloads. It features a 1M+ token context wind…

1.1M

$30

$180

Apr 2026

MiMo-V2.5

MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in…

1.1M

$0.14

$0.28

Apr 2026

MiMo-V2.5-Pro

MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in general agentic capabilities, complex software engineering, and long-horizon tasks,…

1.1M

$0.44

$0.87

Apr 2026

Hy3 preview

Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agentic workflows and production use. It supports configurable reasoning le…

262K

$0.06

$0.21

Apr 2026

Ling-2.6-flash

Ling-2.6-flash is an instant (instruct) model from inclusionAI with 104B total parameters and 7.4B active parameters, designed for real-world agents that requi…

262K

$0.01

$0.03

Apr 2026

GPT-5.4 Image 2

GPT-5.4 Image 2 combines OpenAI's GPT-5.4 model with state-of-the-art image generation capabilities from GPT Image 2. It enables rich multimodal workflows, all…

272K

$15

Apr 2026

Pareto Code Router

The Pareto Router maintains a tiered shortlist of strong coding models, ranked by Artificial Analysis coding percentiles. Set min_coding_score between 0 and 1…

Apr 2026

Claude Opus Latest

This model always redirects to the latest model in the Claude Opus family.

$25

Apr 2026

Kimi K2.6

Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon coding, coding-driven UI/UX generation, and multi-agent orchestration. I…

262K

$0.68

$3.42

Apr 2026

Grok 4.3

Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with text output, and is suited for agentic workflows, instruction-following tasks, an…

$1.25

$2.5

Apr 2026

Claude Opus 4.7

Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asynchronous agents. Building on the coding and agentic strengths of Opus 4…

$25

Apr 2026

GLM 5.1

GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. Unlike previous models built around min…

205K

$0.97

$3.04

Apr 2026

Gemma 4 26B A4B

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token du…

262K

$0.07

$0.34

Apr 2026

Gemma 4 31B

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window,…

262K

$0.12

$0.37

Apr 2026

Qwen3.6 Plus

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and…

$0.33

$1.95

Apr 2026

Trinity Large Thinking

Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI. It shows strong performance in PinchBench, agentic workloads, and r…

262K

$0.25

$0.8

Apr 2026

GLM 5V Turbo

GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-based coding and agent-driven tasks. It natively handles image, video,…

203K

$1.2

Apr 2026

Grok 4.20

Grok 4.20 is a reasoning model from xAI with industry-leading speed and agentic tool calling capabilities. It combines the lowest hallucination rate on the mar…

$1.25

$2.5

Mar 2026

Grok 4.20 Multi-Agent

Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep re…

$1.25

$2.5

Mar 2026

Lyria 3 Pro Preview ·

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can…

Mar 2026

Lyria 3 Clip Preview ·

30 second duration clips are priced at $0.04 per clip. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, y…

Mar 2026

KAT-Coder-Pro V2

KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder series, designed for complex enterprise-grade software engineering and SaaS integr…

262K

$0.3

$1.2

Mar 2026

Reka Edge

Reka Edge is an extremely efficient 7B multimodal vision-language model that accepts image/video+text inputs and generates text outputs. This model is optimize…

16K

$0.1

Mar 2026

MiniMax M2.7

MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous improvement. Built to actively participa…

205K

$0.25

Mar 2026

GPT-5.4 Mini

GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized for high-throughput workloads. It supports text and image inpu…

400K

$0.75

$4.5

Mar 2026

GPT-5.4 Nano

GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, optimized for speed-critical and high-volume tasks. It supports text and…

400K

$0.2

$1.25

Mar 2026

Mistral Small 4

Mistral Small 4 is the next major release in the Mistral Small family, unifying the capabilities of several flagship Mistral models into a single system. It co…

262K

$0.15

$0.6

Mar 2026

GLM 5 Turbo

GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments such as OpenClaw scenarios. It is deeply o…

203K

$1.2

Mar 2026

Nemotron 3 Super (free) ·

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-…

262K

Mar 2026

Qwen3.5-9B

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-…

262K

$0.1

$0.15

Mar 2026

Seed-2.0-Lite

Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers strong multimodal and agent capabilities while offering noticeably lower latenc…

262K

$0.25

Mar 2026

GPT-5.4

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ token context window (922K input, 128K outp…

1.1M

$2.5

$15

Mar 2026

GPT-5.4 Pro

GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabilities for complex, high-stakes tasks. It…

1.1M

$30

$180

Mar 2026

Mercury 2

Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and r…

128K

$0.25

$0.75

Mar 2026

Gemini 3.1 Flash Lite Preview

Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases. It outperforms Gemini 2.5 Flash Lite on overall quality an…

$0.25

$1.5

Mar 2026

GPT-5.3 Chat

GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations smoother, more useful, and more directly helpful. It delivers more acc…

128K

$1.75

$14

Mar 2026

Seed-2.0-Mini

Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive scenarios, emphasizing fast response and flexible inference deployment. It delive…

262K

$0.1

$0.4

Feb 2026

Nano Banana 2 (Gemini 3.1 Flash Image Preview)

Gemini 3.1 Flash Image Preview, a.k.a. "Nano Banana 2," is Google’s latest state of the art image generation and editing model, delivering Pro-level visual qua…

131K

$0.5

Feb 2026

Qwen3.5-Flash

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-expert…

$0.07

$0.26

Feb 2026

Qwen3.5-35B-A3B

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixtu…

262K

$0.14

Feb 2026

Qwen3.5-122B-A10B

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-expe…

262K

$0.26

$2.08

Feb 2026

Qwen3.5-27B

The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed an…

262K

$0.26

$2.6

Feb 2026

LFM2-24B-A2B

LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment. Built as a 24B parameter Mixture-of-E…

128K

$0.03

$0.12

Feb 2026

Aion-2.0

Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and storytelling. It is particularly strong at introducing tension, crises, and conf…

131K

$0.8

$1.6

Feb 2026

Gemini 3.1 Pro Preview

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more effic…

$12

Feb 2026

Gemini 3.1 Pro Preview Custom Tools

Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection behavior by preventing overuse of a general bash tool when more…

$12

Feb 2026

Claude Sonnet 4.6

Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work. It excels at iterative d…

$15

Feb 2026

Qwen3.5 Plus 2026-02-15

The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-exp…

$0.26

$1.56

Feb 2026

Qwen3.5 397B A17B

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-…

262K

$0.39

$2.34

Feb 2026

MiniMax M2.5

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex real-world digital working environments…

205K

$0.15

$0.9

Feb 2026

GLM 5

GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows. Built for expert developers, it d…

205K

$0.95

$2.55

Feb 2026

Qwen3 Max Thinking

Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that require deep, multi-step reasoning. By si…

262K

$0.78

$3.9

Feb 2026

Claude Opus 4.6

Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks. It is built for agents that operate across entire workflows rather than…

$25

Feb 2026

GPT-5.3-Codex

GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoni…

400K

$1.75

$14

Feb 2026

Qwen3 Coder Next

Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local development workflows. It uses a sparse MoE design with 80B tota…

262K

$0.11

$0.8

Feb 2026

Free Models Router ·

The simplest way to get free inference. openrouter/free is a router that selects free models at random from the models available on OpenRouter. The router smar…

200K

Feb 2026

Step 3.5 Flash

Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on a sparse Mixture of Experts (MoE) architecture, it selectively activates only 1…

262K

$0.1

$0.3

Jan 2026

Solar Pro 3

Solar Pro 3 is Upstage's powerful Mixture-of-Experts (MoE) language model. With 102B total parameters and 12B active parameters per forward pass, it delivers e…

128K

$0.15

$0.6

Jan 2026

Kimi K2.5

Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm. Built on Kim…

262K

$0.57

$2.85

Jan 2026

MiniMax M2-her

MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, character-driven chat, and expressive multi-turn conversations. Designed…

66K

$0.3

$1.2

Jan 2026

Palmyra X5

Palmyra X5 is Writer's most advanced model, purpose-built for building and scaling AI agents across the enterprise. It delivers industry-leading speed and effi…

$0.6

Jan 2026

LFM2.5-1.2B-Instruct (free) ·

LFM2.5-1.2B-Instruct is a compact, high-performance instruction-tuned model built for fast on-device AI. It delivers strong chat quality in a 1.2B parameter fo…

33K

Jan 2026

LFM2.5-1.2B-Thinking (free) ·

LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agentic tasks, data extraction, and RAG—while still running comfortably on edge dev…

33K

Jan 2026

GLM 4.7 Flash

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, st…

203K

$0.06

$0.4

Jan 2026

Seed 1.6 Flash

Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporting both text and visual understanding. It features a 256k context win…

262K

$0.08

$0.3

Dec 2025

Seed 1.6

Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates multimodal capabilities and adaptive deep thinking with a 256K context…

262K

$0.25

Dec 2025

MiniMax M2.1

MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10…

205K

$0.3

$1.2

Dec 2025

GLM 4.7

GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution.…

205K

$0.4

$1.75

Dec 2025

Gemini 3 Flash Preview

Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic workflows, multi turn chat, and coding assistance. It delivers near Pro…

$0.5

Dec 2025

Nemotron 3 Nano 30B A3B

NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers to build specialized agentic AI system…

262K

$0.05

$0.2

Dec 2025

GPT-5.2 Chat

GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized for low-latency chat while retaining strong general intelligence. It us…

128K

$1.75

$14

Dec 2025

GPT-5.2

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long context perfomance compared to GPT-5.1. It uses adaptive rea…

400K

$1.75

$14

Dec 2025

GPT-5.2 Pro

GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and long context performance over GPT-5 Pro. It is optimized for com…

400K

$21

$168

Dec 2025

GPT-5.2-Codex

GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and coding workflows. It is designed for both interactive development…

400K

$1.75

$14

Dec 2025

Devstral 2 2512

Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic coding. It is a 123B-parameter dense transformer model supporting a 25…

262K

$0.4

Dec 2025

GLM 4.6V

GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across images, documents, and mixed media. It s…

131K

$0.3

$0.9

Dec 2025

Relace Search

The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to explore a codebase and return relevant files to the user request. In contrast to…

256K

Dec 2025

Body Builder (beta)

Transform your natural language requests into structured OpenRouter API request objects. Describe what you want to accomplish with AI models, and Body Builder…

128K

Dec 2025

Ministral 3 8B 2512

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities.

262K

$0.15

Dec 2025

Ministral 3 3B 2512

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities.

131K

$0.1

Dec 2025

Ministral 3 14B 2512

The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counte…

262K

$0.2

Dec 2025

Nova 2 Lite

Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos to generate text. Nova 2 Lite demonstrat…

$0.3

$2.5

Dec 2025

DeepSeek V3.2

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance. It introduc…

164K

$0.27

$0.4

Dec 2025

Trinity Mini

Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 128 experts with 8 active per token. Engineered for efficient re…

131K

$0.05

$0.15

Dec 2025

Mistral Large 3 2512

Mistral Large 3 2512 is Mistral’s most capable model to date, featuring a sparse mixture-of-experts architecture with 41B active parameters (675B total), and r…

262K

$0.5

$1.5

Dec 2025

Claude Opus 4.5

Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, and long-horizon computer use. It offers…

200K

$25

Nov 2025

Olmo 3 32B Think

Olmo 3 32B Think is a large-scale, 32-billion-parameter model purpose-built for deep reasoning, complex logic chains and advanced instruction-following scenari…

66K

$0.15

$0.5

Nov 2025

Nano Banana Pro (Gemini 3 Pro Image Preview)

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro. It extends the original Nano Banana with significantly imp…

66K

$12

Nov 2025

GPT-5.1-Codex-Max

GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, high-context software development tasks. It is based on an updated versio…

400K

$1.25

$10

Nov 2025

GPT-5.1

GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpose reasoning, improved instruction adherence, and a more natural…

400K

$1.25

$10

Nov 2025

Cogito v2.1 671B

Cogito v2.1 671B MoE represents one of the strongest open models globally, matching performance of frontier closed and open models. This model is trained using…

128K

$1.25

Nov 2025

GPT-5.1-Codex-Mini

GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex

400K

$0.25

Nov 2025

GPT-5.1-Codex

GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows. It is designed for both interactive development sess…

400K

$1.25

$10

Nov 2025

GPT-5.1 Chat

GPT-5.1 Chat (AKA Instant is the fast, lightweight member of the 5.1 family, optimized for low-latency chat while retaining strong general intelligence. It use…

128K

$1.25

$10

Nov 2025

Kimi K2 Thinking

Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trilli…

262K

$0.6

$2.5

Nov 2025

Nova Premier 1.0

Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models.

$2.5

$12.5

Oct 2025

Voxtral Small 24B 2507

Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It…

32K

$0.1

$0.3

Oct 2025

Sonar Pro Search

Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity's most advanced agentic search system. It is designed for deeper rea…

200K

$15

Oct 2025

gpt-oss-safeguard-20b

gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b. This open-weight, 21B-parameter Mixture-of-Experts (MoE) model offers low…

131K

$0.08

$0.3

Oct 2025

Nemotron Nano 12B 2 VL (free) ·

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and document intelligence. It introduces a…

128K

Oct 2025

Qwen3 VL 32B Instruct

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video.…

131K

$0.1

$0.42

Oct 2025

MiniMax M2

MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. With 10 billion activated parameters (230…

205K

$0.3

$1.2

Oct 2025

Granite 4.0 Micro

Granite-4.0-H-Micro is a 3B parameter from the Granite 4 family of models. These models are the latest in a series of models released by IBM. They are fine-tun…

131K

$0.02

$0.11

Oct 2025

Phi 4 Mini Instruct

Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites - with a focus on high-quality, reasoning de…

131K

$0.08

$0.35

Oct 2025

Claude Haiku 4.5

Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost and latency of larger Claude…

200K

Oct 2025

Qwen3 VL 8B Instruct

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, a…

262K

$0.12

$0.46

Oct 2025

Qwen3 VL 8B Thinking

Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visual and textual reasoning across complex…

131K

$0.12

$1.37

Oct 2025

Llama 3.3 Nemotron Super 49B V1.5

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s…

131K

$0.4

Oct 2025

Qwen3 VL 30B A3B Thinking

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Thinking variant enhan…

262K

$0.13

$1.56

Oct 2025

Qwen3 VL 30B A3B Instruct

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optim…

262K

$0.13

$0.52

Oct 2025

GPT Audio Mini

A cost-efficient version of GPT Audio. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. I…

128K

$0.6

$2.4

Oct 2025

GPT-5 Pro

GPT-5 Pro is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that…

400K

$15

$120

Oct 2025

Nano Banana (Gemini 2.5 Flash Image)

Gemini 2.5 Flash Image, a.k.a. "Nano Banana," is now generally available. It is a state of the art image generation model with contextual understanding. It is…

33K

$0.3

$2.5

Oct 2025

GLM 4.6

Compared with GLM-4.5, this generation brings several key improvements: Longer context window: The context window has been expanded from 128K to 200K tokens, e…

205K

$0.5

Sep 2025

Claude Sonnet 4.5

Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows. It delivers state-of-the-art perform…

$15

Sep 2025

DeepSeek V3.2 Exp

DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectures. It introduces Dee…

164K

$0.27

$0.41

Sep 2025

Cydonia 24B V4.1

Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, prompt adherence, and intelligence.

131K

$0.3

$0.5

Sep 2025

Relace Apply 3

Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits straight into your source files. It can apply updates from GPT-4o, Claude, and…

256K

$0.85

$1.25

Sep 2025

Gemini 2.5 Flash Lite Preview 09-2025

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throug…

$0.1

$0.4

Sep 2025

Qwen3 Coder Plus

Qwen3 Coder Plus is Alibaba's proprietary version of the Open Source Qwen3 Coder 480B A35B. It is a powerful coding agent model specializing in autonomous prog…

$0.65

$3.25

Sep 2025

Qwen3 Max

Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements in reasoning, instruction following, multilingual support, and long-tail…

262K

$0.78

$3.9

Sep 2025

Qwen3 VL 235B A22B Instruct

Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understanding across images and video. The Instr…

262K

$0.21

$1.9

Sep 2025

Qwen3 VL 235B A22B Thinking

Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visual understanding across images and video. The Thinking model is…

131K

$0.26

$2.6

Sep 2025

DeepSeek V3.1 Terminus

DeepSeek-V3.1 Terminus is an update to DeepSeek V3.1 that maintains the model's original capabilities while addressing issues reported by users, including lang…

164K

$0.27

Sep 2025

Qwen3 Coder Flash

Qwen3 Coder Flash is Alibaba's fast and cost efficient version of their proprietary Qwen3 Coder Plus. It is a powerful coding agent model specializing in auton…

$0.2

$0.98

Sep 2025

GPT-5 Codex

GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows. It is designed for both interactive development sessions…

400K

$1.25

$10

Sep 2025

Qwen3 Next 80B A3B Thinking

Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that outputs structured “thinking” traces by default. It’s designed for hard…

262K

$0.1

$0.78

Sep 2025

Qwen3 Next 80B A3B Instruct

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targ…

262K

$0.1

$1.1

Sep 2025

Gemini 3.6 Flash

NEW

Jul 2026

Gemini 3.6 Flash is a high-efficiency model from Google for coding, agentic workflows, and web and app development. It is designed to produce polished outputs…

1M · in $1.5 · out $7.5

Gemini 3.5 Flash Lite

NEW

Jul 2026

Gemini 3.5 Flash Lite is a high-efficiency model from Google with upgraded agentic capabilities. It is suited for subagents that execute focused tasks within c…

1M · in $0.3 · out $2.5

Laguna S 2.1 (free) ·

NEW

Jul 2026

Laguna S 2.1 is the latest coding agent model from Poolside. Laguna S 2.1 is a 118B total parameter model with 8B active parameters, scoring 70.2% on Terminal-…

262K · in - · out -

LongCat 2.0

NEW

Jul 2026

LongCat 2.0 is a sparse mixture-of-experts language model from Meituan, with 48B active parameters out of 1.6T total. It is suited for coding, repository-level…

1M · in $0.3 · out $1.2

Inkling

NEW

Jul 2026

Inkling is an open-weight multimodal mixture-of-experts model from Thinking Machines Lab, with 41B active parameters out of 975B total. It is designed for gene…

524K · in $1 · out $4.05

Auto Router (Beta)

NEW

Jul 2026

Auto Router (Beta) is a task-aware router from OpenRouter. It classifies each request, then routes it the most popular model for that task based on aggregate s…

2M · in - · out -

Kimi K3

NEW

Jul 2026

Kimi K3 is a 2.8T parameter open-weight multimodal reasoning model from Moonshot AI. It is suited for complex coding, knowledge work, and long-horizon agentic…

1M · in $3 · out $15

Muse Spark 1.1

NEW

Jul 2026

Muse Spark 1.1 is a multimodal reasoning model from Meta, built for agentic tasks. It accepts text, images, video, audio, and PDF documents and returns text, w…

1M · in $1.25 · out $4.25

KAT-Coder-Air V2.5 (free) ·

NEW

Jul 2026

KAT-Coder-Air V2.5 is a flagship-level Agentic Coding model that can directly hand over an entire issue or an entire business workflow to it, allowing it to au…

256K · in - · out -

KAT-Coder-Pro V2.5

NEW

Jul 2026

KAT-Coder-Pro V2.5 is a flagship-level Agentic Coding model that can directly hand over an entire issue or an entire business workflow to it, allowing it to au…

256K · in $0.74 · out $2.96

GPT-5.6 Luna Pro

NEW

Jul 2026

GPT-5.6 Luna Pro is the same underlying model as GPT-5.6 Luna, served with `reasoning.mode` set to `pro` for higher-quality responses on complex tasks. Learn m…

1.1M · in $1 · out $6

GPT-5.6 Sol Pro

NEW

Jul 2026

GPT-5.6 Sol Pro is the same underlying model as GPT-5.6 Sol, served with `reasoning.mode` set to `pro` for higher-quality responses on complex tasks. Learn mor…

1.1M · in $5 · out $30

GPT-5.6 Terra Pro

NEW

Jul 2026

GPT-5.6 Terra Pro is the same underlying model as GPT-5.6 Terra, served with `reasoning.mode` set to `pro` for higher-quality responses on complex tasks. Learn…

1.1M · in $2.5 · out $15

Grok 4.5

NEW

Jul 2026

Grok 4.5 is SpaceXAI's smartest model with frontier performance on coding, knowledge work, and STEM.

500K · in $2 · out $6

Grok Latest

NEW

Jul 2026

This model always redirects to the latest Grok model from xAI.

500K · in $2 · out $6

Aion-3.0

NEW

Jul 2026

Aion-3.0 is a multi-model roleplaying and storytelling system from AionLabs, built on the GLM family of models. It uses a collaborative generation process in w…

131K · in $3 · out $6

Aion-3.0-Mini

NEW

Jul 2026

Aion-3.0 Mini is a multi-model roleplaying and storytelling system from AionLabs, built on the DeepSeek family of models. It uses a collaborative generation pr…

131K · in $0.7 · out $1.4

Hy3

NEW

Jul 2026

Hy3 is a 295B-parameter Mixture-of-Experts model from Tencent (21B active, 192 experts with top-8 routing) built for reasoning, agentic workflows, and real-wor…

262K · in $0.14 · out $0.58

Laguna XS 2.1 (free) ·

NEW

Jul 2026

Laguna XS 2.1 is the latest coding agent model in the 33B-A3B category from Poolside and a step forward from their Laguna XS.2 model (released in April 2026).…

262K · in - · out -

Nano Banana 2 Lite (Gemini 3.1 Flash Lite Image)

NEW

Jun 2026

Nano Banana 2 Lite (Gemini 3.1 Flash Lite Image) is Google's fastest, most cost-efficient Gemini image model, built for high-velocity developer pipelines and r…

66K · in $0.25 · out $1.5

Claude Sonnet 5

NEW

Jun 2026

Sonnet 5 is Anthropic's most capable Sonnet-class model, with frontier performance across coding, agents, and professional work. It supports adaptive thinking…

1M · in $2 · out $10

GPT-5.6 Sol

NEW

Jun 2026

GPT-5.6 Sol is the flagship model in OpenAI's GPT-5.6 series. It is suited for complex reasoning, coding, and agentic workflows, and is particularly strong at…

1.1M · in $5 · out $30

GPT-5.6 Terra

NEW

Jun 2026

GPT-5.6 Terra is a balanced model in OpenAI's GPT-5.6 series, positioned between the flagship Sol tier and the cost-efficient Luna tier. It is suited for every…

1.1M · in $2.5 · out $15

GPT-5.6 Luna

NEW

Jun 2026

GPT-5.6 Luna is a fast, cost-efficient model in OpenAI's GPT-5.6 series. It is suited for high-volume, latency-sensitive tasks such as chat, classification, an…

1.1M · in $1 · out $6

Fugu Ultra

NEW

Jun 2026

Fugu Ultra is the higher-performance model in Sakana AI's Fugu family. Rather than a single monolithic model, Fugu is a learned multi-agent orchestration syste…

1M · in $5 · out $30

Nex-N2-Mini

NEW

Jun 2026

Nex-N2-Mini is an open-source agentic mixture-of-experts model from Nex AGI, the smaller sibling in the Nex-N2 series. It accepts text and image input and is b…

262K · in $0.03 · out $0.1

North Mini Code (free) ·

NEW

Jun 2026

North Mini Code is Cohere's first agentic coding model and the debut of its North family. A sparse mixture-of-experts model with 30B total parameters and 3B ac…

256K · in - · out -

GLM 5.2

NEW

Jun 2026

GLM 5.2 is a large-scale reasoning model from Z.ai. It supports text input and output with a 1M-token context window, and is suited for long-horizon agent work…

1M · in $0.8 · out $2.52

Fusion

NEW

Jun 2026

Fusion turns your prompt into a small multi-model deliberation. A panel of expert models (see below) analyzes your prompt in parallel with web search and web f…

1M · in - · out -

Claude Fable 5

NEW

Jun 2026

Claude Fable 5 is a Mythos-class model from Anthropic, built for autonomous knowledge work and coding. It supports text, image, and file inputs with text outpu…

1M · in $10 · out $50

Claude Fable Latest

NEW

Jun 2026

This model always redirects to the latest model in the Claude Fable family.

1M · in $10 · out $50

Nex-N2-Pro

NEW

Jun 2026

Nex-N2-Pro is an agentic mixture-of-experts model from Nex AGI, with 17B active parameters out of 397B total. Built on the Qwen3.5 architecture, it accepts tex…

262K · in $0.25 · out $1

Nemotron 3 Ultra

Jun 2026

NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybri…

512K · in $0.6 · out $3.6

Nemotron 3.5 Content Safety (free) ·

Jun 2026

NVIDIA Nemotron 3.5 Content Safety is a compact 4B-parameter multimodal guardrail model from NVIDIA, fine-tuned from Google Gemma-3-4B. It moderates both input…

128K · in - · out -

Qwen3.7 Plus

Jun 2026

Qwen3.7-Plus is a cost-effective model in Alibaba's Qwen3.7 series. It supports text and image input with text output, building on the series' text capabilitie…

1M · in $0.32 · out $1.28

Kimi K2.7 Code

Jun 2026

MoonshotAI: Kimi K2.7 Code is a coding-focused model in Moonshot AI's Kimi K2 family, built to complete end-to-end programming tasks reliably over long context…

262K · in $0.82 · out $3.75

MiniMax M3

May 2026

MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a 1M-token context window, and is suited…

1M · in $0.3 · out $1.2

Claude Opus 4.8

May 2026

Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family. It supports text, image, and file inputs with text output, with reaso…

1M · in $5 · out $25

Nano Banana 2 (Gemini 3.1 Flash Image)

May 2026

Gemini 3.1 Flash Image, a.k.a. "Nano Banana 2," is Google’s latest state of the art image generation and editing model, delivering Pro-level visual quality at…

131K · in $0.5 · out $3

Step 3.7 Flash

May 2026

Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter language backbone with a vision encoder for n…

262K · in $0.2 · out $1.15

Nano Banana Pro (Gemini 3 Pro Image)

May 2026

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro. It extends the original Nano Banana with significantly imp…

131K · in $2 · out $12

Claude Opus 4.8 (Fast)

May 2026

Fast-mode variant of Opus 4.8 - identical capabilities with higher output speed at 2x pricing relative to regular Opus 4.8. Learn more in Anthropic's docs: htt…

1M · in $10 · out $50

Qwen3.7 Max

May 2026

Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text input and output and is designed for agent-centric workloads, with particular s…

1M · in $1.48 · out $4.43

Grok Build 0.1

May 2026

Grok Build 0.1 is xAI’s fast coding model trained specifically for agentic software engineering workflows. It supports text and image inputs with text output,…

256K · in $1 · out $2

Gemini 3.5 Flash

May 2026

Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed. It is highly optimize…

1M · in $1.5 · out $9

Perceptron Mk1

May 2026

Perceptron Mk1 (Mark One) is Perceptron's highest-quality vision-language model for video and embodied reasoning.** It accepts image and video inputs paired wi…

33K · in $0.15 · out $1.5

Claude Opus 4.7 (Fast)

May 2026

Fast-mode variant of Opus 4.7 - identical capabilities with higher output speed at premium 6x pricing. Learn more in Anthropic's docs: https://platform.claude.…

1M · in $30 · out $150

Ring-2.6-1T

May 2026

Ring-2.6-1T is a 1T-parameter-scale thinking model with 63B active parameters, built for real-world agent workflows that require both strong capability and ope…

262K · in $0.08 · out $0.63

Gemini 3.1 Flash Lite

May 2026

Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized for low-latency, high-volume workloads. It supports text, image, video, audio,…

1M · in $0.25 · out $1.5

GPT Chat Latest

May 2026

GPT Chat Latest

400K · in $5 · out $30

Mistral Medium 3.5

Apr 2026

Mistral Medium 3.5 is a dense 128B instruction-following model from Mistral AI. It supports text and image inputs with text output, and is designed for agentic…

262K · in $1.5 · out $7.5

Granite 4.1 8B

Apr 2026

Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from IBM, part of the Granite 4.1 family. It supports a 131K-token context window an…

131K · in $0.05 · out $0.1

Laguna M.1

Apr 2026

Laguna M.1 is the flagship coding agent model from Poolside, optimized for complex software engineering tasks. Designed for agentic coding workflows, it suppor…

262K · in $0.2 · out $0.4

Nemotron 3 Nano Omni (free) ·

Apr 2026

NVIDIA Nemotron™ 3 Nano Omni is a 30B-A3B open multimodal model designed to function as a perception and context sub-agent in enterprise agent systems. It acce…

256K · in - · out -

Owl Alpha ·

Apr 2026

Owl Alpha is a high-performance foundation model designed for agentic workloads. Natively supports tool use, and long-context tasks, with strong performance in…

1M · in - · out -

Laguna XS.2

Apr 2026

Laguna XS.2 is the second-generation model in the XS size class from Poolside, their efficient coding agent series. It combines tool calling and reasoning capa…

262K · in $0.1 · out $0.2

Google Gemini Flash Latest

Apr 2026

This model always redirects to the latest model in the Google Gemini Flash family.

1M · in $1.5 · out $7.5

Google Gemini Pro Latest

Apr 2026

This model always redirects to the latest model in the Google Gemini Pro family.

1M · in $2 · out $12

Qwen3.6 27B

Apr 2026

Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba, released in April 2026. It features hybrid multimodal capabilities —…

262K · in $0.45 · out $2.7

Qwen3.6 Flash

Apr 2026

Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3.6 series. It supports text, image, and video input with a 1M token context window. Tier…

1M · in $0.19 · out $1.13

Qwen3.6 35B A3B

Apr 2026

Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 billion total parameters and 3 billion active parameters per token. It uses a hyb…

262K · in $0.14 · out $1

Qwen3.6 Max Preview

Apr 2026

Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on a sparse mixture-of-experts architecture with approximately 1 trillion total pa…

262K · in $1.04 · out $6.24

OpenAI GPT Latest

Apr 2026

This model always redirects to the latest model in the OpenAI GPT family.

1.1M · in $5 · out $30

MoonshotAI Kimi Latest

Apr 2026

This model always redirects to the latest model in the MoonshotAI Kimi family.

1M · in $3 · out $15

OpenAI GPT Mini Latest

Apr 2026

This model always redirects to the latest model in the OpenAI GPT Mini family.

400K · in $0.75 · out $4.5

Anthropic Claude Sonnet Latest

Apr 2026

This model always redirects to the latest model in the Anthropic Claude Sonnet family.

1M · in $2 · out $10

Anthropic Claude Haiku Latest

Apr 2026

This model always redirects to the latest model in the Anthropic Claude Haiku family.

200K · in $1 · out $5

Qwen3.5 Plus 2026-04-20

Apr 2026

Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibaba. It accepts text, image, and video input and produces text output, with a 1M…

1M · in $0.3 · out $1.8

DeepSeek V4 Pro

Apr 2026

DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context…

1M · in $0.44 · out $0.87

DeepSeek V4 Flash

Apr 2026

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-to…

1M · in $0.1 · out $0.2

GPT-5.5

Apr 2026

GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher reliability, and improved t…

1.1M · in $5 · out $30

Ling-2.6-1T

Apr 2026

Ling-2.6-1T is an instant (instruct) model from inclusionAI and the company’s trillion-parameter flagship, designed for real-world agents that require fast exe…

262K · in $0.08 · out $0.63

GPT-5.5 Pro

Apr 2026

GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workloads. It features a 1M+ token context wind…

1.1M · in $30 · out $180

MiMo-V2.5

Apr 2026

MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in…

1.1M · in $0.14 · out $0.28

MiMo-V2.5-Pro

Apr 2026

MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in general agentic capabilities, complex software engineering, and long-horizon tasks,…

1.1M · in $0.44 · out $0.87

Hy3 preview

Apr 2026

Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agentic workflows and production use. It supports configurable reasoning le…

262K · in $0.06 · out $0.21

Ling-2.6-flash

Apr 2026

Ling-2.6-flash is an instant (instruct) model from inclusionAI with 104B total parameters and 7.4B active parameters, designed for real-world agents that requi…

262K · in $0.01 · out $0.03

GPT-5.4 Image 2

Apr 2026

GPT-5.4 Image 2 combines OpenAI's GPT-5.4 model with state-of-the-art image generation capabilities from GPT Image 2. It enables rich multimodal workflows, all…

272K · in $8 · out $15

Pareto Code Router

Apr 2026

The Pareto Router maintains a tiered shortlist of strong coding models, ranked by Artificial Analysis coding percentiles. Set min_coding_score between 0 and 1…

2M · in - · out -

Claude Opus Latest

Apr 2026

This model always redirects to the latest model in the Claude Opus family.

1M · in $5 · out $25

Kimi K2.6

Apr 2026

Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon coding, coding-driven UI/UX generation, and multi-agent orchestration. I…

262K · in $0.68 · out $3.42

Grok 4.3

Apr 2026

Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with text output, and is suited for agentic workflows, instruction-following tasks, an…

1M · in $1.25 · out $2.5

Claude Opus 4.7

Apr 2026

Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asynchronous agents. Building on the coding and agentic strengths of Opus 4…

1M · in $5 · out $25

GLM 5.1

Apr 2026

GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. Unlike previous models built around min…

205K · in $0.97 · out $3.04

Gemma 4 26B A4B

Apr 2026

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token du…

262K · in $0.07 · out $0.34

Gemma 4 31B

Apr 2026

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window,…

262K · in $0.12 · out $0.37

Qwen3.6 Plus

Apr 2026

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and…

1M · in $0.33 · out $1.95

Trinity Large Thinking

Apr 2026

Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI. It shows strong performance in PinchBench, agentic workloads, and r…

262K · in $0.25 · out $0.8

GLM 5V Turbo

Apr 2026

GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-based coding and agent-driven tasks. It natively handles image, video,…

203K · in $1.2 · out $4

Grok 4.20

Mar 2026

Grok 4.20 is a reasoning model from xAI with industry-leading speed and agentic tool calling capabilities. It combines the lowest hallucination rate on the mar…

2M · in $1.25 · out $2.5

Grok 4.20 Multi-Agent

Mar 2026

Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep re…

2M · in $1.25 · out $2.5

Lyria 3 Pro Preview ·

Mar 2026

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can…

1M · in - · out -

Lyria 3 Clip Preview ·

Mar 2026

30 second duration clips are priced at $0.04 per clip. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, y…

1M · in - · out -

KAT-Coder-Pro V2

Mar 2026

KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder series, designed for complex enterprise-grade software engineering and SaaS integr…

262K · in $0.3 · out $1.2

Reka Edge

Mar 2026

Reka Edge is an extremely efficient 7B multimodal vision-language model that accepts image/video+text inputs and generates text outputs. This model is optimize…

16K · in $0.1 · out $0.1

MiniMax M2.7

Mar 2026

MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous improvement. Built to actively participa…

205K · in $0.25 · out $1

GPT-5.4 Mini

Mar 2026

GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized for high-throughput workloads. It supports text and image inpu…

400K · in $0.75 · out $4.5

GPT-5.4 Nano

Mar 2026

GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, optimized for speed-critical and high-volume tasks. It supports text and…

400K · in $0.2 · out $1.25

Mistral Small 4

Mar 2026

Mistral Small 4 is the next major release in the Mistral Small family, unifying the capabilities of several flagship Mistral models into a single system. It co…

262K · in $0.15 · out $0.6

GLM 5 Turbo

Mar 2026

GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments such as OpenClaw scenarios. It is deeply o…

203K · in $1.2 · out $4

Nemotron 3 Super (free) ·

Mar 2026

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-…

262K · in - · out -

Qwen3.5-9B

Mar 2026

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-…

262K · in $0.1 · out $0.15

Seed-2.0-Lite

Mar 2026

Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers strong multimodal and agent capabilities while offering noticeably lower latenc…

262K · in $0.25 · out $2

GPT-5.4

Mar 2026

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ token context window (922K input, 128K outp…

1.1M · in $2.5 · out $15

GPT-5.4 Pro

Mar 2026

GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabilities for complex, high-stakes tasks. It…

1.1M · in $30 · out $180

Mercury 2

Mar 2026

Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and r…

128K · in $0.25 · out $0.75

Gemini 3.1 Flash Lite Preview

Mar 2026

Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases. It outperforms Gemini 2.5 Flash Lite on overall quality an…

1M · in $0.25 · out $1.5

GPT-5.3 Chat

Mar 2026

GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations smoother, more useful, and more directly helpful. It delivers more acc…

128K · in $1.75 · out $14

Seed-2.0-Mini

Feb 2026

Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive scenarios, emphasizing fast response and flexible inference deployment. It delive…

262K · in $0.1 · out $0.4

Nano Banana 2 (Gemini 3.1 Flash Image Preview)

Feb 2026

Gemini 3.1 Flash Image Preview, a.k.a. "Nano Banana 2," is Google’s latest state of the art image generation and editing model, delivering Pro-level visual qua…

131K · in $0.5 · out $3

Qwen3.5-Flash

Feb 2026

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-expert…

1M · in $0.07 · out $0.26

Qwen3.5-35B-A3B

Feb 2026

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixtu…

262K · in $0.14 · out $1

Qwen3.5-122B-A10B

Feb 2026

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-expe…

262K · in $0.26 · out $2.08

Qwen3.5-27B

Feb 2026

The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed an…

262K · in $0.26 · out $2.6

LFM2-24B-A2B

Feb 2026

LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment. Built as a 24B parameter Mixture-of-E…

128K · in $0.03 · out $0.12

Aion-2.0

Feb 2026

Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and storytelling. It is particularly strong at introducing tension, crises, and conf…

131K · in $0.8 · out $1.6

Gemini 3.1 Pro Preview

Feb 2026

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more effic…

1M · in $2 · out $12

Gemini 3.1 Pro Preview Custom Tools

Feb 2026

Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection behavior by preventing overuse of a general bash tool when more…

1M · in $2 · out $12

Claude Sonnet 4.6

Feb 2026

Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work. It excels at iterative d…

1M · in $3 · out $15

Qwen3.5 Plus 2026-02-15

Feb 2026

The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-exp…

1M · in $0.26 · out $1.56

Qwen3.5 397B A17B

Feb 2026

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-…

262K · in $0.39 · out $2.34

MiniMax M2.5

Feb 2026

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex real-world digital working environments…

205K · in $0.15 · out $0.9

GLM 5

Feb 2026

GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows. Built for expert developers, it d…

205K · in $0.95 · out $2.55

Qwen3 Max Thinking

Feb 2026

Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that require deep, multi-step reasoning. By si…

262K · in $0.78 · out $3.9

Claude Opus 4.6

Feb 2026

Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks. It is built for agents that operate across entire workflows rather than…

1M · in $5 · out $25

GPT-5.3-Codex

Feb 2026

GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoni…

400K · in $1.75 · out $14

Qwen3 Coder Next

Feb 2026

Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local development workflows. It uses a sparse MoE design with 80B tota…

262K · in $0.11 · out $0.8

Free Models Router ·

Feb 2026

The simplest way to get free inference. openrouter/free is a router that selects free models at random from the models available on OpenRouter. The router smar…

200K · in - · out -

Step 3.5 Flash

Jan 2026

Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on a sparse Mixture of Experts (MoE) architecture, it selectively activates only 1…

262K · in $0.1 · out $0.3

Solar Pro 3

Jan 2026

Solar Pro 3 is Upstage's powerful Mixture-of-Experts (MoE) language model. With 102B total parameters and 12B active parameters per forward pass, it delivers e…

128K · in $0.15 · out $0.6

Kimi K2.5

Jan 2026

Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm. Built on Kim…

262K · in $0.57 · out $2.85

MiniMax M2-her

Jan 2026

MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, character-driven chat, and expressive multi-turn conversations. Designed…

66K · in $0.3 · out $1.2

Palmyra X5

Jan 2026

Palmyra X5 is Writer's most advanced model, purpose-built for building and scaling AI agents across the enterprise. It delivers industry-leading speed and effi…

1M · in $0.6 · out $6

LFM2.5-1.2B-Instruct (free) ·

Jan 2026

LFM2.5-1.2B-Instruct is a compact, high-performance instruction-tuned model built for fast on-device AI. It delivers strong chat quality in a 1.2B parameter fo…

33K · in - · out -

LFM2.5-1.2B-Thinking (free) ·

Jan 2026

LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agentic tasks, data extraction, and RAG—while still running comfortably on edge dev…

33K · in - · out -

GLM 4.7 Flash

Jan 2026

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, st…

203K · in $0.06 · out $0.4

Seed 1.6 Flash

Dec 2025

Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporting both text and visual understanding. It features a 256k context win…

262K · in $0.08 · out $0.3

Seed 1.6

Dec 2025

Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates multimodal capabilities and adaptive deep thinking with a 256K context…

262K · in $0.25 · out $2

MiniMax M2.1

Dec 2025

MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10…

205K · in $0.3 · out $1.2

GLM 4.7

Dec 2025

GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution.…

205K · in $0.4 · out $1.75

Gemini 3 Flash Preview

Dec 2025

Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic workflows, multi turn chat, and coding assistance. It delivers near Pro…

1M · in $0.5 · out $3

Nemotron 3 Nano 30B A3B

Dec 2025

NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers to build specialized agentic AI system…

262K · in $0.05 · out $0.2

GPT-5.2 Chat

Dec 2025

GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized for low-latency chat while retaining strong general intelligence. It us…

128K · in $1.75 · out $14

GPT-5.2

Dec 2025

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long context perfomance compared to GPT-5.1. It uses adaptive rea…

400K · in $1.75 · out $14

GPT-5.2 Pro

Dec 2025

GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and long context performance over GPT-5 Pro. It is optimized for com…

400K · in $21 · out $168

GPT-5.2-Codex

Dec 2025

GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and coding workflows. It is designed for both interactive development…

400K · in $1.75 · out $14

Devstral 2 2512

Dec 2025

Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic coding. It is a 123B-parameter dense transformer model supporting a 25…

262K · in $0.4 · out $2

GLM 4.6V

Dec 2025

GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across images, documents, and mixed media. It s…

131K · in $0.3 · out $0.9

Relace Search

Dec 2025

The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to explore a codebase and return relevant files to the user request. In contrast to…

256K · in $1 · out $3

Body Builder (beta)

Dec 2025

Transform your natural language requests into structured OpenRouter API request objects. Describe what you want to accomplish with AI models, and Body Builder…

128K · in - · out -

Ministral 3 8B 2512

Dec 2025

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities.

262K · in $0.15 · out $0.15

Ministral 3 3B 2512

Dec 2025

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities.

131K · in $0.1 · out $0.1

Ministral 3 14B 2512

Dec 2025

The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counte…

262K · in $0.2 · out $0.2

Nova 2 Lite

Dec 2025

Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos to generate text. Nova 2 Lite demonstrat…

1M · in $0.3 · out $2.5

DeepSeek V3.2

Dec 2025

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance. It introduc…

164K · in $0.27 · out $0.4

Trinity Mini

Dec 2025

Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 128 experts with 8 active per token. Engineered for efficient re…

131K · in $0.05 · out $0.15

Mistral Large 3 2512

Dec 2025

Mistral Large 3 2512 is Mistral’s most capable model to date, featuring a sparse mixture-of-experts architecture with 41B active parameters (675B total), and r…

262K · in $0.5 · out $1.5

Claude Opus 4.5

Nov 2025

Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, and long-horizon computer use. It offers…

200K · in $5 · out $25

Olmo 3 32B Think

Nov 2025

Olmo 3 32B Think is a large-scale, 32-billion-parameter model purpose-built for deep reasoning, complex logic chains and advanced instruction-following scenari…

66K · in $0.15 · out $0.5

Nano Banana Pro (Gemini 3 Pro Image Preview)

Nov 2025

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro. It extends the original Nano Banana with significantly imp…

66K · in $2 · out $12

GPT-5.1-Codex-Max

Nov 2025

GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, high-context software development tasks. It is based on an updated versio…

400K · in $1.25 · out $10

GPT-5.1

Nov 2025

GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpose reasoning, improved instruction adherence, and a more natural…

400K · in $1.25 · out $10

Cogito v2.1 671B

Nov 2025

Cogito v2.1 671B MoE represents one of the strongest open models globally, matching performance of frontier closed and open models. This model is trained using…

128K · in $1.25 · out $1.25

GPT-5.1-Codex-Mini

Nov 2025

GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex

400K · in $0.25 · out $2

GPT-5.1-Codex

Nov 2025

GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows. It is designed for both interactive development sess…

400K · in $1.25 · out $10

GPT-5.1 Chat

Nov 2025

GPT-5.1 Chat (AKA Instant is the fast, lightweight member of the 5.1 family, optimized for low-latency chat while retaining strong general intelligence. It use…

128K · in $1.25 · out $10

Kimi K2 Thinking

Nov 2025

Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trilli…

262K · in $0.6 · out $2.5

Nova Premier 1.0

Oct 2025

Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models.

1M · in $2.5 · out $12.5

Voxtral Small 24B 2507

Oct 2025

Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It…

32K · in $0.1 · out $0.3

Sonar Pro Search

Oct 2025

Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity's most advanced agentic search system. It is designed for deeper rea…

200K · in $3 · out $15

gpt-oss-safeguard-20b

Oct 2025

gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b. This open-weight, 21B-parameter Mixture-of-Experts (MoE) model offers low…

131K · in $0.08 · out $0.3

Nemotron Nano 12B 2 VL (free) ·

Oct 2025

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and document intelligence. It introduces a…

128K · in - · out -

Qwen3 VL 32B Instruct

Oct 2025

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video.…

131K · in $0.1 · out $0.42

MiniMax M2

Oct 2025

MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. With 10 billion activated parameters (230…

205K · in $0.3 · out $1.2

Granite 4.0 Micro

Oct 2025

Granite-4.0-H-Micro is a 3B parameter from the Granite 4 family of models. These models are the latest in a series of models released by IBM. They are fine-tun…

131K · in $0.02 · out $0.11

Phi 4 Mini Instruct

Oct 2025

Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites - with a focus on high-quality, reasoning de…

131K · in $0.08 · out $0.35

Claude Haiku 4.5

Oct 2025

Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost and latency of larger Claude…

200K · in $1 · out $5

Qwen3 VL 8B Instruct

Oct 2025

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, a…

262K · in $0.12 · out $0.46

Qwen3 VL 8B Thinking

Oct 2025

Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visual and textual reasoning across complex…

131K · in $0.12 · out $1.37

Llama 3.3 Nemotron Super 49B V1.5

Oct 2025

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s…

131K · in $0.4 · out $0.4

Qwen3 VL 30B A3B Thinking

Oct 2025

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Thinking variant enhan…

262K · in $0.13 · out $1.56

Qwen3 VL 30B A3B Instruct

Oct 2025

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optim…

262K · in $0.13 · out $0.52

GPT Audio Mini

Oct 2025

A cost-efficient version of GPT Audio. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. I…

128K · in $0.6 · out $2.4

GPT-5 Pro

Oct 2025

GPT-5 Pro is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that…

400K · in $15 · out $120

Nano Banana (Gemini 2.5 Flash Image)

Oct 2025

Gemini 2.5 Flash Image, a.k.a. "Nano Banana," is now generally available. It is a state of the art image generation model with contextual understanding. It is…

33K · in $0.3 · out $2.5

GLM 4.6

Sep 2025

Compared with GLM-4.5, this generation brings several key improvements: Longer context window: The context window has been expanded from 128K to 200K tokens, e…

205K · in $0.5 · out $2

Claude Sonnet 4.5

Sep 2025

Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows. It delivers state-of-the-art perform…

1M · in $3 · out $15

DeepSeek V3.2 Exp

Sep 2025

DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectures. It introduces Dee…

164K · in $0.27 · out $0.41

Cydonia 24B V4.1

Sep 2025

Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, prompt adherence, and intelligence.

131K · in $0.3 · out $0.5

Relace Apply 3

Sep 2025

Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits straight into your source files. It can apply updates from GPT-4o, Claude, and…

256K · in $0.85 · out $1.25

Gemini 2.5 Flash Lite Preview 09-2025

Sep 2025

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throug…

1M · in $0.1 · out $0.4

Qwen3 Coder Plus

Sep 2025

Qwen3 Coder Plus is Alibaba's proprietary version of the Open Source Qwen3 Coder 480B A35B. It is a powerful coding agent model specializing in autonomous prog…

1M · in $0.65 · out $3.25

Qwen3 Max

Sep 2025

Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements in reasoning, instruction following, multilingual support, and long-tail…

262K · in $0.78 · out $3.9

Qwen3 VL 235B A22B Instruct

Sep 2025

Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understanding across images and video. The Instr…

262K · in $0.21 · out $1.9

Qwen3 VL 235B A22B Thinking

Sep 2025

Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visual understanding across images and video. The Thinking model is…

131K · in $0.26 · out $2.6

DeepSeek V3.1 Terminus

Sep 2025

DeepSeek-V3.1 Terminus is an update to DeepSeek V3.1 that maintains the model's original capabilities while addressing issues reported by users, including lang…

164K · in $0.27 · out $1

Qwen3 Coder Flash

Sep 2025

Qwen3 Coder Flash is Alibaba's fast and cost efficient version of their proprietary Qwen3 Coder Plus. It is a powerful coding agent model specializing in auton…

1M · in $0.2 · out $0.98

GPT-5 Codex

Sep 2025

GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows. It is designed for both interactive development sessions…

400K · in $1.25 · out $10

Qwen3 Next 80B A3B Thinking

Sep 2025

Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that outputs structured “thinking” traces by default. It’s designed for hard…

262K · in $0.1 · out $0.78

Qwen3 Next 80B A3B Instruct

Sep 2025

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targ…

262K · in $0.1 · out $1.1

200 models · sorted by release date · prices in USD per 1M tokens · refreshed every 30 minutesCompare every model across vendors →

Get started in 3 steps

Create an API key at the OpenRouter console.

Paste it into Big-AGI's model settings.

Start chatting, or Beam it against other models and fuse the answers.

Running OpenRouter in Big-AGI

Add one OpenRouter API key and reach hundreds of models from dozens of providers through a single bridge. Big-AGI adds no markup and no intermediary: the billing relationship runs directly between you and OpenRouter. You bring the aggregator, Big-AGI brings the workspace.

Vendor-accurate controls, even through the router. Big-AGI recognizes anthropic/, google/, and openai/ model IDs on OpenRouter and applies its own detailed native definitions, correct thinking budgets and effort levels, instead of the aggregator's flattened metadata.
One-click key linking. An OAuth popup creates and captures your OpenRouter key, no copy-paste needed.
Full parameter control. Big-AGI exposes the parameters OpenRouter forwards: reasoning effort, verbosity, temperature, and top-p, at their extended ranges.
Fallback routing. When a provider degrades or a model goes unavailable, OpenRouter reroutes to a healthy backup, so your prompt still lands.

Your keys and your data

Turn on Direct Connection and the browser talks to OpenRouter directly, skipping the Big-AGI server, when your key is client-side and OpenRouter allows it. Your keys stay in your browser, not on Big-AGI's servers. Chats are stored locally first, and sync only if you turn it on. Nothing is added to your system prompt, and the AI Inspector shows the exact request, the token counts, and a cost estimate for every call.

OpenRouter in Beam

This is the part a router cannot do for you. Send one prompt to models from different providers at once, Anthropic next to OpenAI next to an open model OpenRouter hosts. Where they agree, you can trust the answer. Where they differ, you have caught something worth a second look. Fusions then combine, cross-check, and synthesize the parallel answers instead of just picking the best one. Big-AGI has shipped and refined this workflow since 2024. Parallel runs use more tokens than a single chat.

Big-AGI is an independent, open-source client. It is not affiliated with or endorsed by OpenRouter.

Bring your OpenRouter key. Keep control.

Your key, your data, your choice of model. Big-AGI is open source and self-hostable, so you can check exactly how OpenRouter is called.

Launch Big-AGI

<- All Models

Alibaba

Anthropic

AWS Bedrock

Azure

Cerebras

DeepSeek

Fireworks AI

Google Gemini

Groq

MiniMax

Mistral

Moonshot

OpenAI

OpenRouter

Perplexity

Sakana AI

SpaceXAI

Together AI

Z.ai

BIG-AGI

Product

Features Models Controls Changelog BEAM Technology

Resources

Documentation Discord GitHub

Company

Email Us Privacy Terms