The Big-AGI Model Index

Every AI model, on one page.

Context window, input and output pricing, and capabilities for every model across every provider. Filter, sort, compare. Bring your own key, pay provider rates, no markup.
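As a rough illustration of how the input and output prices listed below translate into a per-request cost, here is a minimal Python sketch. It assumes the dollar figures are quoted per 1M tokens (the page does not state the unit). `estimate_cost` and its parameters are hypothetical names, not part of Big-AGI or any provider API.

```python
from decimal import Decimal

# Assumption: listed prices are USD per 1,000,000 tokens.
TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price: str, output_price: str) -> Decimal:
    """Estimate the USD cost of one request from listed input/output prices.

    Prices are passed as strings so Decimal keeps them exact.
    """
    cost_in = Decimal(input_tokens) / TOKENS_PER_PRICE_UNIT * Decimal(input_price)
    cost_out = Decimal(output_tokens) / TOKENS_PER_PRICE_UNIT * Decimal(output_price)
    return cost_in + cost_out


# Example: 200K input tokens and 10K output tokens on a model
# listed at $0.25 input and $1.5 output.
print(estimate_cost(200_000, 10_000, "0.25", "1.5"))  # 0.065
```

Entries that show "-" for a price (free, routed, or unpublished pricing) cannot be estimated this way.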

412 models · 12 vendors · live, refreshed every 30 min

Nano Banana 2 Lite

NEW

Gemini 3.1 Flash Lite Image. (Version: 3.0, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,batchGenerateContent])

131K

$0.25

$1.5

Vision · Tools / functions

Jul 2026

Gemini Omni Flash Preview

NEW

Gemini Omni Flash Preview (Version: 001, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens])

197K

-

-

Vision · Tools / functions

Jul 2026

Gemma 4 31B (Preview)

NEW

Google Gemma 4 31B on Cerebras - first multimodal model on wafer-scale inference (~1,850 tok/s). Vision (base64 PNG/JPEG, max 5 images / 10MB), function calling, reasoning (off by default, enable via effort). 131K context (65K free tier), 40K max output.

131K

$0.99

$1.49

Vision · Reasoning · Tools / functions

Jun 2026

Claude Sonnet 5

NEW

Best combination of speed and intelligence, with the largest gains in coding and agentic tasks

1M

$2

$10

Vision · Tools / functions · Web search

Jun 2026

Kimi K2.7 Code (Alibaba)

NEW

Moonshot Kimi K2.7 Code served via Alibaba Model Studio. Multimodal, always-on thinking, 256K context. (Alibaba pricing not yet published.)

262K

$0.95

$4

Vision · Reasoning · Tools / functions

Jun 2026

DeepSeek V4 Pro (Alibaba)

NEW

DeepSeek V4 Pro served via Alibaba Model Studio (Alibaba pricing, ~5x DeepSeek-direct). 1M context, thinking.

1M

$2.4

$4.8

Reasoning · Tools / functions

Jun 2026

DeepSeek V3.2 (Alibaba)

NEW

DeepSeek V3.2 served via Alibaba Model Studio (superseded by V4). Thinking.

131K

$0.57

$1.71

Reasoning · Tools / functions

Jun 2026

Sakana Fugu Ultra

NEW

Multi-agent conductor system routing 1-3 expert agents for complex, multi-step reasoning - maximum answer quality on hard tasks. 1M context.

1M

$5

$30

Vision · Reasoning · Tools / functions · Web search

Jun 2026

Sakana Fugu

NEW

Fast orchestration model routing tasks across a swappable pool of frontier LLMs - low latency, high quality. 1M context. Billed at the routed underlying model's standard rate.

1M

-

-

Vision · Reasoning · Tools / functions · Web search

Jun 2026

Qwen3.6 Flash

NEW

Fast, cost-effective multimodal model with 1M context, near-flagship quality, vision/video, and built-in tools.

1M

$0.25

$1.5

Vision · Reasoning · Tools / functions

Jun 2026

DeepSeek V4 Flash (Alibaba)

NEW

DeepSeek V4 Flash served via Alibaba Model Studio. 1M context, thinking.

1M

$0.2

$0.4

Reasoning · Tools / functions

Jun 2026

Qwen3.7 Max [preview]

NEW

Flagship agent model with native extended thinking and 1M context. Text-only; strong at coding, productivity, and long-horizon autonomous tasks.

1M

$2.5

$7.5

Reasoning · Tools / functions

Jun 2026

Qwen3.7 Max [2026 05 17]

NEW

Flagship agent model with native extended thinking and 1M context. Text-only; strong at coding, productivity, and long-horizon autonomous tasks.

1M

$2.5

$7.5

Reasoning · Tools / functions

Jun 2026

Cohere: North Mini Code (free) · 🎁

NEW

North Mini Code is Cohere's first agentic coding model and the debut of its North family. A sparse mixture-of-experts model with 30B total parameters and 3B active, it is optimized...

256K

-

-

Reasoning · Tools / functions · Web search

Jun 2026

OpenRouter: Fusion

NEW

Fusion turns your prompt into a small multi-model deliberation. A panel of expert models (see below) analyzes your prompt in parallel with web search and web fetch enabled, then a...

1M

-

-

Web search

Jun 2026

GLM-5.2 (1M)

NEW

Z.ai 1M-context flagship (744B MoE, 40B activated). Agentic coding with reasoning_effort control (high/max). 1M context, 128K output.

1M

$1.4

$4.4

Reasoning · Tools / functions

Jun 2026

Claude Fable 5

NEW

Most capable widely released model for the most demanding reasoning and long-horizon agentic work

1M

$10

$50

Vision · Reasoning · Tools / functions · Web search

Jun 2026

Anthropic: Claude Fable Latest

NEW

This model always redirects to the latest model in the Claude Fable family.

1M

$10

$50

Vision · Reasoning · Tools / functions · Web search

Jun 2026

Nex AGI: Nex-N2-Pro

NEW

Nex-N2-Pro is an agentic mixture-of-experts model from Nex AGI, with 17B active parameters out of 397B total. Built on the Qwen3.5 architecture, it accepts text and image input and produces...

262K

$0.25

$1

Vision · Reasoning · Web search

Jun 2026

NVIDIA: Nemotron 3.5 Content Safety (free) · 🎁

NEW

NVIDIA Nemotron 3.5 Content Safety is a compact 4B-parameter multimodal guardrail model from NVIDIA, fine-tuned from Google Gemma-3-4B. It moderates both inputs to and responses from LLMs and VLMs, accepting...

128K

-

-

Vision · Reasoning · Web search

Jun 2026

NVIDIA: Nemotron 3 Ultra

NEW

NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it...

1M

$0.5

$2.2

Reasoning · Tools / functions · Web search

Jun 2026

Qwen3.7 Plus

NEW

Multimodal agent model with 1M context, native thinking, and vision/video understanding. Lower cost than Max.

1M

$0.4

$1.6

Vision · Reasoning · Tools / functions

Jun 2026

Kimi K2.7 Code Highspeed

NEW

High-speed code variant with ~180 tok/s output (up to 260 in short contexts). Native multimodal with always-on thinking. 256K context.

262K

$1.9

$8

Vision · Reasoning · Tools / functions

Jun 2026

MiniMax: MiniMax M3

NEW

MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a 1M-token context window, and is suited for long-horizon agentic work, coding,...

1M

$0.3

$1.2

Vision · Reasoning · Tools / functions · Web search

May 2026

StepFun: Step 3.7 Flash

NEW

Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter language backbone with a vision encoder for native image and video understanding, activating roughly 11B parameters...

256K

$0.2

$1.15

Vision · Reasoning · Tools / functions · Web search

May 2026

Nano Banana Pro

NEW

Gemini 3 Pro Image (Version: 3.0, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,batchGenerateContent])

164K

$2

$12

Vision · Reasoning · Tools / functions · Web search · Image output

May 2026

Nano Banana 2

NEW

Gemini 3.1 Flash Image. (Version: 3.0, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,batchGenerateContent])

131K

$0.5

$3

Vision · Reasoning · Tools / functions · Web search · Image output

May 2026

Claude Opus 4.8

NEW

Most capable Opus-tier model for complex reasoning and agentic coding

1M

$5

$25

Vision · Tools / functions · Web search

May 2026

Anthropic: Claude Opus 4.8 (Fast)

NEW

Fast-mode variant of [Opus 4.8](/anthropic/claude-opus-4.8) - identical capabilities with higher output speed at 2x pricing relative to regular Opus 4.8. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode

1M

$10

$50

Vision · Tools / functions · Web search

May 2026

xAI: Grok Build 0.1

NEW

Grok Build 0.1 is xAI’s fast coding model trained specifically for agentic software engineering workflows. It supports text and image inputs with text output, and is optimized for interactive coding...

256K

$1

$2

Vision · Reasoning · Tools / functions · Web search

May 2026

Gemini 3.5 Flash

NEW

Gemini 3.5 Flash (Version: 3.5-flash-05-2026, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

1.1M

$1.5

$9

Vision · Reasoning · Tools / functions · Web search

May 2026

Antigravity Agent Preview (2026-05)

NEW

Preview release of Antigravity Agent (05-2026) (Version: 0.1, Defaults: temperature=undefined, topP=undefined, topK=undefined, interfaces=[generateContent,countTokens])

197K

$1.5

$9

Vision · Reasoning

May 2026

Qwen3 Coder Plus

Agentic coding model with very long context. Tiered pricing by input length (up to 1M).

1M

$1

$5

Tools / functions

May 2026

Perceptron: Perceptron Mk1

Perceptron Mk1 (Mark One) is Perceptron's highest-quality vision-language model for video and embodied reasoning. It accepts image and video inputs paired with natural language queries, and produces detailed visual understanding...

33K

$0.15

$1.5

Vision · Reasoning · Web search

May 2026

Anthropic: Claude Opus 4.7 (Fast)

Fast-mode variant of [Opus 4.7](/anthropic/claude-opus-4.7) - identical capabilities with higher output speed at premium 6x pricing. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode

1M

$30

$150

Vision · Tools / functions · Web search

May 2026

Qwen3.6 27b

Alibaba model (not yet curated).

131K

$0.6

$3

Tools / functions

May 2026

inclusionAI: Ring-2.6-1T

Ring-2.6-1T is a 1T-parameter-scale thinking model with 63B active parameters, built for real-world agent workflows that require both strong capability and operational efficiency. It is optimized for coding agents, tool...

262K

$0.08

$0.63

Reasoning · Tools / functions · Web search

May 2026

Gemini 3.1 Flash-Lite

Gemini 3.1 Flash Lite (Version: 3.1-flash-lite-05-2026, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

1.1M

$0.25

$1.5

Vision · Reasoning · Tools / functions · Web search

May 2026

OpenAI: GPT Chat Latest

GPT Chat Latest

400K

$5

$30

Vision · Tools / functions · Web search

May 2026

xAI: Grok 4.3

Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with text output, and is suited for agentic workflows, instruction-following tasks, and applications requiring high factual...

1M

$1.25

$2.5

Vision · Reasoning · Tools / functions · Web search

Apr 2026

mistral-medium-3.5

Official mistral-medium-latest Mistral AI model

262K

$1.5

$7.5

Vision · Tools / functions

Apr 2026

IBM: Granite 4.1 8B

Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from IBM, part of the Granite 4.1 family. It supports a 131K-token context window and is designed for enterprise tasks...

131K

$0.05

$0.1

Tools / functions · Web search

Apr 2026

Qwen3 VL Plus [2025 12 19]

Current vision-language model with strong visual reasoning and thinking. Tiered pricing by input length (up to 256K).

262K

$0.2

$1.6

Vision · Reasoning · Tools / functions

Apr 2026

Poolside: Laguna XS.2 (free) · 🎁

Laguna XS.2 is the second-generation model in the XS size class from [Poolside](https://poolside.ai/), their efficient coding agent series. It combines tool calling and reasoning capabilities with a compact footprint, offering...

262K

-

-

Reasoning · Tools / functions · Web search

Apr 2026

Poolside: Laguna M.1

Laguna M.1 is the flagship coding agent model from [Poolside](https://poolside.ai/), optimized for complex software engineering tasks. Designed for agentic coding workflows, it supports tool calling and reasoning, with a 256K...

262K

$0.2

$0.4

Reasoning · Tools / functions · Web search

Apr 2026

Owl Alpha · 🎁

Owl Alpha is a high-performance foundation model designed for agentic workloads. It natively supports tool use and long-context tasks, with strong performance in code generation, automated workflows, and complex instruction execution....

1M

-

-

Tools / functions · Web search

Apr 2026

NVIDIA: Nemotron 3 Nano Omni (free) · 🎁

NVIDIA Nemotron™ 3 Nano Omni is a 30B-A3B open multimodal model designed to function as a perception and context sub-agent in enterprise agent systems. It accepts text, image, video, and...

256K

-

-

Vision · Reasoning · Tools / functions · Web search

Apr 2026

mistral-medium-latest

Official mistral-medium-latest Mistral AI model

262K

$1.5

$7.5

Vision · Tools / functions

Apr 2026

Mistral Medium (latest)

Official mistral-medium-latest Mistral AI model

262K

$1.5

$7.5

Vision · Tools / functions

Apr 2026

Qwen3.6 Max Preview

Alibaba model (not yet curated).

131K

$1.04

$6.24

Tools / functions

Apr 2026

Qwen3.6 35b A3b

Alibaba model (not yet curated).

131K

$0.14

$1

Tools / functions

Apr 2026

Qwen3.5 Plus 2026 02 15

Alibaba model (not yet curated).

131K

$0.3

$1.8

Tools / functions

Apr 2026

OpenAI GPT Mini Latest

This model always redirects to the latest model in the OpenAI GPT Mini family.

400K

$0.75

$4.5

Vision · Reasoning · Tools / functions · Web search

Apr 2026

OpenAI GPT Latest

This model always redirects to the latest model in the OpenAI GPT family.

1.1M

$5

$30

Vision · Reasoning · Tools / functions · Web search

Apr 2026

MoonshotAI Kimi Latest

This model always redirects to the latest model in the MoonshotAI Kimi family.

262K

$0.55

$3.2

Vision · Reasoning · Tools / functions · Web search

Apr 2026

Google Gemini Pro Latest

This model always redirects to the latest model in the Google Gemini Pro family.

1M

$2

$12

Vision · Reasoning · Tools / functions · Web search

Apr 2026

Google Gemini Flash Latest

This model always redirects to the latest model in the Google Gemini Flash family.

1M

$1.5

$9

Vision · Reasoning · Tools / functions · Web search

Apr 2026

Anthropic Claude Sonnet Latest

This model always redirects to the latest model in the Anthropic Claude Sonnet family.

1M

$2

$10

Vision · Reasoning · Tools / functions · Web search

Apr 2026

Anthropic Claude Haiku Latest

This model always redirects to the latest model in the Anthropic Claude Haiku family.

200K

$1

$5

Vision · Reasoning · Tools / functions · Web search

Apr 2026

inclusionAI: Ling-2.6-1T

Ling-2.6-1T is an instant (instruct) model from inclusionAI and the company’s trillion-parameter flagship, designed for real-world agents that require fast execution and high efficiency at scale. It uses a “fast...

262K

$0.08

$0.63

Tools / functions · Web search

Apr 2026

GPT-5.5 Pro

Most capable model for complex tasks. Uses more compute for smarter, more precise responses on the hardest problems.

1.1M

$30

$180

Vision · Reasoning · Tools / functions · Web search · Image output

Apr 2026

GPT-5.5

New baseline for complex production workflows. Stronger task execution, more precise tool use, more efficient reasoning with fewer tokens. 1M token context.

1.1M

$5

$30

Vision · Reasoning · Tools / functions · Web search · Image output

Apr 2026

Xiaomi: MiMo-V2.5-Pro

MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in general agentic capabilities, complex software engineering, and long-horizon tasks, with top rankings on benchmarks such as ClawEval, GDPVal, and SWE-bench Pro....

1M

$0.44

$0.87

Reasoning · Tools / functions · Web search

Apr 2026

Xiaomi: MiMo-V2.5

MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in multimodal perception across image and video understanding...

1M

$0.11

$0.28

Vision · Reasoning · Tools / functions · Web search

Apr 2026

Tencent: Hy3 preview

Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agentic workflows and production use. It supports configurable reasoning levels across disabled, low, and high modes, allowing it to...

262K

$0.06

$0.21

Reasoning · Tools / functions · Web search

Apr 2026

Pareto Code Router

The Pareto Router maintains a tiered shortlist of strong coding models, ranked by [Artificial Analysis](https://artificialanalysis.ai/) coding percentiles. Set min_coding_score between 0 and 1 on the [pareto-router plugin](https://openrouter.ai/docs/guides/routing/routers/pare...

2M

-

-

Web search

Apr 2026

OpenAI: GPT-5.4 Image 2

[GPT-5.4](https://openrouter.ai/openai/gpt-5.4) Image 2 combines OpenAI's GPT-5.4 model with state-of-the-art image generation capabilities from GPT Image 2. It enables rich multimodal workflows, allowing users to seamlessly move between reasoning, coding, and...

272K

$8

$15

Vision · Reasoning · Web search · Image output

Apr 2026

inclusionAI: Ling-2.6-flash

Ling-2.6-flash is an instant (instruct) model from inclusionAI with 104B total parameters and 7.4B active parameters, designed for real-world agents that require fast responses, strong execution, and high token efficiency....

262K

$0.01

$0.03

Tools / functions · Web search

Apr 2026

Deep Research Preview (2026-04)

Preview release (April 21st, 2026) of Deep Research (Version: deepthink-exp-05-20, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens])

197K

$1.25

$10

Vision · Reasoning

Apr 2026

Deep Research Max Preview (2026-04)

Preview release (April 21st, 2026) of Deep Research Max (Version: deepthink-exp-05-20, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens])

197K

$1.25

$10

Vision · Reasoning

Apr 2026

Anthropic: Claude Opus Latest

This model always redirects to the latest model in the Claude Opus family.

1M

$5

$25

Vision · Reasoning · Tools / functions · Web search

Apr 2026

Kimi K2.6

Native multimodal flagship (text, image, video inputs) with thinking and non-thinking modes. Stronger long-form coding, improved instruction compliance and self-correction. 256K context.

262K

$0.95

$4

Vision · Tools / functions

Apr 2026

Claude Opus 4.7

Previous most capable model for complex reasoning and agentic coding

1M

$5

$25

Vision · Tools / functions · Web search

Apr 2026

Gemini 3.1 Flash TTS Preview

Gemini 3.1 Flash TTS Preview (Version: 3.1-flash-tts-preview, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,batchGenerateContent])

25K

$1

-

Vision · Audio output

Apr 2026

Gemini Robotics-ER 1.6 Preview

Gemini Robotics-ER 1.6 Preview (Version: 1.6-preview, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

197K

$1

$5

Vision · Reasoning · Tools / functions

Apr 2026

GLM-5.1

Z.ai flagship (744B MoE, 40B activated). Post-training upgrade over GLM-5 with stronger coding and long-horizon task autonomy. 200K context, thinking mode.

205K

$1.4

$4.4

Reasoning · Tools / functions

Apr 2026

Qwen3.6 Plus

Alibaba model (not yet curated).

131K

$0.33

$1.95

Tools / functions

Apr 2026

Gemma 4 31B IT

Gemma 4 31B IT (Version: 001, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens])

295K

-

-

Tools / functions

Apr 2026

Gemma 4 26B A4B IT

Gemma 4 26B A4B IT (Version: 001, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens])

295K

-

-

Tools / functions

Apr 2026

GLM-5V Turbo

First multimodal GLM-5 model. Vision-based coding agent with image/video/file inputs. 200K context, 128K output, thinking mode.

205K

$1.2

$4

Vision · Reasoning · Tools / functions

Apr 2026

Arcee AI: Trinity Large Thinking

Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI. It shows strong performance in PinchBench, agentic workloads, and reasoning tasks. Launch video: https://youtu.be/Gc82AXLa0Rg?si=4RLn6WBz33qT--B7...

262K

$0.25

$0.8

Reasoning · Tools / functions · Web search

Apr 2026

xAI: Grok 4.20 Multi-Agent

Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesize information...

2M

$1.25

$2.5

Vision · Reasoning · Web search

Mar 2026

xAI: Grok 4.20

Grok 4.20 is a reasoning model from xAI with industry-leading speed and agentic tool calling capabilities. It combines the lowest hallucination rate on the market with strict prompt adherence, delivering...

2M

$1.25

$2.5

Vision · Reasoning · Tools / functions · Web search

Mar 2026

Google: Lyria 3 Pro Preview · 🎁

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48kHz...

1M

-

-

Vision · Web search · Audio output

Mar 2026

Google: Lyria 3 Clip Preview · 🎁

30 second duration clips are priced at $0.04 per clip. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate...

1M

-

-

Vision · Web search · Audio output

Mar 2026

Kwaipilot: KAT-Coder-Pro V2

KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder series, designed for complex enterprise-grade software engineering and SaaS integration. It builds on the agentic coding strengths of earlier versions,...

256K

$0.3

$1.2

Tools / functions · Web search

Mar 2026

Reka Edge

Reka Edge is an extremely efficient 7B multimodal vision-language model that accepts image/video+text inputs and generates text outputs. This model is optimized specifically to deliver industry-leading performance in image understanding,...

16K

$0.1

$0.1

Vision · Tools / functions · Web search

Mar 2026

MiniMax: MiniMax M2.7

MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous improvement. Built to actively participate in its own evolution, M2.7 integrates advanced agentic capabilities through multi-agent...

205K

$0.18

$0.72

Reasoning · Tools / functions · Web search

Mar 2026

GPT-5.4 Nano

Cheapest GPT-5.4-class model for simple high-volume tasks like classification and data extraction.

400K

$0.2

$1.25

Vision · Reasoning · Tools / functions · Web search · Image output

Mar 2026

GPT-5.4 Mini

Strongest mini model for coding, computer use, and subagents. GPT-5.4-class intelligence at lower cost and latency.

400K

$0.75

$4.5

Vision · Reasoning · Tools / functions · Web search · Image output

Mar 2026

mistral-small-latest

Mistral Small 4.

262K

$0.15

$0.6

Vision · Tools / functions

Mar 2026

Mistral Small (2603)

Mistral Small 4.

262K

$0.15

$0.6

Vision · Tools / functions

Mar 2026

Leanstral (2603)

A mid- and post-trained version of Mistral Small 4 for Lean

197K

-

-

Vision · Tools / functions

Mar 2026

GLM-5 Turbo

Speed-optimized GLM-5 variant for agent workflows. Enhanced tool invocation and long-chain execution. 200K context, thinking mode.

205K

$1.2

$4

Reasoning · Tools / functions

Mar 2026

NVIDIA: Nemotron 3 Super (free) · 🎁

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer...

1M

-

-

Reasoning · Tools / functions · Web search

Mar 2026

Qwen: Qwen3.5-9B

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-parameter architecture. It uses a unified vision-language design...

262K

$0.1

$0.15

Vision · Reasoning · Tools / functions · Web search

Mar 2026

ByteDance Seed: Seed-2.0-Lite

Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers strong multimodal and agent capabilities while offering noticeably lower latency, making it a practical default choice for most production workloads across...

262K

$0.25

$2

Vision · Reasoning · Tools / functions · Web search

Mar 2026

GPT-5.4 Pro

Most capable model for complex tasks. Uses more compute for smarter, more precise responses on difficult problems.

1.1M

$30

$180

Vision · Reasoning · Tools / functions · Web search · Image output

Mar 2026

GPT-5.4

Most capable and efficient frontier model for professional work. Native computer use, improved reasoning, coding, and agentic workflows with 1M token context.

1.1M

$2.5

$15

Vision · Reasoning · Tools / functions · Web search · Image output

Mar 2026

Inception: Mercury 2

Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving...

128K

$0.25

$0.75

Reasoning · Tools / functions · Web search

Mar 2026

OpenAI: GPT-5.3 Chat

GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations smoother, more useful, and more directly helpful. It delivers more accurate answers with better contextualization and significantly...

128K

$1.75

$14

Vision · Tools / functions · Web search

Mar 2026

GPT-5.3 Instant

deprecated

GPT-5.3 Instant model, previously powering ChatGPT. Replaced by GPT-5.5 Instant.

128K

$1.75

$14

Vision · Tools / functions · Web search · Image output

Mar 2026

Gemini 3.1 Flash-Lite Preview

Gemini 3.1 Flash Lite Preview (Version: 3.1-flash-lite-preview-03-2026, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

1.1M

$0.25

$1.5

Vision · Reasoning · Tools / functions · Web search

Mar 2026

Nano Banana 2 Preview

Gemini 3.1 Flash Image Preview. (Version: 3.0, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,batchGenerateContent])

131K

$0.5

$3

Vision · Reasoning · Tools / functions · Web search · Image output

Feb 2026

ByteDance Seed: Seed-2.0-Mini

Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive scenarios, emphasizing fast response and flexible inference deployment. It delivers performance comparable to ByteDance-Seed-1.6, supports 256k context, four reasoning effort modes (minimal/low/mediu...

262K

$0.1

$0.4

Vision · Reasoning · Tools / functions · Web search

Feb 2026

Qwen3.5 35b A3b

Alibaba model (not yet curated).

131K

$0.14

$1

Tools / functions

Feb 2026

Qwen3.5 27b

Alibaba model (not yet curated).

131K

$0.2

$1.56

Tools / functions

Feb 2026

Qwen3.5 122b A10b

Alibaba model (not yet curated).

131K

$0.26

$2.08

Tools / functions

Feb 2026

Qwen: Qwen3.5-Flash

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the...

1M

$0.07

$0.26

Vision · Reasoning · Tools / functions · Web search

Feb 2026

LiquidAI: LFM2-24B-A2B

LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment. Built as a 24B parameter Mixture-of-Experts model with only 2B active parameters per...

128K

$0.03

$0.12

Web search

Feb 2026

GPT Audio 1.5

Best voice model for audio in, audio out with Chat Completions. Accepts audio inputs and outputs.

128K

$2.5

$10

Audio output

Feb 2026

AionLabs: Aion-2.0

Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and storytelling. It is particularly strong at introducing tension, crises, and conflict into stories, making narratives feel more engaging....

131K

$0.8

$1.6

Reasoning · Web search

Feb 2026

Gemini 3.1 Pro Preview (Custom Tools)

Gemini 3.1 Pro Preview optimized for custom tool usage (Version: 3.1-pro-preview-01-2026, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

1.1M

$2

$12

Vision · Reasoning · Tools / functions · Web search

Feb 2026

Gemini 3.1 Pro Preview

Gemini 3.1 Pro Preview (Version: 3.1-pro-preview-01-2026, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

1.1M

$2

$12

Vision · Reasoning · Tools / functions · Web search

Feb 2026

Claude Sonnet 4.6

Best combination of speed and intelligence for everyday tasks

1M

$3

$15

Vision · Tools / functions · Web search

Feb 2026

Qwen3.5 397b A17b

Alibaba model (not yet curated).

131K

$0.39

$2.45

Tools / functions

Feb 2026

Qwen: Qwen3.5 Plus 2026-02-15

The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference efficiency. In a variety of...

1M

$0.26

$1.56

Vision · Reasoning · Tools / functions · Web search

Feb 2026

MiniMax: MiniMax M2.5

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex real-world digital working environments, M2.5 builds upon the coding expertise of M2.1...

205K

$0.12

$0.48

Reasoning · Tools / functions · Web search

Feb 2026

GLM-5

Z.ai flagship foundation model (744B MoE, 40B activated). Designed for Agentic Engineering with SOTA coding and agent capabilities. 200K context, thinking mode.

205K

$1

$3.2

Reasoning · Tools / functions

Feb 2026

Qwen: Qwen3 Max Thinking

Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that require deep, multi-step reasoning. By significantly scaling model capacity and reinforcement learning compute, it...

262K

$0.78

$3.9

Reasoning · Tools / functions · Web search

Feb 2026

GPT-5.3 Codex

Most capable agentic coding model. Combines frontier coding performance of GPT-5.2-Codex with reasoning and professional knowledge of GPT-5.2. ~25% faster.

400K

$1.75

$14

Vision · Reasoning · Tools / functions · Web search · Image output

Feb 2026

Claude Opus 4.6

Previous most intelligent model for complex agents and coding, with adaptive thinking

1M

$5

$25

Vision · Tools / functions · Web search

Feb 2026

Qwen3 Coder Next

Alibaba model (not yet curated).

131K

$0.11

$0.8

Tools / functions

Feb 2026

GLM-OCR (Vision, OCR)

Specialized OCR model for text extraction from images and documents.

131K

$0.03

$0.03

Vision

Feb 2026

Free Models Router · 🎁

The simplest way to get free inference. openrouter/free is a router that selects free models at random from the models available on OpenRouter. The router smartly filters for models that...

200K

-

-

Vision · Reasoning · Tools / functions · Web search

Feb 2026

StepFun: Step 3.5 Flash

Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on a sparse Mixture of Experts (MoE) architecture, it selectively activates only 11B of its 196B parameters per token....

262K

$0.1

$0.3

Reasoning · Tools / functions · Web search

Jan 2026

Upstage: Solar Pro 3

Solar Pro 3 is Upstage's powerful Mixture-of-Experts (MoE) language model. With 102B total parameters and 12B active parameters per forward pass, it delivers exceptional performance while maintaining computational efficiency. Optimized...

128K

$0.15

$0.6

Reasoning · Tools / functions · Web search

Jan 2026

Kimi K2.5

Supports vision (images/videos), thinking mode, and Agent tasks. 256K context.

262K

$0.6

$3

Vision · Tools / functions

Jan 2026

MiniMax: MiniMax M2-her

MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, character-driven chat, and expressive multi-turn conversations. Designed to stay consistent in tone and personality, it supports rich message...

66K

$0.3

$1.2

Web search

Jan 2026

Writer: Palmyra X5

Palmyra X5 is Writer's most advanced model, purpose-built for building and scaling AI agents across the enterprise. It delivers industry-leading speed and efficiency on context windows up to 1 million...

1M

$0.6

$6

Web search

Jan 2026

LiquidAI: LFM2.5-1.2B-Thinking (free) · 🎁

LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agentic tasks, data extraction, and RAG—while still running comfortably on edge devices. It supports long context (up to 32K tokens) and is...

33K

-

-

Reasoning · Tools / functions · Web search

Jan 2026

LiquidAI: LFM2.5-1.2B-Instruct (free) · 🎁

LFM2.5-1.2B-Instruct is a compact, high-performance instruction-tuned model built for fast on-device AI. It delivers strong chat quality in a 1.2B parameter footprint, with efficient edge inference and broad runtime support.

33K

-

-

Web search

Jan 2026

GLM-4.7 FlashX

Fast GLM-4.7 variant with priority routing and higher concurrency. Same model as Flash, better infrastructure.

131K

$0.07

$0.4

Reasoning · Tools / functions

Jan 2026

GLM-4.7 Flash (Free)

Free GLM-4.7 variant. Same model as FlashX but with limited concurrency (1 concurrent request) and lower priority.

131K

-

-

Reasoning · Tools / functions

Jan 2026

Z.ai GLM 4.7 (Preview)

Z.ai GLM 4.7 (355B) on Cerebras (~1,000 tok/s). Strong agentic coding, advanced reasoning (on by default), superior tool use. 131K context, 40K max output.

131K

$2.25

$2.75

ReasoningTools / functions

Jan 2026

MiniMax: MiniMax M2.1

MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world...

205K

$0.29

$0.95

ReasoningTools / functionsWeb search

Dec 2025

ByteDance Seed: Seed 1.6 Flash

Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporting both text and visual understanding. It features a 256k context window and can generate outputs of...

262K

$0.08

$0.3

VisionReasoningTools / functionsWeb search

Dec 2025

ByteDance Seed: Seed 1.6

Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates multimodal capabilities and adaptive deep thinking with a 256K context window.

262K

$0.25

$2

VisionReasoningTools / functionsWeb search

Dec 2025

GLM-4.7

Latest-gen GLM model with 128K context. Thinking mode activated by default.

131K

$0.6

$2.2

ReasoningTools / functions

Dec 2025

Gemini 3 Flash Preview

Gemini 3 Flash Preview (Version: 3-flash-preview-12-2025, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

1.1M

$0.5

$3

VisionReasoningTools / functionsWeb search

Dec 2025

GPT Audio Mini

Cost-efficient audio model. Accepts audio input and produces audio output via the Chat Completions REST API.

128K

$0.6

$2.4

Audio output

Dec 2025

NVIDIA: Nemotron 3 Nano 30B A3B

NVIDIA Nemotron 3 Nano 30B A3B is a small MoE language model offering the highest compute efficiency and accuracy for developers building specialized agentic AI systems. The model is fully...

262K

$0.05

$0.2

ReasoningTools / functionsWeb search

Dec 2025

OpenAI: GPT-5.2 Chat

GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “think” on...

128K

$1.75

$14

VisionTools / functionsWeb search

Dec 2025

GPT-5.2 Pro

Smartest and most trustworthy option for difficult questions. Uses more compute for harder thinking on complex domains like programming.

400K

$21

$168

VisionReasoningTools / functionsWeb searchImage output

Dec 2025

GPT-5.2 Instant

deprecated

GPT-5.2 Instant model, previously powering ChatGPT. Replaced by GPT-5.5 Instant.

128K

$1.75

$14

VisionTools / functionsWeb searchImage output

Dec 2025

GPT-5.2 Codex

deprecated

GPT-5.2 optimized for long-horizon, agentic coding tasks in Codex or similar environments. Supports low, medium, high, and xhigh reasoning effort settings.

400K

$1.75

$14

VisionReasoningTools / functionsWeb searchImage output

Dec 2025

GPT-5.2

Most capable model for professional work and long-running agents. Improvements in general intelligence, long-context, agentic tool-calling, and vision.

400K

$1.75

$14

VisionReasoningTools / functionsWeb searchImage output

Dec 2025

Deep Research Pro Preview

Preview release (December 12th, 2025) of Deep Research Pro (Version: deepthink-exp-05-20, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens])

197K

$1.25

$10

VisionReasoning

Dec 2025

AutoGLM Phone

Mobile phone automation agent. Understands phone screens via multimodal perception and executes automated operations.

131K

-

-

Vision

Dec 2025

Devstral 2 (latest)

Official devstral-2512 Mistral AI model

262K

$0.4

$2

Tools / functions

Dec 2025

Devstral 2 (latest)

Official devstral-2512 Mistral AI model

262K

$0.4

$2

Tools / functions

Dec 2025

Devstral 2 (latest)

Official mistral-medium-latest Mistral AI model

262K

$0.4

$2

VisionTools / functions

Dec 2025

Devstral 2 (2512)

Official devstral-2512 Mistral AI model

262K

$0.4

$2

Tools / functions

Dec 2025

Relace: Relace Search

The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to explore a codebase and return files relevant to the user request. In contrast to RAG, relace-search performs agentic...

256K

$1

$3

Tools / functionsWeb search

Dec 2025

GLM-4.6 V FlashX

Fast vision GLM-4.6 with priority routing and higher concurrency. Image/video/file inputs, 32K output.

131K

$0.04

$0.4

VisionReasoningTools / functions

Dec 2025

GLM-4.6 V Flash (Free)

Free vision GLM-4.6. Same model as FlashX but with limited concurrency (1 concurrent request). Image/video/file inputs, 32K output.

131K

-

-

VisionReasoningTools / functions

Dec 2025

GLM-4.6 V

Vision-enabled GLM-4.6 model. Supports image/video/file inputs, 32K output, hybrid thinking.

131K

$0.3

$0.9

VisionReasoningTools / functions

Dec 2025

Body Builder (beta)

Transform your natural language requests into structured OpenRouter API request objects. Describe what you want to accomplish with AI models, and Body Builder will construct the appropriate API calls. Example:...

128K

-

-

Web search

Dec 2025

mistral-large-latest

Official mistral-large-2512 Mistral AI model

262K

$0.5

$1.5

VisionTools / functions

Dec 2025

Mistral Large (2512)

Official mistral-large-2512 Mistral AI model

262K

$0.5

$1.5

VisionTools / functions

Dec 2025

ministral-8b-latest

Ministral 3 (a.k.a. Tinystral) 8B Instruct.

262K

$0.15

$0.15

VisionTools / functions

Dec 2025

ministral-3b-latest

Ministral 3 (a.k.a. Tinystral) 3B Instruct.

131K

$0.1

$0.1

VisionTools / functions

Dec 2025

ministral-14b-latest

Ministral 3 (a.k.a. Tinystral) 14B Instruct.

262K

$0.2

$0.2

VisionTools / functions

Dec 2025

Ministral 8b (2512)

Ministral 3 (a.k.a. Tinystral) 8B Instruct.

262K

$0.15

$0.15

VisionTools / functions

Dec 2025

Ministral 3b (2512)

Ministral 3 (a.k.a. Tinystral) 3B Instruct.

131K

$0.1

$0.1

VisionTools / functions

Dec 2025

Ministral 14b (2512)

Ministral 3 (a.k.a. Tinystral) 14B Instruct.

262K

$0.2

$0.2

VisionTools / functions

Dec 2025

Amazon: Nova 2 Lite

Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos to generate text. Nova 2 Lite demonstrates standout capabilities in processing...

1M

$0.3

$2.5

VisionReasoningTools / functionsWeb search

Dec 2025

Arcee AI: Trinity Mini

Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 128 experts with 8 active per token. Engineered for efficient reasoning over long contexts (131k) with robust function...

131K

$0.05

$0.15

ReasoningTools / functionsWeb search

Dec 2025

Claude Opus 4.5

Previous most intelligent model with advanced reasoning for complex agentic workflows

200K

$5

$25

VisionTools / functionsWeb search

Nov 2025

AllenAI: Olmo 3 32B Think

Olmo 3 32B Think is a large-scale, 32-billion-parameter model purpose-built for deep reasoning, complex logic chains and advanced instruction-following scenarios. Its capacity enables strong performance on demanding evaluation tasks and...

66K

$0.15

$0.5

ReasoningWeb search

Nov 2025

Nano Banana Pro Preview

Gemini 3 Pro Image Preview (Version: 3.0, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,batchGenerateContent])

164K

$2

$12

VisionReasoningTools / functionsWeb searchImage output

Nov 2025

Nano Banana Pro

Gemini 3 Pro Image Preview (Version: 3.0, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,batchGenerateContent])

164K

$2

$12

VisionReasoningTools / functionsWeb searchImage output

Nov 2025

GPT-5.1 Codex Max

deprecated

Our most intelligent coding model optimized for long-horizon, agentic coding tasks.

400K

$1.25

$10

VisionReasoningTools / functionsWeb searchImage output

Nov 2025

GPT-5.1 Codex Mini

deprecated

Smaller, faster version of GPT-5.1 Codex for efficient coding tasks.

400K

$0.25

$2

VisionReasoningTools / functionsWeb searchImage output

Nov 2025

GPT-5.1 Codex

deprecated

A version of GPT-5.1 optimized for agentic coding tasks in Codex or similar environments.

400K

$1.25

$10

VisionReasoningTools / functionsWeb searchImage output

Nov 2025

GPT-5.1

The best model for coding and agentic tasks with configurable reasoning effort.

400K

$1.25

$10

VisionReasoningTools / functionsWeb searchImage output

Nov 2025

Deep Cogito: Cogito v2.1 671B

Cogito v2.1 671B MoE represents one of the strongest open models globally, matching performance of frontier closed and open models. This model is trained using self play with reinforcement learning...

128K

$1.25

$1.25

ReasoningWeb search

Nov 2025

OpenAI: GPT-5.1 Chat

GPT-5.1 Chat (AKA Instant) is the fast, lightweight member of the 5.1 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “think” on...

128K

$1.25

$10

VisionReasoningTools / functionsWeb search

Nov 2025

GPT-5.1 Instant

deprecated

GPT-5.1 Instant with adaptive reasoning. More conversational with improved instruction following.

128K

$1.25

$10

VisionReasoningTools / functionsWeb searchImage output

Nov 2025

MoonshotAI: Kimi K2 Thinking

Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trillion-parameter Mixture-of-Experts (MoE) architecture introduced in...

262K

$0.6

$2.5

ReasoningTools / functionsWeb search

Nov 2025

Amazon: Nova Premier 1.0

Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models.

1M

$2.5

$12.5

VisionTools / functionsWeb search

Oct 2025

Perplexity: Sonar Pro Search

Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity's most advanced agentic search system. It is designed for deeper reasoning and analysis. Pricing is based...

200K

$3

$15

VisionReasoningWeb search

Oct 2025

Mistral: Voxtral Small 24B 2507

Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translation and audio understanding. Input audio...

32K

$0.1

$0.3

Tools / functionsWeb search

Oct 2025

[OpenAI] GPT OSS Safeguard 20B (Preview)

OpenAI safety classification model (20B MoE). Purpose-built for content moderation with Harmony response format. 131K context, 65K max output. ~1000 t/s on Groq.

131K

$0.08

$0.3

Tools / functions

Oct 2025

NVIDIA: Nemotron Nano 12B 2 VL (free) · 🎁

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and document intelligence. It introduces a hybrid Transformer-Mamba architecture, combining transformer-level accuracy with Mamba’s...

128K

-

-

VisionReasoningTools / functionsWeb search

Oct 2025

Qwen: Qwen3 VL 32B Instruct

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text...

262K

$0.1

$0.42

VisionTools / functionsWeb search

Oct 2025

MiniMax: MiniMax M2

MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. With 10 billion activated parameters (230 billion total), it delivers near-frontier intelligence across general reasoning,...

205K

$0.26

$1

ReasoningTools / functionsWeb search

Oct 2025

IBM: Granite 4.0 Micro

Granite-4.0-H-Micro is a 3B-parameter model from the Granite 4 family of models. These models are the latest in a series of models released by IBM. They are fine-tuned for long...

131K

$0.02

$0.11

Web search

Oct 2025

Microsoft: Phi 4 Mini Instruct

Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites, with a focus on high-quality, reasoning-dense data. The model belongs to the Phi-4...

131K

$0.08

$0.35

Web search

Oct 2025

Claude Haiku 4.5

Fastest model with exceptional speed and performance

200K

$1

$5

VisionTools / functionsWeb search

Oct 2025

Qwen: Qwen3 VL 8B Thinking

Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visual and textual reasoning across complex scenes, documents, and temporal sequences. It integrates enhanced multimodal alignment and...

256K

$0.12

$1.37

VisionReasoningTools / functionsWeb search

Oct 2025

Qwen: Qwen3 VL 8B Instruct

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal fusion with Interleaved-MRoPE for long-horizon...

256K

$0.08

$0.5

VisionTools / functionsWeb search

Oct 2025

GPT-5 Search API

Updated web search model in Chat Completions API. 60% cheaper with domain filtering support.

400K

$1.25

$10

VisionWeb search

Oct 2025

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG, tool calling) via SFT across math, code, science, and...

131K

$0.4

$0.4

ReasoningTools / functionsWeb search

Oct 2025

Gemini 2.5 Computer Use Preview 10-2025

Gemini 2.5 Computer Use Preview 10-2025 (Version: Gemini 2.5 Computer Use Preview 10-2025, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens])

197K

$1.25

$10

VisionReasoningTools / functions

Oct 2025

Qwen: Qwen3 VL 30B A3B Thinking

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Thinking variant enhances reasoning in STEM, math, and complex tasks. It excels...

131K

$0.13

$1.56

VisionReasoningTools / functionsWeb search

Oct 2025

Qwen: Qwen3 VL 30B A3B Instruct

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general multimodal tasks. It excels in perception...

262K

$0.13

$0.52

VisionTools / functionsWeb search

Oct 2025

GPT-5 Pro

Version of GPT-5 that uses more compute to produce smarter and more precise responses. Designed for tough problems.

400K

$15

$120

VisionReasoningTools / functionsWeb searchImage output

Oct 2025

Nano Banana

Gemini 2.5 Flash Preview Image (Version: 2.0, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,batchGenerateContent])

66K

$0.3

$2.5

VisionImage output

Oct 2025

GLM-4.6

GLM-4.6 model with 128K context/output. Hybrid thinking: auto-determines whether to engage deep reasoning.

131K

$0.6

$2.2

ReasoningTools / functions

Sep 2025

DeepSeek: DeepSeek V3.2 Exp

DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectures. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism...

164K

$0.27

$0.41

ReasoningTools / functionsWeb search

Sep 2025

Claude Sonnet 4.5

Previous best combination of speed and intelligence for complex agents and coding

200K

$3

$15

VisionTools / functionsWeb search

Sep 2025

TheDrummer: Cydonia 24B V4.1

Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, prompt adherence, and intelligence.

131K

$0.3

$0.5

Web search

Sep 2025

Relace: Relace Apply 3

Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits straight into your source files. It can apply updates from GPT-4o, Claude, and others into your files at...

256K

$0.85

$1.25

Web search

Sep 2025

Google: Gemini 2.5 Flash Lite Preview 09-2025

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...

1M

$0.1

$0.4

VisionReasoningTools / functionsWeb search

Sep 2025

Qwen3 Vl 235b A22b Thinking

Alibaba model (not yet curated).

131K

$0.26

$2.6

Tools / functions

Sep 2025

Qwen3 Vl 235b A22b Instruct

Alibaba model (not yet curated).

131K

$0.2

$0.88

Tools / functions

Sep 2025

Qwen3 Max

Alibaba model (not yet curated).

131K

$0.78

$3.9

Tools / functions

Sep 2025

DeepSeek: DeepSeek V3.1 Terminus

DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that maintains the model's original capabilities while addressing issues reported by users, including language consistency and agent capabilities, further optimizing the model's...

164K

$0.27

$0.95

ReasoningTools / functionsWeb search

Sep 2025

Qwen3 Coder Flash

Alibaba model (not yet curated).

131K

$0.2

$0.98

Tools / functions

Sep 2025

magistral-small-latest

Mistral Small 4.

262K

$0.5

$1.5

VisionReasoningTools / functions

Sep 2025

magistral-medium-latest

Our frontier-class reasoning model (release candidate, September 2025).

131K

$2

$5

VisionReasoningTools / functions

Sep 2025

Magistral Small (2509)

Our efficient reasoning model released September 2025.

131K

$0.5

$1.5

VisionReasoningTools / functions

Sep 2025

Magistral Medium (2509)

Our frontier-class reasoning model (release candidate, September 2025).

131K

$2

$5

VisionReasoningTools / functions

Sep 2025

GPT-5 Codex

deprecated

A version of GPT-5 optimized for agentic coding in Codex.

400K

$1.25

$10

VisionReasoningTools / functionsWeb search

Sep 2025

Qwen3 Next 80b A3b Thinking

Alibaba model (not yet curated).

131K

$0.1

$0.78

Tools / functions

Sep 2025

Qwen3 Next 80b A3b Instruct

Alibaba model (not yet curated).

131K

$0.09

$1.1

Tools / functions

Sep 2025

Qwen Plus

Balanced quality, speed, and cost with hybrid thinking. 1M context.

1M

$0.4

$1.2

ReasoningTools / functions

Sep 2025

NVIDIA: Nemotron Nano 9B V2 (free) · 🎁

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and...

128K

-

-

ReasoningTools / functionsWeb search

Sep 2025

MoonshotAI: Kimi K2 0905

Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32...

262K

$0.6

$2.5

Tools / functionsWeb search

Sep 2025

[Groq] Compound Mini (Agentic System)

Lighter Groq agentic AI with web search, code execution. Pricing based on underlying model usage.

131K

-

-

Tools / functions

Sep 2025

[Groq] Compound (Agentic System)

Groq agentic AI with web search, code execution, browser automation. Uses GPT-OSS 120B, Llama 4 Scout, Llama 3.3 70B. Pricing based on underlying model usage.

131K

-

-

Tools / functions

Sep 2025

Qwen3 30b A3b Thinking 2507

Alibaba model (not yet curated).

131K

$0.08

$0.4

Tools / functions

Aug 2025

GPT Audio

First generally available audio model. Accepts audio input and produces audio output, and can be used in the Chat Completions REST API.

128K

$2.5

$10

Audio output

Aug 2025

Nous: Hermes 4 70B

Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B. It introduces the same hybrid mode as the larger 405B release, allowing the model to either...

131K

$0.13

$0.4

ReasoningWeb search

Aug 2025

Nous: Hermes 4 405B

Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and released by Nous Research. It introduces a hybrid reasoning mode, where the model can choose to deliberate internally with...

131K

$1

$3

ReasoningWeb search

Aug 2025

DeepSeek: DeepSeek V3.1

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. It extends the DeepSeek-V3 base with a two-phase long-context...

164K

$0.21

$0.79

ReasoningTools / functionsWeb search

Aug 2025

Mistral: Mistral Medium 3.1

Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost. It balances...

131K

$0.4

$2

VisionTools / functionsWeb search

Aug 2025

Mistral Medium (2508)

Update on Mistral Medium 3 with improved capabilities.

131K

$0.4

$2

VisionTools / functions

Aug 2025

GLM-4.5 V

Vision-enabled GLM-4.5 model. 96K context, 16K output, interleaved thinking.

98K

$0.6

$1.8

VisionReasoningTools / functions

Aug 2025

AI21: Jamba Large 1.7

Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in grounding, instruction-following, and overall efficiency. Built on a hybrid SSM-Transformer architecture with a 256K context...

256K

$2

$8

Tools / functionsWeb search

Aug 2025

OpenAI: GPT-5 Chat

GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations for enterprise applications.

128K

$1.25

$10

VisionTools / functionsWeb search

Aug 2025

GPT-5 Nano

Fastest, most cost-efficient version of GPT-5 for summarization and classification tasks.

400K

$0.05

$0.4

VisionReasoningTools / functionsWeb searchImage output

Aug 2025

GPT-5 Mini

A faster, more cost-efficient version of GPT-5 for well-defined tasks.

400K

$0.25

$2

VisionReasoningTools / functionsWeb searchImage output

Aug 2025

GPT-5 ChatGPT

deprecated

GPT-5 model used in ChatGPT.

128K

$1.25

$10

VisionTools / functionsWeb searchImage output

Aug 2025

GPT-5

The best model for coding and agentic tasks across domains.

400K

$1.25

$10

VisionReasoningTools / functionsWeb searchImage output

Aug 2025

GPT OSS 120B

OpenAI flagship open-weight MoE (120B total, 5.1B active) on Cerebras (~3,000 tok/s). Reasoning (default medium effort) and function calling. 131K context, 40K max output.

131K

$0.35

$0.75

ReasoningTools / functions

Aug 2025

Claude Opus 4.1

deprecated

Previous Opus model. Deprecated June 5, 2026, retiring August 5, 2026.

200K

$15

$75

VisionTools / functionsWeb search

Aug 2025

[OpenAI] GPT OSS 20B

OpenAI efficient open-weight MoE (20B total, 3.6B active). Tool use, browser search, code execution. 131K context, 65K max output. ~1000 t/s on Groq.

131K

$0.08

$0.3

Tools / functions

Aug 2025

Qwen: Qwen3 Coder 30B A3B Instruct

Qwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model with 128 experts (8 active per forward pass), designed for advanced code generation, repository-scale understanding, and agentic tool use. Built on the...

160K

$0.07

$0.27

Tools / functionsWeb search

Jul 2025

codestral-latest

Our cutting-edge language model for coding released August 2025.

256K

$0.3

$0.9

Tools / functions

Jul 2025

Codestral (2508)

Our cutting-edge language model for coding released August 2025.

256K

$0.3

$0.9

Tools / functions

Jul 2025

Qwen3 30b A3b Instruct 2507

Alibaba model (not yet curated).

131K

$0.05

$0.19

Tools / functions

Jul 2025

GLM-4.5 X

Extended GLM-4.5 model. Interleaved thinking.

98K

$2.2

$8.9

ReasoningTools / functions

Jul 2025

GLM-4.5 Flash (Free)

Free GLM-4.5 variant with limited concurrency. Prior-gen, superseded by GLM-4.7 Flash.

98K

-

-

ReasoningTools / functions

Jul 2025

GLM-4.5 AirX

Extended lightweight GLM-4.5 variant. Interleaved thinking.

98K

$1.1

$4.5

ReasoningTools / functions

Jul 2025

GLM-4.5 Air

Lightweight GLM-4.5 variant. Interleaved thinking.

98K

$0.2

$1.1

ReasoningTools / functions

Jul 2025

GLM-4.5

Prior-gen GLM-4.5 model with 96K context/output. Interleaved thinking.

98K

$0.6

$2.2

ReasoningTools / functions

Jul 2025

Qwen3 235b A22b Thinking 2507

Alibaba model (not yet curated).

131K

$0.15

$1.5

Tools / functions

Jul 2025

Qwen: Qwen3 Coder 480B A35B

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as function calling, tool use, and long-context reasoning over...

1M

$0.22

$1.8

Tools / functionsWeb search

Jul 2025

Gemini 2.5 Flash-Lite

Stable version of Gemini 2.5 Flash-Lite, released in July of 2025 (Version: 001, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

1.1M

$0.1

$0.4

VisionReasoningTools / functionsWeb search

Jul 2025

ByteDance: UI-TARS 7B

UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, web browsers, mobile systems, and games. Built by ByteDance, it builds upon the UI-TARS framework with reinforcement...

128K

$0.1

$0.2

VisionWeb search

Jul 2025

Qwen: Qwen3 235B A22B Instruct 2507

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass. It is optimized for general-purpose text generation, including instruction following,...

262K

$0.09

$0.1

Tools / functionsWeb search

Jul 2025

voxtral-small-latest

A small audio understanding model released in July 2025

33K

$0.1

$0.3

Tools / functions

Jul 2025

voxtral-mini-latest

A mini audio understanding model released in July 2025

33K

$0.04

$0.04

-

Jul 2025

Voxtral Small (2507)

A small audio understanding model released in July 2025

33K

$0.1

$0.3

Tools / functions

Jul 2025

Voxtral Mini (2507)

A mini audio understanding model released in July 2025

33K

$0.04

$0.04

-

Jul 2025

Switchpoint Router

Switchpoint AI's router instantly analyzes your request and directs it to the optimal AI from an ever-evolving library. As the world of LLMs advances, our router gets smarter, ensuring you...

131K

$0.85

$3.4

ReasoningWeb search

Jul 2025

MoonshotAI: Kimi K2 0711

Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It is optimized for...

131K

$0.57

$2.3

Tools / functionsWeb search

Jul 2025

Venice: Uncensored (free) · 🎁

Venice Uncensored Dolphin Mistral 24B Venice Edition is a fine-tuned variant of Mistral-Small-24B-Instruct-2501, developed by dphn.ai in collaboration with Venice.ai. This model is designed as an “uncensored” instruct-tuned LLM, preserving...

33K

-

-

Web search

Jul 2025

Tencent: Hunyuan A13B Instruct

Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model developed by Tencent, with a total parameter count of 80B and support for reasoning via Chain-of-Thought. It offers competitive benchmark...

131K

$0.14

$0.57

ReasoningWeb search

Jul 2025

Morph: Morph V3 Large

Morph's high-accuracy apply model for complex code edits. ~4,500 tokens/sec with 98% accuracy for precise code transformations. The model requires the prompt to be in the following format: <instruction>{instruction}</instruction> <code>{initial_code}</code>...

262K

$0.9

$1.9

Web search

Jul 2025

Morph: Morph V3 Fast

Morph's fastest apply model for code edits. ~10,500 tokens/sec with 96% accuracy for rapid code transformations. The model requires the prompt to be in the following format: <instruction>{instruction}</instruction> <code>{initial_code}</code> <update>{edit_snippet}</update>...

82K

$0.8

$1.2

Web search

Jul 2025

Baidu: ERNIE 4.5 VL 424B A47B

ERNIE-4.5-VL-424B-A47B is a multimodal Mixture-of-Experts (MoE) model from Baidu’s ERNIE 4.5 series, featuring 424B total parameters with 47B active per token. It is trained jointly on text and image data...

131K

$0.42

$1.25

VisionReasoningWeb search

Jun 2025

o4 Mini Deep Research

deprecated

Faster, more affordable deep research model for complex, multi-step research tasks.

200K

$2

$8

VisionReasoningTools / functionsWeb search

Jun 2025

o3 Deep Research

deprecated

Our most powerful deep research model for complex, multi-step research tasks.

200K

$10

$40

VisionReasoningTools / functionsWeb search

Jun 2025

Mistral: Mistral Small 3.2 24B

Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimized for instruction following, repetition reduction, and improved function calling. Compared to the 3.1 release, version 3.2 significantly improves accuracy on...

128K

$0.08

$0.2

VisionTools / functionsWeb search

Jun 2025

Mistral Small (2506)

Our enterprise-grade small model, with the latest version released June 2025.

131K

$0.1

$0.3

VisionTools / functions

Jun 2025

MiniMax: MiniMax M1

MiniMax-M1 is a large-scale, open-weight reasoning model designed for extended context and high-efficiency inference. It leverages a hybrid Mixture-of-Experts (MoE) architecture paired with a custom "lightning attention" mechanism, allowing it...

1M

$0.4

$2.2

ReasoningTools / functionsWeb search

Jun 2025

Gemini 2.5 Pro

Stable release (June 17th, 2025) of Gemini 2.5 Pro (Version: 2.5, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

1.1M

$1.25

$10

VisionReasoningTools / functionsWeb search

Jun 2025

Gemini 2.5 Flash

Stable version of Gemini 2.5 Flash, our mid-size multimodal model that supports up to 1 million tokens, released in June of 2025. (Version: 001, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

1.1M

$0.3

$2.5

VisionReasoningTools / functionsWeb search

Jun 2025

o3 Pro

Version of o3 with more compute for better responses. Provides consistently better answers for complex tasks.

200K

$20

$80

VisionReasoningTools / functionsWeb searchImage output

Jun 2025

DeepSeek: R1 0528

May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1). Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active...

164K

$0.5

$2.15

ReasoningTools / functionsWeb search

May 2025

Anthropic: Claude Sonnet 4

Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. Achieving state-of-the-art performance on SWE-bench (72.7%),...

1M

$3

$15

VisionTools / functionsWeb search

May 2025

Anthropic: Claude Opus 4

Claude Opus 4 is benchmarked as the world’s best coding model, at time of release, bringing sustained performance on complex, long-running tasks and agent workflows. It sets new benchmarks in...

200K

$15

$75

VisionTools / functionsWeb search

May 2025

Google: Gemma 3n 4B

Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as phones, laptops, and tablets. It supports multimodal inputs—including text, visual data, and audio—enabling diverse tasks...

33K

$0.06

$0.12

Web search

May 2025

Google: Gemini 2.5 Pro Preview 06-05

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...

1M

$1.25

$10

VisionReasoningTools / functionsWeb search

May 2025

Gemini 2.5 Pro Preview TTS

Gemini 2.5 Pro Preview TTS (Version: gemini-2.5-pro-preview-tts-2025-05-19, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[countTokens,generateContent,batchGenerateContent])

25K

$1

-

VisionAudio output

May 2025

Gemini 2.5 Flash Preview TTS

Gemini 2.5 Flash Preview TTS (Version: gemini-2.5-flash-exp-tts-2025-05-19, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[countTokens,generateContent])

25K

$0.5

-

Vision · Audio output

May 2025

mistral-medium-3

Official mistral-medium-latest Mistral AI model

262K

$0.4

$2

Vision · Tools / functions

May 2025

Mistral Medium (2505)

Our frontier-class multimodal model released May 2025.

131K

$0.4

$2

Vision · Tools / functions

May 2025

Google: Gemini 2.5 Pro Preview 05-06

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...

1M

$1.25

$10

Vision · Reasoning · Tools / functions · Web search

May 2025

Arcee AI: Virtuoso Large

Virtuoso‑Large is Arcee's top‑tier general‑purpose LLM at 72 B parameters, tuned to tackle cross‑domain reasoning, creative writing and enterprise QA. Unlike many 70 B peers, it retains the 128 k...

131K

$0.75

$1.2

Tools / functions · Web search

May 2025

Arcee AI: Coder Large

Coder‑Large is a 32 B‑parameter offspring of Qwen 2.5‑Instruct that has been further trained on permissively‑licensed GitHub, CodeSearchNet and synthetic bug‑fix corpora. It supports a 32k context window, enabling multi‑file...

33K

$0.5

$0.8

Web search

May 2025

Meta: Llama Guard 4 12B

Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM...

164K

$0.18

$0.18

Vision · Web search

Apr 2025

Qwen3 8b

Alibaba model (not yet curated).

131K

$0.05

$0.4

Tools / functions

Apr 2025

Qwen3 32b

Alibaba model (not yet curated).

131K

$0.29

$0.59

Tools / functions

Apr 2025

Qwen3 30b A3b

Alibaba model (not yet curated).

131K

$0.12

$0.5

Tools / functions

Apr 2025

Qwen3 235b A22b

Alibaba model (not yet curated).

131K

$0.46

$1.82

Tools / functions

Apr 2025

Qwen3 14b

Alibaba model (not yet curated).

131K

$0.1

$0.24

Tools / functions

Apr 2025

o4 Mini

deprecated

Latest o4-mini model. Optimized for fast, effective reasoning with exceptionally efficient performance in coding and visual tasks.

200K

$1.1

$4.4

Vision · Reasoning · Tools / functions

Apr 2025

o3

A well-rounded and powerful model across domains. Sets a new standard for math, science, coding, and visual reasoning tasks.

200K

$2

$8

Vision · Reasoning · Tools / functions

Apr 2025

GPT-4.1 Nano

Fastest, most cost-effective GPT 4.1 model. Delivers exceptional performance with low latency, ideal for tasks like classification or autocompletion.

1M

$0.1

$0.4

Vision · Tools / functions

Apr 2025

GPT-4.1 Mini

Balanced for intelligence, speed, and cost. Matches or exceeds GPT-4o in intelligence while reducing latency by nearly half and cost by 83%.

1M

$0.4

$1.6

Vision · Tools / functions

Apr 2025

GPT-4.1

Flagship GPT model for complex tasks. Major improvements on coding, instruction following, and long context with 1M token context window.

1M

$2

$8

Vision · Tools / functions

Apr 2025

GLM-4 32B (0414) 128K

GLM-4 32B model with 128K context, 16K output.

131K

$0.1

$0.1

Tools / functions

Apr 2025

Meta: Llama 4 Scout

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input...

10M

$0.1

$0.3

Vision · Tools / functions · Web search

Apr 2025

Meta: Llama 4 Maverick

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward...

1M

$0.15

$0.6

Vision · Tools / functions · Web search

Apr 2025

[Meta] Llama 4 Scout · 17B × 16E (Preview)

Llama 4 Scout 17B MoE with 16 experts (109B total params), native multimodal with vision support. 131K context, 8K max output. ~750 t/s on Groq.

131K

$0.11

$0.34

Tools / functions

Apr 2025

DeepSeek: DeepSeek V3 0324

DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the flagship chat model family from the DeepSeek team. It succeeds the [DeepSeek V3](/deepseek/deepseek-chat-v3) model and performs really well...

164K

$0.2

$0.77

Tools / functions · Web search

Mar 2025

o1 Pro

A version of o1 with more compute for better responses. Provides consistently better answers for complex tasks.

200K

$150

$600

Vision · Reasoning · Tools / functions

Mar 2025

Mistral: Mistral Small 3.1 24B

Mistral Small 3.1 24B Instruct is an upgraded variant of Mistral Small 3 (2501), featuring 24 billion parameters with advanced multimodal capabilities. It provides state-of-the-art performance in text-based reasoning and...

128K

$0.35

$0.56

Vision · Web search

Mar 2025

Google: Gemma 3 4B

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...

131K

$0.05

$0.1

Vision · Web search

Mar 2025

Google: Gemma 3 12B

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...

131K

$0.05

$0.15

Vision · Tools / functions · Web search

Mar 2025

Cohere: Command A

Command A is an open-weights 111B parameter model with a 256k context window focused on delivering great performance across agentic, multilingual, and coding use cases. Compared to other leading proprietary...

256K

$2.5

$10

Web search

Mar 2025

Reka Flash 3

Reka Flash 3 is a general-purpose, instruction-tuned large language model with 21 billion parameters, developed by Reka. It excels at general chat, coding tasks, instruction-following, and function calling. Featuring a...

66K

$0.1

$0.2

Reasoning · Web search

Mar 2025

Google: Gemma 3 27B

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...

131K

$0.08

$0.16

Vision · Tools / functions · Web search

Mar 2025

GPT-4o Search Preview

Latest snapshot of the GPT-4o model optimized for web search capabilities.

128K

$2.5

$10

Web search

Mar 2025

GPT-4o Mini Search Preview

deprecated

Latest snapshot of the GPT-4o Mini model optimized for web search capabilities.

128K

$0.15

$0.6

Web search

Mar 2025

TheDrummer: Skyfall 36B V2

Skyfall 36B v2 is an enhanced iteration of Mistral Small 2501, specifically fine-tuned for improved creativity, nuanced writing, role-playing, and coherent storytelling.

33K

$0.55

$0.8

Web search

Mar 2025

Perplexity: Sonar Reasoning Pro

Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perplexity.ai/guides/pricing#detailed-pricing-breakdown-for-sonar-reasoning-pro-and-sonar-pro). Sonar Reasoning Pro is a premier reasoning model powered by DeepSeek R1 with Chain of Thou...

128K

$2

$8

Vision · Reasoning · Web search

Mar 2025

Perplexity: Sonar Pro

Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perplexity.ai/guides/pricing#detailed-pricing-breakdown-for-sonar-reasoning-pro-and-sonar-pro). For enterprises seeking more advanced capabilities, the Sonar Pro API can handle in-depth,...

200K

$3

$15

Vision · Web search

Mar 2025

Perplexity: Sonar Deep Research

Sonar Deep Research is a research-focused model designed for multi-step retrieval, synthesis, and reasoning across complex topics. It autonomously searches, reads, and evaluates sources, refining its approach as it gathers...

128K

$2

$8

Reasoning · Web search

Mar 2025

Mistral: Saba

Mistral Saba is a 24B-parameter language model specifically designed for the Middle East and South Asia, delivering accurate and contextually relevant responses while maintaining efficient performance. Trained on curated regional...

33K

$0.2

$0.6

Tools / functions · Web search

Feb 2025

Gemini 2.0 Flash 001

Stable version of Gemini 2.0 Flash, our fast and versatile multimodal model for scaling across diverse tasks, released in January of 2025. (Version: 2.0, Defaults: temperature=1, topP=0.95, topK=40, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

1.1M

$0.1

$0.4

Vision · Tools / functions

Feb 2025

AionLabs: Aion-RP 1.0 (8B)

Aion-RP-Llama-3.1-8B ranks the highest in the character evaluation portion of the RPBench-Auto benchmark, a roleplaying-specific variant of Arena-Hard-Auto, where LLMs evaluate each other’s responses. It is a fine-tuned base model...

33K

$0.8

$1.6

Web search

Feb 2025

AionLabs: Aion-1.0-Mini

Aion-1.0-Mini 32B parameter model is a distilled version of the DeepSeek-R1 model, designed for strong performance in reasoning domains such as mathematics, coding, and logic. It is a modified variant...

131K

$0.7

$1.4

Reasoning · Web search

Feb 2025

AionLabs: Aion-1.0

Aion-1.0 is a multi-model system designed for high performance across various tasks, including reasoning and coding. It is built on DeepSeek-R1, augmented with additional models and techniques such as Tree...

131K

$4

$8

Reasoning · Web search

Feb 2025

Qwen: Qwen2.5 VL 72B Instruct

Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capable of analyzing texts, charts, icons, graphics, and layouts within images.

131K

$0.8

$1

Vision · Web search

Feb 2025

o3 Mini

Latest o3-mini model snapshot. High intelligence at the same cost and latency targets of o1-mini. Excels at science, math, and coding tasks.

200K

$1.1

$4.4

Reasoning · Tools / functions

Jan 2025

Mistral: Mistral Small 3

Mistral Small 3 is a 24B-parameter language model optimized for low-latency performance across common AI tasks. Released under the Apache 2.0 license, it features both pre-trained and instruction-tuned versions designed...

33K

$0.05

$0.08

Web search

Jan 2025

Perplexity: Sonar

Sonar is lightweight, affordable, fast, and simple to use — now featuring citations and the ability to customize sources. It is designed for companies seeking to integrate lightweight question-and-answer features...

127K

$1

$1

Vision · Web search

Jan 2025

DeepSeek: R1 Distill Llama 70B

DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). The model combines advanced distillation techniques to achieve high performance acr...

128K

$0.8

$0.8

Reasoning · Web search

Jan 2025

DeepSeek: R1

DeepSeek R1 is here: Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass....

164K

$0.7

$2.5

Reasoning · Tools / functions · Web search

Jan 2025

V1 8K Vision (Preview)

Legacy vision model with 8K context. Preview variant - use moonshot-v1-vision for production.

8K

$0.2

$2

Vision

Jan 2025

V1 32K Vision (Preview)

Legacy vision model with 32K context. Preview variant - use moonshot-v1-vision for production.

33K

$1

$3

Vision

Jan 2025

V1 128K Vision (Preview)

Legacy vision model with 128K context. Preview variant - use moonshot-v1-vision for production.

131K

$2

$5

Vision

Jan 2025

MiniMax: MiniMax-01

MiniMax-01 combines MiniMax-Text-01 for text generation and MiniMax-VL-01 for image understanding. It has 456 billion parameters, with 45.9 billion parameters activated per inference, and can handle a context...

1M

$0.2

$1.1

Vision · Web search

Jan 2025

Microsoft: Phi 4

Phi-4 is designed to perform well in complex reasoning tasks and can operate efficiently in situations with limited memory or where quick responses are needed. At 14 billion...

16K

$0.07

$0.14

Web search

Jan 2025

Sao10K: Llama 3.1 70B Hanami x1

This is [Sao10K](/sao10k)'s experiment over [Euryale v2.2](/sao10k/l3.1-euryale-70b).

16K

$3

$3

Web search

Jan 2025

DeepSeek: DeepSeek V3

DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction following and coding abilities of the previous versions. Pre-trained on nearly 15 trillion tokens, the reported evaluations...

131K

$0.2

$0.8

Tools / functions · Web search

Dec 2024

Sao10K: Llama 3.3 Euryale 70B

Euryale L3.3 70B is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). It is the successor of [Euryale L3 70B v2.2](/models/sao10k/l3-euryale-70b).

131K

$0.65

$0.75

Web search

Dec 2024

o1

Previous full o-series reasoning model.

200K

$15

$60

Vision · Reasoning · Tools / functions

Dec 2024

Cohere: Command R7B (12-2024)

Command R7B (12-2024) is a small, fast update of the Command R+ model, delivered in December 2024. It excels at RAG, tool use, agents, and similar tasks requiring complex reasoning...

128K

$0.04

$0.15

Web search

Dec 2024

Meta: Llama 3.3 70B Instruct (free) · 🎁

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out). The Llama 3.3 instruction tuned text only model...

131K

-

-

Tools / functions · Web search

Dec 2024

[Meta] Llama 3.3 · 70B Versatile

Meta Llama 3.3 (70B params) with GQA. Strong reasoning, coding, multilingual. 131K context, 32K max output. ~280 t/s on Groq.

131K

$0.59

$0.79

Tools / functions

Dec 2024

Amazon: Nova Pro 1.0

Amazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providing a combination of accuracy, speed, and cost for a wide range of tasks. As of December...

300K

$0.8

$3.2

Vision · Tools / functions · Web search

Dec 2024

Amazon: Nova Micro 1.0

Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in the Amazon Nova family of models at a very low cost. With a context length...

128K

$0.04

$0.14

Tools / functions · Web search

Dec 2024

Amazon: Nova Lite 1.0

Amazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon focused on fast processing of image, video, and text inputs to generate text output. Amazon Nova Lite...

300K

$0.06

$0.24

Vision · Tools / functions · Web search

Dec 2024

Mistral Large 2407

This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407). It's a proprietary weights-available model and excels at reasoning, code, JSON, chat, and more. Read the launch announcement [here](https://mistral.ai/news/mistral-large-2407/)....

131K

$2

$6

Tools / functions · Web search

Nov 2024

Qwen2.5 Coder 32B Instruct

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). Qwen2.5-Coder brings the following improvements upon CodeQwen1.5: - Significant improvements in **code generation**, **code reasoning**...

128K

$0.66

$1

Web search

Nov 2024

TheDrummer: UnslopNemo 12B

UnslopNemo v4.1 is the latest addition from the creator of Rocinante, designed for adventure writing and role-play scenarios.

33K

$0.4

$0.4

Tools / functions · Web search

Nov 2024

Magnum v4 72B

This is a series of models designed to replicate the prose quality of the Claude 3 models, specifically [Sonnet](https://openrouter.ai/anthropic/claude-3.5-sonnet) and [Opus](https://openrouter.ai/anthropic/claude-3-opus). The model is fine-tuned on top of [Qwen2.5 72B](https://o...

33K

$3

$5

Web search

Oct 2024

Qwen: Qwen2.5 7B Instruct

Qwen2.5 7B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: - Significantly more knowledge and has greatly improved capabilities in coding and...

131K

$0.04

$0.1

Tools / functions · Web search

Oct 2024

Inflection: Inflection 3 Productivity

Inflection 3 Productivity is optimized for following instructions. It is better for tasks requiring JSON output or precise adherence to provided guidelines. It has access to recent news. For emotional...

8K

$2.5

$10

Web search

Oct 2024

Inflection: Inflection 3 Pi

Inflection 3 Pi powers Inflection's [Pi](https://pi.ai) chatbot, including backstory, emotional intelligence, productivity, and safety. It has access to recent news, and excels in scenarios like customer support and roleplay. Pi...

8K

$2.5

$10

Web search

Oct 2024

TheDrummer: Rocinante 12B

Rocinante 12B is designed for engaging storytelling and rich prose. Early testers have reported: - Expanded vocabulary with unique and expressive word choices - Enhanced creativity for vivid narratives -...

33K

$0.25

$0.5

Web search

Sep 2024

Meta: Llama 3.2 3B Instruct (free) · 🎁

Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization. Designed with the latest transformer architecture, it...

131K

-

-

Web search

Sep 2024

Meta: Llama 3.2 1B Instruct

Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently performing natural language tasks, such as summarization, dialogue, and multilingual text analysis. Its smaller size allows it to operate...

131K

$0.03

$0.2

Web search

Sep 2024

Meta: Llama 3.2 11B Vision Instruct

Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks combining visual and textual data. It excels in tasks such as image captioning and...

131K

$0.35

$0.35

Vision · Web search

Sep 2024

Qwen2.5 72B Instruct

Qwen2.5 72B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: - Significantly more knowledge and has greatly improved capabilities in coding and...

131K

$0.36

$0.4

Tools / functions · Web search

Sep 2024

Cohere: Command R+ (08-2024)

command-r-plus-08-2024 is an update of the [Command R+](/models/cohere/command-r-plus) with roughly 50% higher throughput and 25% lower latencies as compared to the previous Command R+ version, while keeping the hardware footprint...

128K

$2.5

$10

Tools / functions · Web search

Aug 2024

Cohere: Command R (08-2024)

command-r-08-2024 is an update of the [Command R](/models/cohere/command-r) with improved performance for multilingual retrieval-augmented generation (RAG) and tool use. More broadly, it is better at math, code and reasoning and...

128K

$0.15

$0.6

Tools / functions · Web search

Aug 2024

Sao10K: Llama 3.1 Euryale 70B v2.2

Euryale L3.1 70B v2.2 is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). It is the successor of [Euryale L3 70B v2.1](/models/sao10k/l3-euryale-70b).

131K

$0.85

$0.85

Tools / functions · Web search

Aug 2024

Nous: Hermes 3 70B Instruct

Hermes 3 is a generalist language model with many improvements over [Hermes 2](/models/nousresearch/nous-hermes-2-mistral-7b-dpo), including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements acr...

131K

$0.7

$0.7

Web search

Aug 2024

Nous: Hermes 3 405B Instruct (free) · 🎁

Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the...

131K

-

-

Web search

Aug 2024

Sao10K: Llama 3 8B Lunaris

Lunaris 8B is a versatile generalist and roleplaying model based on Llama 3. It's a strategic merge of multiple models, designed to balance creativity with improved logic and general knowledge....

8K

$0.04

$0.05

Web search

Aug 2024

Meta: Llama 3.1 8B Instruct

Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 8B instruct-tuned version is fast and efficient. It has demonstrated strong performance compared to...

131K

$0.02

$0.03

Tools / functions · Web search

Jul 2024

Meta: Llama 3.1 70B Instruct

Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 70B instruct-tuned version is optimized for high-quality dialogue use cases. It has demonstrated strong...

131K

$0.4

$0.4

Tools / functions · Web search

Jul 2024

[Meta] Llama 3.1 · 8B Instant

Meta Llama 3.1 (8B params). Fast, cost-effective for high-volume tasks. 131K context and max output. ~560 t/s on Groq.

131K

$0.05

$0.08

Tools / functions

Jul 2024

Mistral: Mistral Nemo

A 12B parameter model with a 128k token context length built by Mistral in collaboration with NVIDIA. The model is multilingual, supporting English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese,...

131K

$0.02

$0.03

Tools / functions · Web search

Jul 2024

open-mistral-nemo-2407

Our best multilingual open source model released July 2024.

131K

$0.15

$0.15

Tools / functions

Jul 2024

open-mistral-nemo

Our best multilingual open source model released July 2024.

131K

$0.15

$0.15

Tools / functions

Jul 2024

GPT-4o Mini

Affordable model for fast, lightweight tasks. GPT-4o Mini is cheaper and more capable than GPT-3.5 Turbo.

128K

$0.15

$0.6

Vision · Tools / functions

Jul 2024

Google: Gemma 2 27B

Gemma 2 27B by Google is an open model built from the same research and technology used to create the [Gemini models](/models?q=gemini). Gemma models are well-suited for a variety of...

8K

$0.65

$0.65

Web search

Jul 2024

GPT-4o

deprecated

Original gpt-4o snapshot from May 13, 2024.

128K

$5

$15

Vision · Tools / functions

May 2024

Meta: Llama 3 8B Instruct

Meta's latest class of model (Llama 3) launched with a variety of sizes & flavors. This 8B instruct-tuned version was optimized for high-quality dialogue use cases. It has demonstrated strong...

8K

$0.14

$0.14

Web search

Apr 2024

Mistral: Mixtral 8x22B Instruct

Mistral's official instruct fine-tuned version of [Mixtral 8x22B](/models/mistralai/mixtral-8x22b). It uses 39B active parameters out of 141B, offering unparalleled cost efficiency for its size. Its strengths include: - strong math, coding,...

66K

$2

$6

Tools / functions · Web search

Apr 2024

WizardLM-2 8x22B

WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates highly competitive performance compared to leading proprietary models, and it consistently outperforms all existing state-of-the-art open-source models. It is...

66K

$0.62

$0.62

Web search

Apr 2024

GPT-4 Turbo

deprecated

GPT-4 Turbo with Vision model. Vision requests can now use JSON mode and function calling. gpt-4-turbo currently

128K

$10

$30

Vision · Tools / functions

Apr 2024

Anthropic: Claude 3 Haiku

Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant responsiveness. Quick and accurate targeted performance. See the launch announcement and benchmark results [here](https://www.anthropic.com/news/claude-3-haiku) #multimodal

200K

$0.25

$1.25

Vision · Tools / functions · Web search

Mar 2024

Mistral Large

This is Mistral AI's flagship model, Mistral Large 2 (version `mistral-large-2407`). It's a proprietary weights-available model and excels at reasoning, code, JSON, chat, and more. Read the launch announcement [here](https://mistral.ai/news/mistral-large-2407/)....

128K

$2

$6

Tools / functions · Web search

Feb 2024

V1 8K

Legacy V1 model with 8K context. Deprecated - use Kimi K2 Instruct instead.

8K

$0.2

$2

Tools / functions

Feb 2024

V1 32K

Legacy V1 model with 32K context. Deprecated - use Kimi K2 Instruct instead.

33K

$1

$3

Tools / functions

Feb 2024

V1 128K

Legacy V1 model with 128K context. Deprecated - use Kimi K2 Instruct instead.

131K

$2

$5

Tools / functions

Feb 2024

OpenAI: GPT-4 Turbo Preview

The preview GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Training data: up to Dec 2023. **Note:** heavily rate limited by OpenAI while...

128K

$10

$30

Tools / functions · Web search

Jan 2024

OpenAI: GPT-3.5 Turbo (older v0613)

GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks. Training data up to Sep 2021.

4K

$1

$2

Tools / functions · Web search

Jan 2024

3.5-Turbo

The latest GPT-3.5 Turbo model with higher accuracy at responding in requested formats.

16K

$0.5

$1.5

Tools / functions

Jan 2024

3.5-Turbo

deprecated

The latest GPT-3.5 Turbo model with higher accuracy at responding in requested formats.

16K

$0.5

$1.5

Tools / functions

Jan 2024

mistral-medium

Official mistral-medium-latest Mistral AI model

262K

$0.4

$2

Vision · Tools / functions

Dec 2023

Auto Router

Your prompt will be processed by a meta-model and routed to one of dozens of models (see below), optimizing for the best possible output. To see which model was used,...

2M

-

-

Vision · Reasoning · Tools / functions · Web search · Image output

Nov 2023

3.5-Turbo

deprecated

GPT-3.5 Turbo model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more.

16K

$1

$2

Tools / functions

Nov 2023

OpenAI: GPT-3.5 Turbo Instruct

This model is a variant of GPT-3.5 Turbo tuned for instructional prompts and omitting chat-related optimizations. Training data: up to Sep 2021.

4K

$1.5

$2

Web search

Sep 2023

OpenAI: GPT-3.5 Turbo 16k

This model offers four times the context length of gpt-3.5-turbo, allowing it to support approximately 20 pages of text in a single request at a higher cost. Training data: up...

16K

$3

$4

Tools / functions · Web search

Aug 2023

Mancer: Weaver (alpha)

An attempt to recreate Claude-style verbosity, but don't expect the same level of coherence or memory. Meant for use in roleplay/narrative situations.

8K

$0.75

$1

Web search

Aug 2023

ReMM SLERP 13B

A recreation trial of the original MythoMax-L2-B13 but with updated models. #merge

6K

$0.45

$0.65

Web search

Jul 2023

MythoMax 13B

One of the highest performing and most popular fine-tunes of Llama 2 13B, with rich descriptions and roleplay. #merge

4K

$0.06

$0.06

Web search

Jul 2023

GPT-4

deprecated

Snapshot of gpt-4 from June 13th 2023 with improved function calling support. Data up to Sep 2021.

8K

$30

$60

Tools / functions

Jun 2023

GPT-4

Snapshot of gpt-4 from June 13th 2023 with improved function calling support. Data up to Sep 2021.

8K

$30

$60

Tools / functions

Jun 2023

[?] Qwen Plus [latest]

Balanced quality, speed, and cost with hybrid thinking. 1M context.

1M

$0.4

$1.2

Reasoning · Tools / functions

-

Labs Leanstral 1 5 1

A mid- and post-trained version of Mistral Small 4 for Lean (260618 SFT)

262K

-

-

Vision · Tools / functions

-

labs-leanstral-1-5

A mid- and post-trained version of Mistral Small 4 for Lean (260618 SFT)

262K

-

-

Vision · Tools / functions

-

mistral-code-agent-latest

Official devstral-2512 Mistral AI model

262K

-

-

Tools / functions

-

mistral-code-fim-latest

Our cutting-edge language model for coding released August 2025.

256K

-

-

Tools / functions

-

mistral-code-latest

Our cutting-edge language model for coding released August 2025.

256K

-

-

Tools / functions

-

mistral-tiny-latest

Our best multilingual open source model released July 2024.

131K

-

-

Tools / functions

-

mistral-vibe-cli-fast

Mistral Small 4.

262K

-

-

Vision · Tools / functions

-

mistral-vibe-cli-with-tools

Official mistral-medium-latest Mistral AI model

262K

-

-

Vision · Tools / functions

-

Open Mistral Nemo

Our best multilingual open source model released July 2024.

131K

-

-

Tools / functions

-

Qvq Max

Alibaba model (not yet curated).

131K

-

-

Tools / functions

-

Qwen Coder Plus

Alibaba model (not yet curated).

131K

-

-

Tools / functions

-

Qwen Flash

Fast and very low cost with hybrid thinking. 1M context.

1M

$0.05

$0.4

Reasoning · Tools / functions

-

Qwen Max

Best quality of the stable commercial line. 32K context.

33K

$1.6

$6.4

Tools / functions

-

Qwen Turbo

Fastest and cheapest for simple tasks. 1M context.

1M

$0.05

$0.2

Tools / functions

-

Qwen Vl Max

Alibaba model (not yet curated).

131K

-

-

Tools / functions

-

Qwen Vl Plus

Alibaba model (not yet curated).

131K

-

-

Tools / functions

-

Qwen3 235b A22b Instruct 2507

Alibaba model (not yet curated).

131K

-

-

Tools / functions

-

Qwen3 Coder 480b A35b Instruct

Alibaba model (not yet curated).

131K

-

-

Tools / functions

-

Qwen3 Max Preview

Alibaba model (not yet curated).

131K

-

-

Tools / functions

-

Qwen3 Vl Flash 2025 10 15

Alibaba model (not yet curated).

131K

-

-

Tools / functions

-

Qwen3.5 Flash 2026 02 23

Alibaba model (not yet curated).

131K

-

-

Tools / functions

-

Qwq Plus 2025 03 05

Alibaba model (not yet curated).

131K

-

-

Tools / functions

-

Nano Banana 2 Lite

NEW
Jul 2026

Gemini 3.1 Flash Lite Image. (Version: 3.0, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,batchGenerateContent])

Vision · Tools / functions
131K · in $0.25 · out $1.5

Gemini Omni Flash Preview

NEW
Jul 2026

Gemini Omni Flash Preview (Version: 001, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens])

Vision · Tools / functions
197K · in - · out -

Gemma 4 31B (Preview)

NEW
Jun 2026

Google Gemma 4 31B on Cerebras - first multimodal model on wafer-scale inference (~1,850 tok/s). Vision (base64 PNG/JPEG, max 5 images / 10MB), function calling, reasoning (off by default, enable via effort). 131K context (65K free tier), 40K max output.

Vision · Reasoning · Tools / functions
131K · in $0.99 · out $1.49

Claude Sonnet 5

NEW
Jun 2026

Best combination of speed and intelligence, with the largest gains in coding and agentic tasks

Vision · Tools / functions · Web search
1M · in $2 · out $10

Kimi K2.7 Code (Alibaba)

NEW
Jun 2026

Moonshot Kimi K2.7 Code served via Alibaba Model Studio. Multimodal, always-on thinking, 256K context. (Alibaba pricing not yet published.)

Vision · Reasoning · Tools / functions
262K · in $0.95 · out $4

DeepSeek V4 Pro (Alibaba)

NEW
Jun 2026

DeepSeek V4 Pro served via Alibaba Model Studio (Alibaba pricing, ~5x DeepSeek-direct). 1M context, thinking.

Reasoning · Tools / functions
1M · in $2.4 · out $4.8

DeepSeek V3.2 (Alibaba)

NEW
Jun 2026

DeepSeek V3.2 served via Alibaba Model Studio (superseded by V4). Thinking.

Reasoning · Tools / functions
131K · in $0.57 · out $1.71

Sakana Fugu Ultra

NEW
Jun 2026

Multi-agent conductor system routing 1-3 expert agents for complex, multi-step reasoning - maximum answer quality on hard tasks. 1M context.

Vision · Reasoning · Tools / functions · Web search
1M · in $5 · out $30

Sakana Fugu

NEW
Jun 2026

Fast orchestration model routing tasks across a swappable pool of frontier LLMs - low latency, high quality. 1M context. Billed at the routed underlying model's standard rate.

Vision · Reasoning · Tools / functions · Web search
1M · in - · out -

Qwen3.6 Flash

NEW
Jun 2026

Fast, cost-effective multimodal model with 1M context, near-flagship quality, vision/video, and built-in tools.

Vision · Reasoning · Tools / functions
1M · in $0.25 · out $1.5

DeepSeek V4 Flash (Alibaba)

NEW
Jun 2026

DeepSeek V4 Flash served via Alibaba Model Studio. 1M context, thinking.

Reasoning · Tools / functions
1M · in $0.2 · out $0.4

[?] Qwen3.7 Max [preview]

NEW
Jun 2026

Flagship agent model with native extended thinking and 1M context. Text-only; strong at coding, productivity, and long-horizon autonomous tasks.

Reasoning · Tools / functions
1M · in $2.5 · out $7.5

[?] Qwen3.7 Max [2026 05 17]

NEW
Jun 2026

Flagship agent model with native extended thinking and 1M context. Text-only; strong at coding, productivity, and long-horizon autonomous tasks.

Reasoning · Tools / functions
1M · in $2.5 · out $7.5

Cohere: North Mini Code (free) · 🎁

NEW
Jun 2026

North Mini Code is Cohere's first agentic coding model and the debut of its North family. A sparse mixture-of-experts model with 30B total parameters and 3B active, it is optimized...

Reasoning · Tools / functions · Web search
256K · in - · out -

OpenRouter: Fusion

NEW
Jun 2026

Fusion turns your prompt into a small multi-model deliberation. A panel of expert models (see below) analyzes your prompt in parallel with web search and web fetch enabled, then a...

Web search
1M · in - · out -

GLM-5.2 (1M)

NEW
Jun 2026

Z.ai 1M-context flagship (744B MoE, 40B activated). Agentic coding with reasoning_effort control (high/max). 1M context, 128K output.

Reasoning · Tools / functions
1M · in $1.4 · out $4.4

Claude Fable 5

NEW
Jun 2026

Most capable widely released model for the most demanding reasoning and long-horizon agentic work

Vision · Reasoning · Tools / functions · Web search
1M · in $10 · out $50

Anthropic: Claude Fable Latest

NEW
Jun 2026

This model always redirects to the latest model in the Claude Fable family.

Vision · Reasoning · Tools / functions · Web search
1M · in $10 · out $50

Nex AGI: Nex-N2-Pro

NEW
Jun 2026

Nex-N2-Pro is an agentic mixture-of-experts model from Nex AGI, with 17B active parameters out of 397B total. Built on the Qwen3.5 architecture, it accepts text and image input and produces...

Vision · Reasoning · Web search
262K · in $0.25 · out $1

NVIDIA: Nemotron 3.5 Content Safety (free) · 🎁

NEW
Jun 2026

NVIDIA Nemotron 3.5 Content Safety is a compact 4B-parameter multimodal guardrail model from NVIDIA, fine-tuned from Google Gemma-3-4B. It moderates both inputs to and responses from LLMs and VLMs, accepting...

Vision · Reasoning · Web search
128K · in - · out -

NVIDIA: Nemotron 3 Ultra

NEW
Jun 2026

NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it...

Reasoning · Tools / functions · Web search
1M · in $0.5 · out $2.2

Qwen3.7 Plus

NEW
Jun 2026

Multimodal agent model with 1M context, native thinking, and vision/video understanding. Lower cost than Max.

Vision · Reasoning · Tools / functions
1M · in $0.4 · out $1.6

Kimi K2.7 Code Highspeed

NEW
Jun 2026

High-speed code variant with ~180 tok/s output (up to 260 in short contexts). Native multimodal with always-on thinking. 256K context.

Vision · Reasoning · Tools / functions
262K · in $1.9 · out $8

MiniMax: MiniMax M3

NEW
May 2026

MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a 1M-token context window, and is suited for long-horizon agentic work, coding,...

Vision · Reasoning · Tools / functions · Web search
1M · in $0.3 · out $1.2

StepFun: Step 3.7 Flash

NEW
May 2026

Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter language backbone with a vision encoder for native image and video understanding, activating roughly 11B parameters...

Vision · Reasoning · Tools / functions · Web search
256K · in $0.2 · out $1.15

Nano Banana Pro

NEW
May 2026

Gemini 3 Pro Image (Version: 3.0, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,batchGenerateContent])

Vision · Reasoning · Tools / functions · Web search · Image output
164K · in $2 · out $12
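
Several Gemini entries in this list report the defaults temperature=1, topP=0.95, topK=64. The sketch below illustrates one common way such parameters interact when picking the next token: scale by temperature, keep the top-k candidates, then keep the smallest prefix whose cumulative probability reaches top-p. This is an assumption about the general technique, not a description of how the Gemini API applies them; the function names `candidate_tokens` and `sample_token` are hypothetical.

```python
import math
import random


def candidate_tokens(logits, temperature=1.0, top_k=64, top_p=0.95):
    """Return (token_index, probability) pairs surviving top-k then top-p filtering.

    Assumes temperature > 0 and a non-empty list of logits.
    """
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep only the k most probable token indices.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append((i, probs[i]))
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept


def sample_token(logits, rng=random, **params):
    """Draw one token index from the filtered candidates, weighted by probability."""
    kept = candidate_tokens(logits, **params)
    indices = [i for i, _ in kept]
    weights = [p for _, p in kept]
    return rng.choices(indices, weights=weights, k=1)[0]


if __name__ == "__main__":
    # One dominant logit: only index 0 survives the top_p=0.95 cut.
    print(candidate_tokens([5.0, 1.0, 0.0, -1.0]))
```

Lowering `top_p` or `top_k` narrows the candidate set and makes output more deterministic; raising `temperature` flattens the distribution before filtering.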

Nano Banana 2

NEW
May 2026

Gemini 3.1 Flash Image (Version: 3.0, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,batchGenerateContent])

Vision · Reasoning · Tools / functions · Web search · Image output
131K · in $0.5 · out $3

Claude Opus 4.8

NEW
May 2026

Most capable Opus-tier model for complex reasoning and agentic coding

Vision · Tools / functions · Web search
1M · in $5 · out $25

Anthropic: Claude Opus 4.8 (Fast)

NEW
May 2026

Fast-mode variant of [Opus 4.8](/anthropic/claude-opus-4.8) - identical capabilities with higher output speed at 2x pricing relative to regular Opus 4.8. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode

Vision · Tools / functions · Web search
1M · in $10 · out $50
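
As a rough illustration of the pricing arithmetic, the sketch below estimates the cost of a single request from the listed prices and applies the 2x fast-mode multiplier described for Opus 4.8. It assumes the "in" and "out" figures are USD per 1M tokens, which this page does not state explicitly; `request_cost` is a hypothetical helper, not part of any provider SDK.

```python
# Assumption: the catalog's "in"/"out" figures are USD per 1M tokens.
PER_MILLION = 1_000_000


def request_cost(input_tokens, output_tokens, in_price, out_price, multiplier=1.0):
    """Estimate the USD cost of one request from per-1M-token list prices."""
    base = (input_tokens * in_price + output_tokens * out_price) / PER_MILLION
    return base * multiplier


# Listed prices for Claude Opus 4.8: $5 input, $25 output.
regular = request_cost(100_000, 10_000, in_price=5, out_price=25)
# Fast mode is described as 2x the regular Opus 4.8 price.
fast = request_cost(100_000, 10_000, in_price=5, out_price=25, multiplier=2.0)

print(f"regular: ${regular:.2f}, fast: ${fast:.2f}")
```

Applying the multiplier to the regular prices gives the same result as using the fast entry's listed $10 and $50 directly.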

xAI: Grok Build 0.1

NEW
May 2026

Grok Build 0.1 is xAI’s fast coding model trained specifically for agentic software engineering workflows. It supports text and image inputs with text output, and is optimized for interactive coding...

Vision · Reasoning · Tools / functions · Web search
256K · in $1 · out $2

Gemini 3.5 Flash

NEW
May 2026

Gemini 3.5 Flash (Version: 3.5-flash-05-2026, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

Vision · Reasoning · Tools / functions · Web search
1.1M · in $1.5 · out $9

Antigravity Agent Preview (2026-05)

NEW
May 2026

Preview release of Antigravity Agent (05-2026) (Version: 0.1, Defaults: temperature=undefined, topP=undefined, topK=undefined, interfaces=[generateContent,countTokens])

Vision · Reasoning
197K · in $1.5 · out $9

Qwen3 Coder Plus

May 2026

Agentic coding model with very long context. Tiered pricing by input length (up to 1M).

Tools / functions
1M · in $1 · out $5

Perceptron: Perceptron Mk1

May 2026

Perceptron Mk1 (Mark One) is Perceptron's highest-quality vision-language model for video and embodied reasoning. It accepts image and video inputs paired with natural language queries, and produces detailed visual understanding...

Vision · Reasoning · Web search
33K · in $0.15 · out $1.5

Anthropic: Claude Opus 4.7 (Fast)

May 2026

Fast-mode variant of [Opus 4.7](/anthropic/claude-opus-4.7) - identical capabilities with higher output speed at premium 6x pricing. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode

Vision · Tools / functions · Web search
1M · in $30 · out $150

Qwen3.6 27b

May 2026

Alibaba model (not yet curated).

Tools / functions
131K · in $0.6 · out $3

inclusionAI: Ring-2.6-1T

May 2026

Ring-2.6-1T is a 1T-parameter-scale thinking model with 63B active parameters, built for real-world agent workflows that require both strong capability and operational efficiency. It is optimized for coding agents, tool...

Reasoning · Tools / functions · Web search
262K · in $0.08 · out $0.63

Gemini 3.1 Flash-Lite

May 2026

Gemini 3.1 Flash Lite (Version: 3.1-flash-lite-05-2026, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

Vision · Reasoning · Tools / functions · Web search
1.1M · in $0.25 · out $1.5

OpenAI: GPT Chat Latest

May 2026

GPT Chat Latest

Vision · Tools / functions · Web search
400K · in $5 · out $30

xAI: Grok 4.3

Apr 2026

Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with text output, and is suited for agentic workflows, instruction-following tasks, and applications requiring high factual...

Vision · Reasoning · Tools / functions · Web search
1M · in $1.25 · out $2.5

mistral-medium-3.5

Apr 2026

Official mistral-medium-latest Mistral AI model

Vision · Tools / functions
262K · in $1.5 · out $7.5

IBM: Granite 4.1 8B

Apr 2026

Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from IBM, part of the Granite 4.1 family. It supports a 131K-token context window and is designed for enterprise tasks...

Tools / functions · Web search
131K · in $0.05 · out $0.1

Qwen3 VL Plus [2025 12 19]

Apr 2026

Current vision-language model with strong visual reasoning and thinking. Tiered pricing by input length (up to 256K).

Vision · Reasoning · Tools / functions
262K · in $0.2 · out $1.6

Poolside: Laguna XS.2 (free) · 🎁

Apr 2026

Laguna XS.2 is the second-generation model in the XS size class from [Poolside](https://poolside.ai/), their efficient coding agent series. It combines tool calling and reasoning capabilities with a compact footprint, offering...

Reasoning · Tools / functions · Web search
262K · in - · out -

Poolside: Laguna M.1

Apr 2026

Laguna M.1 is the flagship coding agent model from [Poolside](https://poolside.ai/), optimized for complex software engineering tasks. Designed for agentic coding workflows, it supports tool calling and reasoning, with a 256K...

Reasoning · Tools / functions · Web search
262K · in $0.2 · out $0.4

Owl Alpha · 🎁

Apr 2026

Owl Alpha is a high-performance foundation model designed for agentic workloads. It natively supports tool use and long-context tasks, with strong performance in code generation, automated workflows, and complex instruction execution....

Tools / functions · Web search
1M · in - · out -

NVIDIA: Nemotron 3 Nano Omni (free) · 🎁

Apr 2026

NVIDIA Nemotron™ 3 Nano Omni is a 30B-A3B open multimodal model designed to function as a perception and context sub-agent in enterprise agent systems. It accepts text, image, video, and...

Vision · Reasoning · Tools / functions · Web search
256K · in - · out -

mistral-medium-latest

Apr 2026

Official mistral-medium-latest Mistral AI model

Vision · Tools / functions
262K · in $1.5 · out $7.5

Mistral Medium (latest)

Apr 2026

Official mistral-medium-latest Mistral AI model

Vision · Tools / functions
262K · in $1.5 · out $7.5

Qwen3.6 Max Preview

Apr 2026

Alibaba model (not yet curated).

Tools / functions
131K · in $1.04 · out $6.24

Qwen3.6 35b A3b

Apr 2026

Alibaba model (not yet curated).

Tools / functions
131K · in $0.14 · out $1

Qwen3.5 Plus 2026 02 15

Apr 2026

Alibaba model (not yet curated).

Tools / functions
131K · in $0.3 · out $1.8

OpenAI GPT Mini Latest

Apr 2026

This model always redirects to the latest model in the OpenAI GPT Mini family.

Vision · Reasoning · Tools / functions · Web search
400K · in $0.75 · out $4.5

OpenAI GPT Latest

Apr 2026

This model always redirects to the latest model in the OpenAI GPT family.

Vision · Reasoning · Tools / functions · Web search
1.1M · in $5 · out $30

MoonshotAI Kimi Latest

Apr 2026

This model always redirects to the latest model in the MoonshotAI Kimi family.

Vision · Reasoning · Tools / functions · Web search
262K · in $0.55 · out $3.2

Google Gemini Pro Latest

Apr 2026

This model always redirects to the latest model in the Google Gemini Pro family.

Vision · Reasoning · Tools / functions · Web search
1M · in $2 · out $12

Google Gemini Flash Latest

Apr 2026

This model always redirects to the latest model in the Google Gemini Flash family.

Vision · Reasoning · Tools / functions · Web search
1M · in $1.5 · out $9

Anthropic Claude Sonnet Latest

Apr 2026

This model always redirects to the latest model in the Anthropic Claude Sonnet family.

Vision · Reasoning · Tools / functions · Web search
1M · in $2 · out $10

Anthropic Claude Haiku Latest

Apr 2026

This model always redirects to the latest model in the Anthropic Claude Haiku family.

Vision · Reasoning · Tools / functions · Web search
200K · in $1 · out $5

inclusionAI: Ling-2.6-1T

Apr 2026

Ling-2.6-1T is an instant (instruct) model from inclusionAI and the company’s trillion-parameter flagship, designed for real-world agents that require fast execution and high efficiency at scale. It uses a “fast...

Tools / functions · Web search
262K · in $0.08 · out $0.63

GPT-5.5 Pro

Apr 2026

Most capable model for complex tasks. Uses more compute for smarter, more precise responses on the hardest problems.

Vision · Reasoning · Tools / functions · Web search · Image output
1.1M · in $30 · out $180

GPT-5.5

Apr 2026

New baseline for complex production workflows. Stronger task execution, more precise tool use, more efficient reasoning with fewer tokens. 1M token context.

Vision · Reasoning · Tools / functions · Web search · Image output
1.1M · in $5 · out $30

Xiaomi: MiMo-V2.5-Pro

Apr 2026

MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in general agentic capabilities, complex software engineering, and long-horizon tasks, with top rankings on benchmarks such as ClawEval, GDPVal, and SWE-bench Pro....

Reasoning · Tools / functions · Web search
1M · in $0.44 · out $0.87

Xiaomi: MiMo-V2.5

Apr 2026

MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in multimodal perception across image and video understanding...

Vision · Reasoning · Tools / functions · Web search
1M · in $0.11 · out $0.28

Tencent: Hy3 preview

Apr 2026

Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agentic workflows and production use. It supports configurable reasoning levels across disabled, low, and high modes, allowing it to...

Reasoning · Tools / functions · Web search
262K · in $0.06 · out $0.21

Pareto Code Router

Apr 2026

The Pareto Router maintains a tiered shortlist of strong coding models, ranked by [Artificial Analysis](https://artificialanalysis.ai/) coding percentiles. Set min_coding_score between 0 and 1 on the [pareto-router plugin](https://openrouter.ai/docs/guides/routing/routers/pare...

Web search
2M · in - · out -

OpenAI: GPT-5.4 Image 2

Apr 2026

[GPT-5.4](https://openrouter.ai/openai/gpt-5.4) Image 2 combines OpenAI's GPT-5.4 model with state-of-the-art image generation capabilities from GPT Image 2. It enables rich multimodal workflows, allowing users to seamlessly move between reasoning, coding, and...

Vision · Reasoning · Web search · Image output
272K · in $8 · out $15

inclusionAI: Ling-2.6-flash

Apr 2026

Ling-2.6-flash is an instant (instruct) model from inclusionAI with 104B total parameters and 7.4B active parameters, designed for real-world agents that require fast responses, strong execution, and high token efficiency....

Tools / functions · Web search
262K · in $0.01 · out $0.03

Deep Research Preview (2026-04)

Apr 2026

Preview release (April 21st, 2026) of Deep Research (Version: deepthink-exp-05-20, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens])

Vision · Reasoning
197K · in $1.25 · out $10

Deep Research Max Preview (2026-04)

Apr 2026

Preview release (April 21st, 2026) of Deep Research Max (Version: deepthink-exp-05-20, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens])

Vision · Reasoning
197K · in $1.25 · out $10

Anthropic: Claude Opus Latest

Apr 2026

This model always redirects to the latest model in the Claude Opus family.

Vision · Reasoning · Tools / functions · Web search
1M · in $5 · out $25

Kimi K2.6

Apr 2026

Native multimodal flagship (text, image, video inputs) with thinking and non-thinking modes. Stronger long-form coding, improved instruction compliance and self-correction. 256K context.

Vision · Tools / functions
262K · in $0.95 · out $4

Claude Opus 4.7

Apr 2026

Previous most capable model for complex reasoning and agentic coding

Vision · Tools / functions · Web search
1M · in $5 · out $25

Gemini 3.1 Flash TTS Preview

Apr 2026

Gemini 3.1 Flash TTS Preview (Version: 3.1-flash-tts-preview, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,batchGenerateContent])

Vision · Audio output
25K · in $1 · out -

Gemini Robotics-ER 1.6 Preview

Apr 2026

Gemini Robotics-ER 1.6 Preview (Version: 1.6-preview, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

Vision · Reasoning · Tools / functions
197K · in $1 · out $5

GLM-5.1

Apr 2026

Z.ai flagship (744B MoE, 40B activated). Post-training upgrade over GLM-5 with stronger coding and long-horizon task autonomy. 200K context, thinking mode.

Reasoning · Tools / functions
205K · in $1.4 · out $4.4

Qwen3.6 Plus

Apr 2026

Alibaba model (not yet curated).

Tools / functions
131K · in $0.33 · out $1.95

Gemma 4 31B IT

Apr 2026

Gemma 4 31B IT (Version: 001, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens])

Tools / functions
295K · in - · out -

Gemma 4 26B A4B IT

Apr 2026

Gemma 4 26B A4B IT (Version: 001, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens])

Tools / functions
295K · in - · out -

GLM-5V Turbo

Apr 2026

First multimodal GLM-5 model. Vision-based coding agent with image/video/file inputs. 200K context, 128K output, thinking mode.

Vision · Reasoning · Tools / functions
205K · in $1.2 · out $4

Arcee AI: Trinity Large Thinking

Apr 2026

Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI. It shows strong performance in PinchBench, agentic workloads, and reasoning tasks. Launch video: https://youtu.be/Gc82AXLa0Rg?si=4RLn6WBz33qT--B7...

Reasoning · Tools / functions · Web search
262K · in $0.25 · out $0.8

xAI: Grok 4.20 Multi-Agent

Mar 2026

Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesize information...

Vision · Reasoning · Web search
2M · in $1.25 · out $2.5

xAI: Grok 4.20

Mar 2026

Grok 4.20 is a reasoning model from xAI with industry-leading speed and agentic tool calling capabilities. It combines the lowest hallucination rate on the market with strict prompt adherence, delivering...

Vision · Reasoning · Tools / functions · Web search
2M · in $1.25 · out $2.5

Google: Lyria 3 Pro Preview · 🎁

Mar 2026

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48kHz...

Vision · Web search · Audio output
1M · in - · out -

Google: Lyria 3 Clip Preview · 🎁

Mar 2026

30 second duration clips are priced at $0.04 per clip. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate...

Vision · Web search · Audio output
1M · in - · out -

Kwaipilot: KAT-Coder-Pro V2

Mar 2026

KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder series, designed for complex enterprise-grade software engineering and SaaS integration. It builds on the agentic coding strengths of earlier versions,...

Tools / functions · Web search
256K · in $0.3 · out $1.2

Reka Edge

Mar 2026

Reka Edge is an extremely efficient 7B multimodal vision-language model that accepts image/video+text inputs and generates text outputs. This model is optimized specifically to deliver industry-leading performance in image understanding,...

Vision · Tools / functions · Web search
16K · in $0.1 · out $0.1

MiniMax: MiniMax M2.7

Mar 2026

MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous improvement. Built to actively participate in its own evolution, M2.7 integrates advanced agentic capabilities through multi-agent...

Reasoning · Tools / functions · Web search
205K · in $0.18 · out $0.72

GPT-5.4 Nano

Mar 2026

Cheapest GPT-5.4-class model for simple high-volume tasks like classification and data extraction.

Vision · Reasoning · Tools / functions · Web search · Image output
400K · in $0.2 · out $1.25

GPT-5.4 Mini

Mar 2026

Strongest mini model for coding, computer use, and subagents. GPT-5.4-class intelligence at lower cost and latency.

Vision · Reasoning · Tools / functions · Web search · Image output
400K · in $0.75 · out $4.5

mistral-small-latest

Mar 2026

Mistral Small 4.

Vision · Tools / functions
262K · in $0.15 · out $0.6

Mistral Small (2603)

Mar 2026

Mistral Small 4.

Vision · Tools / functions
262K · in $0.15 · out $0.6

Leanstral (2603)

Mar 2026

A mid- and post-trained version of Mistral Small 4 for Lean

Vision · Tools / functions
197K · in - · out -

GLM-5 Turbo

Mar 2026

Speed-optimized GLM-5 variant for agent workflows. Enhanced tool invocation and long-chain execution. 200K context, thinking mode.

Reasoning · Tools / functions
205K · in $1.2 · out $4

NVIDIA: Nemotron 3 Super (free) · 🎁

Mar 2026

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer...

Reasoning · Tools / functions · Web search
1M · in - · out -

Qwen: Qwen3.5-9B

Mar 2026

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-parameter architecture. It uses a unified vision-language design...

Vision · Reasoning · Tools / functions · Web search
262K · in $0.1 · out $0.15

ByteDance Seed: Seed-2.0-Lite

Mar 2026

Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers strong multimodal and agent capabilities while offering noticeably lower latency, making it a practical default choice for most production workloads across...

Vision · Reasoning · Tools / functions · Web search
262K · in $0.25 · out $2

GPT-5.4 Pro

Mar 2026

Most capable model for complex tasks. Uses more compute for smarter, more precise responses on difficult problems.

Vision · Reasoning · Tools / functions · Web search · Image output
1.1M · in $30 · out $180

GPT-5.4

Mar 2026

Most capable and efficient frontier model for professional work. Native computer use, improved reasoning, coding, and agentic workflows with 1M token context.

Vision · Reasoning · Tools / functions · Web search · Image output
1.1M · in $2.5 · out $15

Inception: Mercury 2

Mar 2026

Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving...

Reasoning · Tools / functions · Web search
128K · in $0.25 · out $0.75

OpenAI: GPT-5.3 Chat

Mar 2026

GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations smoother, more useful, and more directly helpful. It delivers more accurate answers with better contextualization and significantly...

Vision · Tools / functions · Web search
128K · in $1.75 · out $14

GPT-5.3 Instant

deprecated
Mar 2026

GPT-5.3 Instant model, previously powering ChatGPT. Replaced by GPT-5.5 Instant.

Vision · Tools / functions · Web search · Image output
128K · in $1.75 · out $14

Gemini 3.1 Flash-Lite Preview

Mar 2026

Gemini 3.1 Flash Lite Preview (Version: 3.1-flash-lite-preview-03-2026, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

Vision · Reasoning · Tools / functions · Web search
1.1M · in $0.25 · out $1.5

Nano Banana 2 Preview

Feb 2026

Gemini 3.1 Flash Image Preview (Version: 3.0, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,batchGenerateContent])

Vision · Reasoning · Tools / functions · Web search · Image output
131K · in $0.5 · out $3

ByteDance Seed: Seed-2.0-Mini

Feb 2026

Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive scenarios, emphasizing fast response and flexible inference deployment. It delivers performance comparable to ByteDance-Seed-1.6, supports 256k context, four reasoning effort modes (minimal/low/mediu...

Vision · Reasoning · Tools / functions · Web search
262K · in $0.1 · out $0.4

Qwen3.5 35b A3b

Feb 2026

Alibaba model (not yet curated).

Tools / functions
131K · in $0.14 · out $1

Qwen3.5 27b

Feb 2026

Alibaba model (not yet curated).

Tools / functions
131K · in $0.2 · out $1.56

Qwen3.5 122b A10b

Feb 2026

Alibaba model (not yet curated).

Tools / functions
131K · in $0.26 · out $2.08

Qwen: Qwen3.5-Flash

Feb 2026

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the...

Vision · Reasoning · Tools / functions · Web search
1M · in $0.07 · out $0.26

LiquidAI: LFM2-24B-A2B

Feb 2026

LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment. Built as a 24B parameter Mixture-of-Experts model with only 2B active parameters per...

Web search
128K · in $0.03 · out $0.12

GPT Audio 1.5

Feb 2026

Best voice model for audio in, audio out with Chat Completions. Accepts audio input and produces audio output.

Audio output
128K · in $2.5 · out $10

AionLabs: Aion-2.0

Feb 2026

Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and storytelling. It is particularly strong at introducing tension, crises, and conflict into stories, making narratives feel more engaging....

Reasoning · Web search
131K · in $0.8 · out $1.6

Gemini 3.1 Pro Preview (Custom Tools)

Feb 2026

Gemini 3.1 Pro Preview optimized for custom tool usage (Version: 3.1-pro-preview-01-2026, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

Vision · Reasoning · Tools / functions · Web search
1.1M · in $2 · out $12

Gemini 3.1 Pro Preview

Feb 2026

Gemini 3.1 Pro Preview (Version: 3.1-pro-preview-01-2026, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

Vision · Reasoning · Tools / functions · Web search
1.1M · in $2 · out $12

Claude Sonnet 4.6

Feb 2026

Best combination of speed and intelligence for everyday tasks

Vision · Tools / functions · Web search
1M · in $3 · out $15

Qwen3.5 397b A17b

Feb 2026

Alibaba model (not yet curated).

Tools / functions
131K · in $0.39 · out $2.45

Qwen: Qwen3.5 Plus 2026-02-15

Feb 2026

The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference efficiency. In a variety of...

Vision · Reasoning · Tools / functions · Web search
1M · in $0.26 · out $1.56

MiniMax: MiniMax M2.5

Feb 2026

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex real-world digital working environments, M2.5 builds upon the coding expertise of M2.1...

Reasoning · Tools / functions · Web search
205K · in $0.12 · out $0.48

GLM-5

Feb 2026

Z.ai flagship foundation model (744B MoE, 40B activated). Designed for Agentic Engineering with SOTA coding and agent capabilities. 200K context, thinking mode.

Reasoning · Tools / functions
205K · in $1 · out $3.2

Qwen: Qwen3 Max Thinking

Feb 2026

Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that require deep, multi-step reasoning. By significantly scaling model capacity and reinforcement learning compute, it...

Reasoning · Tools / functions · Web search
262K · in $0.78 · out $3.9

GPT-5.3 Codex

Feb 2026

Most capable agentic coding model. Combines frontier coding performance of GPT-5.2-Codex with reasoning and professional knowledge of GPT-5.2. ~25% faster.

Vision · Reasoning · Tools / functions · Web search · Image output
400K · in $1.75 · out $14

Claude Opus 4.6

Feb 2026

Previous most intelligent model for complex agents and coding, with adaptive thinking

Vision · Tools / functions · Web search
1M · in $5 · out $25

Qwen3 Coder Next

Feb 2026

Alibaba model (not yet curated).

Tools / functions
131K · in $0.11 · out $0.8

GLM-OCR (Vision, OCR)

Feb 2026

Specialized OCR model for text extraction from images and documents.

Vision
131K · in $0.03 · out $0.03

Free Models Router · 🎁

Feb 2026

The simplest way to get free inference. openrouter/free is a router that selects free models at random from the models available on OpenRouter. The router smartly filters for models that...

Vision · Reasoning · Tools / functions · Web search
200K · in - · out -

StepFun: Step 3.5 Flash

Jan 2026

Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on a sparse Mixture of Experts (MoE) architecture, it selectively activates only 11B of its 196B parameters per token....

Reasoning · Tools / functions · Web search
262K · in $0.1 · out $0.3

Upstage: Solar Pro 3

Jan 2026

Solar Pro 3 is Upstage's powerful Mixture-of-Experts (MoE) language model. With 102B total parameters and 12B active parameters per forward pass, it delivers exceptional performance while maintaining computational efficiency. Optimized...

Reasoning · Tools / functions · Web search
128K · in $0.15 · out $0.6

Kimi K2.5

Jan 2026

Supports vision (images/videos), thinking mode, and Agent tasks. 256K context.

Vision · Tools / functions
262K · in $0.6 · out $3

MiniMax: MiniMax M2-her

Jan 2026

MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, character-driven chat, and expressive multi-turn conversations. Designed to stay consistent in tone and personality, it supports rich message...

Web search
66K · in $0.3 · out $1.2

Writer: Palmyra X5

Jan 2026

Palmyra X5 is Writer's most advanced model, purpose-built for building and scaling AI agents across the enterprise. It delivers industry-leading speed and efficiency on context windows up to 1 million...

Web search
1M · in $0.6 · out $6

LiquidAI: LFM2.5-1.2B-Thinking (free) · 🎁

Jan 2026

LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agentic tasks, data extraction, and RAG—while still running comfortably on edge devices. It supports long context (up to 32K tokens) and is...

Reasoning · Tools / functions · Web search
33K · in - · out -

LiquidAI: LFM2.5-1.2B-Instruct (free) · 🎁

Jan 2026

LFM2.5-1.2B-Instruct is a compact, high-performance instruction-tuned model built for fast on-device AI. It delivers strong chat quality in a 1.2B parameter footprint, with efficient edge inference and broad runtime support.

Web search
33K · in - · out -

GLM-4.7 FlashX

Jan 2026

Fast GLM-4.7 variant with priority routing and higher concurrency. Same model as Flash, better infrastructure.

Reasoning · Tools / functions
131K · in $0.07 · out $0.4

GLM-4.7 Flash (Free)

Jan 2026

Free GLM-4.7 variant. Same model as FlashX but with limited concurrency (1 concurrent request) and lower priority.

Reasoning · Tools / functions
131K · in - · out -

Z.ai GLM 4.7 (Preview)

Jan 2026

Z.ai GLM 4.7 (355B) on Cerebras (~1,000 tok/s). Strong agentic coding, advanced reasoning (on by default), superior tool use. 131K context, 40K max output.

Reasoning · Tools / functions
131K · in $2.25 · out $2.75

MiniMax: MiniMax M2.1

Dec 2025

MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world...

Reasoning · Tools / functions · Web search
205K · in $0.29 · out $0.95

ByteDance Seed: Seed 1.6 Flash

Dec 2025

Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporting both text and visual understanding. It features a 256k context window and can generate outputs of...

Vision · Reasoning · Tools / functions · Web search
262K · in $0.08 · out $0.3

ByteDance Seed: Seed 1.6

Dec 2025

Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates multimodal capabilities and adaptive deep thinking with a 256K context window.

Vision · Reasoning · Tools / functions · Web search
262K · in $0.25 · out $2

GLM-4.7

Dec 2025

Latest-gen GLM model with 128K context. Thinking mode activated by default.

Reasoning · Tools / functions
131K · in $0.6 · out $2.2

Gemini 3 Flash Preview

Dec 2025

Gemini 3 Flash Preview (Version: 3-flash-preview-12-2025, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

Vision · Reasoning · Tools / functions · Web search
1.1M · in $0.5 · out $3

GPT Audio Mini

Dec 2025

Cost-efficient audio model. Accepts audio input and produces audio output via the Chat Completions REST API.

Audio output
128K · in $0.6 · out $2.4

NVIDIA: Nemotron 3 Nano 30B A3B

Dec 2025

NVIDIA Nemotron 3 Nano 30B A3B is a small MoE language model with the highest compute efficiency and accuracy for developers building specialized agentic AI systems. The model is fully...

Reasoning · Tools / functions · Web search
262K · in $0.05 · out $0.2

OpenAI: GPT-5.2 Chat

Dec 2025

GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “think” on...

Vision · Tools / functions · Web search
128K · in $1.75 · out $14

GPT-5.2 Pro

Dec 2025

Smartest and most trustworthy option for difficult questions. Uses more compute for harder thinking on complex domains like programming.

Vision · Reasoning · Tools / functions · Web search · Image output
400K · in $21 · out $168

GPT-5.2 Instant

deprecated
Dec 2025

GPT-5.2 Instant model, previously powering ChatGPT. Replaced by GPT-5.5 Instant.

Vision · Tools / functions · Web search · Image output
128K · in $1.75 · out $14

GPT-5.2 Codex

deprecated
Dec 2025

GPT-5.2 optimized for long-horizon, agentic coding tasks in Codex or similar environments. Supports low, medium, high, and xhigh reasoning effort settings.

Vision · Reasoning · Tools / functions · Web search · Image output
400K · in $1.75 · out $14

GPT-5.2

Dec 2025

Most capable model for professional work and long-running agents. Improvements in general intelligence, long-context, agentic tool-calling, and vision.

Vision · Reasoning · Tools / functions · Web search · Image output
400K · in $1.75 · out $14

Deep Research Pro Preview

Dec 2025

Preview release (December 12th, 2025) of Deep Research Pro (Version: deepthink-exp-05-20, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens])

Vision · Reasoning
197K · in $1.25 · out $10

AutoGLM Phone

Dec 2025

Mobile phone automation agent. Understands phone screens via multimodal perception and executes automated operations.

Vision
131K · in - · out -

Devstral 2 (latest)

Dec 2025

Official devstral-2512 Mistral AI model

Tools / functions
262K · in $0.4 · out $2

Devstral 2 (latest)

Dec 2025

Official mistral-medium-latest Mistral AI model

Vision · Tools / functions
262K · in $0.4 · out $2

Devstral 2 (2512)

Dec 2025

Official devstral-2512 Mistral AI model

Tools / functions
262K · in $0.4 · out $2

Relace: Relace Search

Dec 2025

The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to explore a codebase and return relevant files to the user request. In contrast to RAG, relace-search performs agentic...

Tools / functions · Web search
256K · in $1 · out $3

GLM-4.6 V FlashX

Dec 2025

Fast vision GLM-4.6 with priority routing and higher concurrency. Image/video/file inputs, 32K output.

Vision · Reasoning · Tools / functions
131K · in $0.04 · out $0.4

GLM-4.6 V Flash (Free)

Dec 2025

Free vision GLM-4.6. Same model as FlashX but with limited concurrency (1 concurrent request). Image/video/file inputs, 32K output.

Vision · Reasoning · Tools / functions
131K · in - · out -

GLM-4.6 V

Dec 2025

Vision-enabled GLM-4.6 model. Supports image/video/file inputs, 32K output, hybrid thinking.

Vision · Reasoning · Tools / functions
131K · in $0.3 · out $0.9

Body Builder (beta)

Dec 2025

Transform your natural language requests into structured OpenRouter API request objects. Describe what you want to accomplish with AI models, and Body Builder will construct the appropriate API calls. Example:...

Web search
128K · in - · out -

mistral-large-latest

Dec 2025

Official mistral-large-2512 Mistral AI model

Vision · Tools / functions
262K · in $0.5 · out $1.5

Mistral Large (2512)

Dec 2025

Official mistral-large-2512 Mistral AI model

Vision · Tools / functions
262K · in $0.5 · out $1.5

ministral-8b-latest

Dec 2025

Ministral 3 (a.k.a. Tinystral) 8B Instruct.

Vision · Tools / functions
262K · in $0.15 · out $0.15

ministral-3b-latest

Dec 2025

Ministral 3 (a.k.a. Tinystral) 3B Instruct.

Vision · Tools / functions
131K · in $0.1 · out $0.1

ministral-14b-latest

Dec 2025

Ministral 3 (a.k.a. Tinystral) 14B Instruct.

Vision · Tools / functions
262K · in $0.2 · out $0.2

Ministral 8b (2512)

Dec 2025

Ministral 3 (a.k.a. Tinystral) 8B Instruct.

Vision · Tools / functions
262K · in $0.15 · out $0.15

Ministral 3b (2512)

Dec 2025

Ministral 3 (a.k.a. Tinystral) 3B Instruct.

Vision · Tools / functions
131K · in $0.1 · out $0.1

Ministral 14b (2512)

Dec 2025

Ministral 3 (a.k.a. Tinystral) 14B Instruct.

Vision · Tools / functions
262K · in $0.2 · out $0.2

Amazon: Nova 2 Lite

Dec 2025

Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos to generate text. Nova 2 Lite demonstrates standout capabilities in processing...

Vision · Reasoning · Tools / functions · Web search
1M · in $0.3 · out $2.5

Arcee AI: Trinity Mini

Dec 2025

Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 128 experts with 8 active per token. Engineered for efficient reasoning over long contexts (131k) with robust function...

Reasoning · Tools / functions · Web search
131K · in $0.05 · out $0.15

Claude Opus 4.5

Nov 2025

Previous most intelligent model with advanced reasoning for complex agentic workflows

Vision · Tools / functions · Web search
200K · in $5 · out $25

AllenAI: Olmo 3 32B Think

Nov 2025

Olmo 3 32B Think is a large-scale, 32-billion-parameter model purpose-built for deep reasoning, complex logic chains and advanced instruction-following scenarios. Its capacity enables strong performance on demanding evaluation tasks and...

Reasoning · Web search
66K · in $0.15 · out $0.5

Nano Banana Pro Preview

Nov 2025

Gemini 3 Pro Image Preview (Version: 3.0, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,batchGenerateContent])

Vision · Reasoning · Tools / functions · Web search · Image output
164K · in $2 · out $12

Nano Banana Pro

Nov 2025

Gemini 3 Pro Image Preview (Version: 3.0, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,batchGenerateContent])

Vision · Reasoning · Tools / functions · Web search · Image output
164K · in $2 · out $12

GPT-5.1 Codex Max

deprecated
Nov 2025

Our most intelligent coding model optimized for long-horizon, agentic coding tasks.

Vision · Reasoning · Tools / functions · Web search · Image output
400K · in $1.25 · out $10

GPT-5.1 Codex Mini

deprecated
Nov 2025

Smaller, faster version of GPT-5.1 Codex for efficient coding tasks.

Vision · Reasoning · Tools / functions · Web search · Image output
400K · in $0.25 · out $2

GPT-5.1 Codex

deprecated
Nov 2025

A version of GPT-5.1 optimized for agentic coding tasks in Codex or similar environments.

Vision · Reasoning · Tools / functions · Web search · Image output
400K · in $1.25 · out $10

GPT-5.1

Nov 2025

The best model for coding and agentic tasks with configurable reasoning effort.

Vision · Reasoning · Tools / functions · Web search · Image output
400K · in $1.25 · out $10

Deep Cogito: Cogito v2.1 671B

Nov 2025

Cogito v2.1 671B MoE represents one of the strongest open models globally, matching performance of frontier closed and open models. This model is trained using self play with reinforcement learning...

Reasoning · Web search
128K · in $1.25 · out $1.25

OpenAI: GPT-5.1 Chat

Nov 2025

GPT-5.1 Chat (AKA Instant) is the fast, lightweight member of the 5.1 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “think” on...

Vision · Reasoning · Tools / functions · Web search
128K · in $1.25 · out $10

GPT-5.1 Instant

deprecated
Nov 2025

GPT-5.1 Instant with adaptive reasoning. More conversational with improved instruction following.

Vision · Reasoning · Tools / functions · Web search · Image output
128K · in $1.25 · out $10

MoonshotAI: Kimi K2 Thinking

Nov 2025

Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trillion-parameter Mixture-of-Experts (MoE) architecture introduced in...

Reasoning · Tools / functions · Web search
262K · in $0.6 · out $2.5

Amazon: Nova Premier 1.0

Oct 2025

Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models.

Vision · Tools / functions · Web search
1M · in $2.5 · out $12.5

Perplexity: Sonar Pro Search

Oct 2025

Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity's most advanced agentic search system. It is designed for deeper reasoning and analysis. Pricing is based...

Vision · Reasoning · Web search
200K · in $3 · out $15

Mistral: Voxtral Small 24B 2507

Oct 2025

Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translation and audio understanding. Input audio...

Tools / functions · Web search
32K · in $0.1 · out $0.3

[OpenAI] GPT OSS Safeguard 20B (Preview)

Oct 2025

OpenAI safety classification model (20B MoE). Purpose-built for content moderation with Harmony response format. 131K context, 65K max output. ~1000 t/s on Groq.

Tools / functions
131K · in $0.08 · out $0.3

NVIDIA: Nemotron Nano 12B 2 VL (free) · 🎁

Oct 2025

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and document intelligence. It introduces a hybrid Transformer-Mamba architecture, combining transformer-level accuracy with Mamba’s...

Vision · Reasoning · Tools / functions · Web search
128K · in - · out -

Qwen: Qwen3 VL 32B Instruct

Oct 2025

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text...

Vision · Tools / functions · Web search
262K · in $0.1 · out $0.42

MiniMax: MiniMax M2

Oct 2025

MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. With 10 billion activated parameters (230 billion total), it delivers near-frontier intelligence across general reasoning,...

Reasoning · Tools / functions · Web search
205K · in $0.26 · out $1

IBM: Granite 4.0 Micro

Oct 2025

Granite-4.0-H-Micro is a 3B-parameter model from the Granite 4 family of models. These models are the latest in a series of models released by IBM. They are fine-tuned for long...

Web search
131K · in $0.02 · out $0.11

Microsoft: Phi 4 Mini Instruct

Oct 2025

Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites - with a focus on high-quality, reasoning dense data. The model belongs to the Phi-4...

Web search
131K · in $0.08 · out $0.35

Claude Haiku 4.5

Oct 2025

Fastest model with exceptional speed and performance

Vision · Tools / functions · Web search
200K · in $1 · out $5

Qwen: Qwen3 VL 8B Thinking

Oct 2025

Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visual and textual reasoning across complex scenes, documents, and temporal sequences. It integrates enhanced multimodal alignment and...

Vision · Reasoning · Tools / functions · Web search
256K · in $0.12 · out $1.37

Qwen: Qwen3 VL 8B Instruct

Oct 2025

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal fusion with Interleaved-MRoPE for long-horizon...

Vision · Tools / functions · Web search
256K · in $0.08 · out $0.5

GPT-5 Search API

Oct 2025

Updated web search model in Chat Completions API. 60% cheaper with domain filtering support.

Vision · Web search
400K · in $1.25 · out $10

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

Oct 2025

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG, tool calling) via SFT across math, code, science, and...

Reasoning · Tools / functions · Web search
131K · in $0.4 · out $0.4

Gemini 2.5 Computer Use Preview 10-2025

Oct 2025

Gemini 2.5 Computer Use Preview 10-2025 (Version: Gemini 2.5 Computer Use Preview 10-2025, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens])

Vision · Reasoning · Tools / functions
197K · in $1.25 · out $10

Qwen: Qwen3 VL 30B A3B Thinking

Oct 2025

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Thinking variant enhances reasoning in STEM, math, and complex tasks. It excels...

Vision · Reasoning · Tools / functions · Web search
131K · in $0.13 · out $1.56

Qwen: Qwen3 VL 30B A3B Instruct

Oct 2025

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general multimodal tasks. It excels in perception...

Vision · Tools / functions · Web search
262K · in $0.13 · out $0.52

GPT-5 Pro

Oct 2025

Version of GPT-5 that uses more compute to produce smarter and more precise responses. Designed for tough problems.

Vision · Reasoning · Tools / functions · Web search · Image output
400K · in $15 · out $120

Nano Banana

Oct 2025

Gemini 2.5 Flash Preview Image (Version: 2.0, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,batchGenerateContent])

Vision · Image output
66K · in $0.3 · out $2.5

GLM-4.6

Sep 2025

GLM-4.6 model with 128K context/output. Hybrid thinking: auto-determines whether to engage deep reasoning.

Reasoning · Tools / functions
131K · in $0.6 · out $2.2

DeepSeek: DeepSeek V3.2 Exp

Sep 2025

DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectures. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism...

Reasoning · Tools / functions · Web search
164K · in $0.27 · out $0.41

Claude Sonnet 4.5

Sep 2025

Previous best combination of speed and intelligence for complex agents and coding

Vision · Tools / functions · Web search
200K · in $3 · out $15

TheDrummer: Cydonia 24B V4.1

Sep 2025

Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, prompt adherence, and intelligence.

Web search
131K · in $0.3 · out $0.5

Relace: Relace Apply 3

Sep 2025

Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits straight into your source files. It can apply updates from GPT-4o, Claude, and others into your files at...

Web search
256K · in $0.85 · out $1.25

Google: Gemini 2.5 Flash Lite Preview 09-2025

Sep 2025

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...

Vision · Reasoning · Tools / functions · Web search
1M · in $0.1 · out $0.4

Qwen3 Vl 235b A22b Thinking

Sep 2025

Alibaba model (not yet curated).

Tools / functions
131K · in $0.26 · out $2.6

Qwen3 Vl 235b A22b Instruct

Sep 2025

Alibaba model (not yet curated).

Tools / functions
131K · in $0.2 · out $0.88

Qwen3 Max

Sep 2025

Alibaba model (not yet curated).

Tools / functions
131K · in $0.78 · out $3.9

DeepSeek: DeepSeek V3.1 Terminus

Sep 2025

DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that maintains the model's original capabilities while addressing issues reported by users, including language consistency and agent capabilities, further optimizing the model's...

Reasoning · Tools / functions · Web search
164K · in $0.27 · out $0.95

Qwen3 Coder Flash

Sep 2025

Alibaba model (not yet curated).

Tools / functions
131K · in $0.2 · out $0.98

magistral-small-latest

Sep 2025

Mistral Small 4.

Vision · Reasoning · Tools / functions
262K · in $0.5 · out $1.5

magistral-medium-latest

Sep 2025

Our frontier-class reasoning model release candidate September 2025.

Vision · Reasoning · Tools / functions
131K · in $2 · out $5

Magistral Small (2509)

Sep 2025

Our efficient reasoning model released September 2025.

Vision · Reasoning · Tools / functions
131K · in $0.5 · out $1.5

Magistral Medium (2509)

Sep 2025

Our frontier-class reasoning model release candidate September 2025.

Vision · Reasoning · Tools / functions
131K · in $2 · out $5

GPT-5 Codex

deprecated
Sep 2025

A version of GPT-5 optimized for agentic coding in Codex.

Vision · Reasoning · Tools / functions · Web search
400K · in $1.25 · out $10

Qwen3 Next 80b A3b Thinking

Sep 2025

Alibaba model (not yet curated).

Tools / functions
131K · in $0.1 · out $0.78

Qwen3 Next 80b A3b Instruct

Sep 2025

Alibaba model (not yet curated).

Tools / functions
131K · in $0.09 · out $1.1

Qwen Plus

Sep 2025

Balanced quality, speed, and cost with hybrid thinking. 1M context.

Reasoning · Tools / functions
1M · in $0.4 · out $1.2

NVIDIA: Nemotron Nano 9B V2 (free) · 🎁

Sep 2025

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and...

Reasoning · Tools / functions · Web search
128K · in - · out -

MoonshotAI: Kimi K2 0905

Sep 2025

Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32...

Tools / functions · Web search
262K · in $0.6 · out $2.5

[Groq] Compound Mini (Agentic System)

Sep 2025

Lighter Groq agentic AI with web search, code execution. Pricing based on underlying model usage.

Tools / functions
131K · in - · out -

[Groq] Compound (Agentic System)

Sep 2025

Groq agentic AI with web search, code execution, browser automation. Uses GPT-OSS 120B, Llama 4 Scout, Llama 3.3 70B. Pricing based on underlying model usage.

Tools / functions
131K · in - · out -

Qwen3 30b A3b Thinking 2507

Aug 2025

Alibaba model (not yet curated).

Tools / functions
131K · in $0.08 · out $0.4

GPT Audio

Aug 2025

First generally available audio model. Accepts audio inputs and outputs, and can be used in the Chat Completions REST API.

Audio output
128K · in $2.5 · out $10

Nous: Hermes 4 70B

Aug 2025

Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B. It introduces the same hybrid mode as the larger 405B release, allowing the model to either...

Reasoning · Web search
131K · in $0.13 · out $0.4

Nous: Hermes 4 405B

Aug 2025

Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and released by Nous Research. It introduces a hybrid reasoning mode, where the model can choose to deliberate internally with...

Reasoning · Web search
131K · in $1 · out $3

DeepSeek: DeepSeek V3.1

Aug 2025

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. It extends the DeepSeek-V3 base with a two-phase long-context...

Reasoning · Tools / functions · Web search
164K · in $0.21 · out $0.79

Mistral: Mistral Medium 3.1

Aug 2025

Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost. It balances...

Vision · Tools / functions · Web search
131K · in $0.4 · out $2

Mistral Medium (2508)

Aug 2025

Update to Mistral Medium 3 with improved capabilities.

Vision · Tools / functions
131K · in $0.4 · out $2

GLM-4.5 V

Aug 2025

Vision-enabled GLM-4.5 model. 96K context, 16K output, interleaved thinking.

Vision · Reasoning · Tools / functions
98K · in $0.6 · out $1.8

AI21: Jamba Large 1.7

Aug 2025

Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in grounding, instruction-following, and overall efficiency. Built on a hybrid SSM-Transformer architecture with a 256K context...

Tools / functions · Web search
256K · in $2 · out $8

OpenAI: GPT-5 Chat

Aug 2025

GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations for enterprise applications.

Vision · Tools / functions · Web search
128K · in $1.25 · out $10

GPT-5 Nano

Aug 2025

Fastest, most cost-efficient version of GPT-5 for summarization and classification tasks.

Vision · Reasoning · Tools / functions · Web search · Image output
400K · in $0.05 · out $0.4

GPT-5 Mini

Aug 2025

A faster, more cost-efficient version of GPT-5 for well-defined tasks.

Vision · Reasoning · Tools / functions · Web search · Image output
400K · in $0.25 · out $2

GPT-5 ChatGPT

deprecated
Aug 2025

GPT-5 model used in ChatGPT.

Vision · Tools / functions · Web search · Image output
128K · in $1.25 · out $10

GPT-5

Aug 2025

The best model for coding and agentic tasks across domains.

Vision · Reasoning · Tools / functions · Web search · Image output
400K · in $1.25 · out $10

GPT OSS 120B

Aug 2025

OpenAI flagship open-weight MoE (120B total, 5.1B active) on Cerebras (~3,000 tok/s). Reasoning (default medium effort) and function calling. 131K context, 40K max output.

Reasoning · Tools / functions
131K · in $0.35 · out $0.75

Claude Opus 4.1

deprecated
Aug 2025

Previous Opus model. Deprecated June 5, 2026, retiring August 5, 2026.

Vision · Tools / functions · Web search
200K · in $15 · out $75

[OpenAI] GPT OSS 20B

Aug 2025

OpenAI efficient open-weight MoE (20B total, 3.6B active). Tool use, browser search, code execution. 131K context, 65K max output. ~1000 t/s on Groq.

Tools / functions
131K · in $0.08 · out $0.3

Qwen: Qwen3 Coder 30B A3B Instruct

Jul 2025

Qwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model with 128 experts (8 active per forward pass), designed for advanced code generation, repository-scale understanding, and agentic tool use. Built on the...

Tools / functions · Web search
160K · in $0.07 · out $0.27

codestral-latest

Jul 2025

Our cutting-edge language model for coding released August 2025.

Tools / functions
256K · in $0.3 · out $0.9

Codestral (2508)

Jul 2025

Our cutting-edge language model for coding released August 2025.

Tools / functions
256K · in $0.3 · out $0.9

Qwen3 30b A3b Instruct 2507

Jul 2025

Alibaba model (not yet curated).

Tools / functions
131K · in $0.05 · out $0.19

GLM-4.5 X

Jul 2025

Extended GLM-4.5 model. Interleaved thinking.

Reasoning · Tools / functions
98K · in $2.2 · out $8.9

GLM-4.5 Flash (Free)

Jul 2025

Free GLM-4.5 variant with limited concurrency. Prior-gen, superseded by GLM-4.7 Flash.

Reasoning · Tools / functions
98K · in - · out -

GLM-4.5 AirX

Jul 2025

Extended lightweight GLM-4.5 variant. Interleaved thinking.

Reasoning · Tools / functions
98K · in $1.1 · out $4.5

GLM-4.5 Air

Jul 2025

Lightweight GLM-4.5 variant. Interleaved thinking.

Reasoning · Tools / functions
98K · in $0.2 · out $1.1

GLM-4.5

Jul 2025

Prior-gen GLM-4.5 model with 96K context/output. Interleaved thinking.

Reasoning · Tools / functions
98K · in $0.6 · out $2.2

Qwen3 235b A22b Thinking 2507

Jul 2025

Alibaba model (not yet curated).

Tools / functions
131K · in $0.15 · out $1.5

Qwen: Qwen3 Coder 480B A35B

Jul 2025

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as function calling, tool use, and long-context reasoning over...

Tools / functions · Web search
1M · in $0.22 · out $1.8

Gemini 2.5 Flash-Lite

Jul 2025

Stable version of Gemini 2.5 Flash-Lite, released in July of 2025 (Version: 001, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

Vision · Reasoning · Tools / functions · Web search
1.1M · in $0.1 · out $0.4

ByteDance: UI-TARS 7B

Jul 2025

UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, web browsers, mobile systems, and games. Built by ByteDance, it builds upon the UI-TARS framework with reinforcement...

Vision · Web search
128K · in $0.1 · out $0.2

Qwen: Qwen3 235B A22B Instruct 2507

Jul 2025

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass. It is optimized for general-purpose text generation, including instruction following,...

Tools / functions · Web search
262K · in $0.09 · out $0.1

voxtral-small-latest

Jul 2025

A small audio understanding model released in July 2025

Tools / functions
33K · in $0.1 · out $0.3

voxtral-mini-latest

Jul 2025

A mini audio understanding model released in July 2025

33K · in $0.04 · out $0.04

Voxtral Small (2507)

Jul 2025

A small audio understanding model released in July 2025

Tools / functions
33K · in $0.1 · out $0.3

Voxtral Mini (2507)

Jul 2025

A mini audio understanding model released in July 2025

33K · in $0.04 · out $0.04

Switchpoint Router

Jul 2025

Switchpoint AI's router instantly analyzes your request and directs it to the optimal AI from an ever-evolving library. As the world of LLMs advances, our router gets smarter, ensuring you...

Reasoning · Web search
131K · in $0.85 · out $3.4

MoonshotAI: Kimi K2 0711

Jul 2025

Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It is optimized for...

Tools / functions · Web search
131K · in $0.57 · out $2.3

Venice: Uncensored (free) · 🎁

Jul 2025

Venice Uncensored Dolphin Mistral 24B Venice Edition is a fine-tuned variant of Mistral-Small-24B-Instruct-2501, developed by dphn.ai in collaboration with Venice.ai. This model is designed as an “uncensored” instruct-tuned LLM, preserving...

Web search
33K · in - · out -

Tencent: Hunyuan A13B Instruct

Jul 2025

Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model developed by Tencent, with a total parameter count of 80B and support for reasoning via Chain-of-Thought. It offers competitive benchmark...

Reasoning · Web search
131K · in $0.14 · out $0.57

Morph: Morph V3 Large

Jul 2025

Morph's high-accuracy apply model for complex code edits. ~4,500 tokens/sec with 98% accuracy for precise code transformations. The model requires the prompt to be in the following format: <instruction>{instruction}</instruction> <code>{initial_code}</code>...

Web search
262K · in $0.9 · out $1.9

Morph: Morph V3 Fast

Jul 2025

Morph's fastest apply model for code edits. ~10,500 tokens/sec with 96% accuracy for rapid code transformations. The model requires the prompt to be in the following format: <instruction>{instruction}</instruction> <code>{initial_code}</code> <update>{edit_snippet}</update>...

Web search
82K · in $0.8 · out $1.2
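
The two Morph descriptions above quote a tagged prompt format. A minimal sketch of assembling such a prompt follows, assuming the three tags shown are joined by single spaces as in the listing. The quoted format is truncated ("..."), so the real format may include more than this. `build_morph_prompt` is a hypothetical helper name, not part of any Morph SDK.

```python
def build_morph_prompt(instruction: str, initial_code: str, edit_snippet: str) -> str:
    """Wrap each field in the tag quoted in the listing (hypothetical helper).

    Assumption: tags are separated by a single space, as they appear in the
    listing text; the provider's documentation should be checked for the
    exact separator and for any fields the truncated description omits.
    """
    return (
        f"<instruction>{instruction}</instruction> "
        f"<code>{initial_code}</code> "
        f"<update>{edit_snippet}</update>"
    )


# Illustrative values only.
prompt = build_morph_prompt(
    instruction="Rename the function to add_numbers",
    initial_code="def add(a, b):\n    return a + b",
    edit_snippet="def add_numbers(a, b):\n    return a + b",
)
print(prompt)
```

The resulting string would be sent as the user message content; how the model returns the merged file is not described in the listing.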

Baidu: ERNIE 4.5 VL 424B A47B

Jun 2025

ERNIE-4.5-VL-424B-A47B is a multimodal Mixture-of-Experts (MoE) model from Baidu’s ERNIE 4.5 series, featuring 424B total parameters with 47B active per token. It is trained jointly on text and image data...

Vision · Reasoning · Web search
131K · in $0.42 · out $1.25

o4 Mini Deep Research

deprecated
Jun 2025

Faster, more affordable deep research model for complex, multi-step research tasks.

Vision · Reasoning · Tools / functions · Web search
200K · in $2 · out $8

o3 Deep Research

deprecated
Jun 2025

Our most powerful deep research model for complex, multi-step research tasks.

Vision · Reasoning · Tools / functions · Web search
200K · in $10 · out $40

Mistral: Mistral Small 3.2 24B

Jun 2025

Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimized for instruction following, repetition reduction, and improved function calling. Compared to the 3.1 release, version 3.2 significantly improves accuracy on...

Vision · Tools / functions · Web search
128K · in $0.08 · out $0.2

Mistral Small (2506)

Jun 2025

Our enterprise-grade small model; latest version released June 2025.

Vision · Tools / functions
131K · in $0.1 · out $0.3

MiniMax: MiniMax M1

Jun 2025

MiniMax-M1 is a large-scale, open-weight reasoning model designed for extended context and high-efficiency inference. It leverages a hybrid Mixture-of-Experts (MoE) architecture paired with a custom "lightning attention" mechanism, allowing it...

Reasoning · Tools / functions · Web search
1M · in $0.4 · out $2.2

Gemini 2.5 Pro

Jun 2025

Stable release (June 17th, 2025) of Gemini 2.5 Pro (Version: 2.5, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

Vision · Reasoning · Tools / functions · Web search
1.1M · in $1.25 · out $10

Gemini 2.5 Flash

Jun 2025

Stable version of Gemini 2.5 Flash, our mid-size multimodal model that supports up to 1 million tokens, released in June of 2025. (Version: 001, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

Vision · Reasoning · Tools / functions · Web search
1.1M · in $0.3 · out $2.5

o3 Pro

Jun 2025

Version of o3 with more compute for better responses. Provides consistently better answers for complex tasks.

Vision · Reasoning · Tools / functions · Web search · Image output
200K · in $20 · out $80

DeepSeek: R1 0528

May 2025

May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1). Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active...

Reasoning · Tools / functions · Web search
164K · in $0.5 · out $2.15

Anthropic: Claude Sonnet 4

May 2025

Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. Achieving state-of-the-art performance on SWE-bench (72.7%),...

Vision · Tools / functions · Web search
1M · in $3 · out $15

Anthropic: Claude Opus 4

May 2025

Claude Opus 4 is benchmarked as the world’s best coding model, at time of release, bringing sustained performance on complex, long-running tasks and agent workflows. It sets new benchmarks in...

Vision · Tools / functions · Web search
200K · in $15 · out $75

Google: Gemma 3n 4B

May 2025

Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as phones, laptops, and tablets. It supports multimodal inputs—including text, visual data, and audio—enabling diverse tasks...

Web search
33K · in $0.06 · out $0.12

Google: Gemini 2.5 Pro Preview 06-05

May 2025

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...

Vision · Reasoning · Tools / functions · Web search
1M · in $1.25 · out $10

Gemini 2.5 Pro Preview TTS

May 2025

Gemini 2.5 Pro Preview TTS (Version: gemini-2.5-pro-preview-tts-2025-05-19, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[countTokens,generateContent,batchGenerateContent])

Vision · Audio output
25K · in $1 · out -

Gemini 2.5 Flash Preview TTS

May 2025

Gemini 2.5 Flash Preview TTS (Version: gemini-2.5-flash-exp-tts-2025-05-19, Defaults: temperature=1, topP=0.95, topK=64, interfaces=[countTokens,generateContent])

Vision · Audio output
25K · in $0.5 · out -

mistral-medium-3

May 2025

Official mistral-medium-latest Mistral AI model

Vision · Tools / functions
262K · in $0.4 · out $2

Mistral Medium (2505)

May 2025

Our frontier-class multimodal model released May 2025.

Vision · Tools / functions
131K · in $0.4 · out $2

Google: Gemini 2.5 Pro Preview 05-06

May 2025

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...

Vision · Reasoning · Tools / functions · Web search
1M · in $1.25 · out $10

Arcee AI: Virtuoso Large

May 2025

Virtuoso‑Large is Arcee's top‑tier general‑purpose LLM at 72 B parameters, tuned to tackle cross‑domain reasoning, creative writing and enterprise QA. Unlike many 70 B peers, it retains the 128 k...

Tools / functions · Web search
131K · in $0.75 · out $1.2

Arcee AI: Coder Large

May 2025

Coder‑Large is a 32 B‑parameter offspring of Qwen 2.5‑Instruct that has been further trained on permissively‑licensed GitHub, CodeSearchNet and synthetic bug‑fix corpora. It supports a 32k context window, enabling multi‑file...

Web search
33K · in $0.5 · out $0.8

Meta: Llama Guard 4 12B

Apr 2025

Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM...

Vision · Web search
164K · in $0.18 · out $0.18

Qwen3 8b

Apr 2025

Alibaba model (not yet curated).

Tools / functions
131K · in $0.05 · out $0.4

Qwen3 32b

Apr 2025

Alibaba model (not yet curated).

Tools / functions
131K · in $0.29 · out $0.59

Qwen3 30b A3b

Apr 2025

Alibaba model (not yet curated).

Tools / functions
131K · in $0.12 · out $0.5

Qwen3 235b A22b

Apr 2025

Alibaba model (not yet curated).

Tools / functions
131K · in $0.46 · out $1.82

Qwen3 14b

Apr 2025

Alibaba model (not yet curated).

Tools / functions
131K · in $0.1 · out $0.24

o4 Mini

deprecated
Apr 2025

Latest o4-mini model. Optimized for fast, effective reasoning with exceptionally efficient performance in coding and visual tasks.

Vision · Reasoning · Tools / functions
200K · in $1.1 · out $4.4

o3

Apr 2025

A well-rounded and powerful model across domains. Sets a new standard for math, science, coding, and visual reasoning tasks.

Vision · Reasoning · Tools / functions
200K · in $2 · out $8

GPT-4.1 Nano

Apr 2025

Fastest, most cost-effective GPT 4.1 model. Delivers exceptional performance with low latency, ideal for tasks like classification or autocompletion.

Vision · Tools / functions
1M · in $0.1 · out $0.4

GPT-4.1 Mini

Apr 2025

Balanced for intelligence, speed, and cost. Matches or exceeds GPT-4o in intelligence while reducing latency by nearly half and cost by 83%.

Vision · Tools / functions
1M · in $0.4 · out $1.6

GPT-4.1

Apr 2025

Flagship GPT model for complex tasks. Major improvements on coding, instruction following, and long context with 1M token context window.

Vision · Tools / functions
1M · in $2 · out $8

GLM-4 32B (0414) 128K

Apr 2025

GLM-4 32B model with 128K context, 16K output.

Tools / functions
131K · in $0.1 · out $0.1

Meta: Llama 4 Scout

Apr 2025

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input...

Vision · Tools / functions · Web search
10M · in $0.1 · out $0.3

Meta: Llama 4 Maverick

Apr 2025

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward...

Vision · Tools / functions · Web search
1M · in $0.15 · out $0.6

[Meta] Llama 4 Scout · 17B × 16E (Preview)

Apr 2025

Llama 4 Scout 17B MoE with 16 experts (109B total params), native multimodal with vision support. 131K context, 8K max output. ~750 t/s on Groq.

Tools / functions
131K · in $0.11 · out $0.34

DeepSeek: DeepSeek V3 0324

Mar 2025

DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the flagship chat model family from the DeepSeek team. It succeeds the [DeepSeek V3](/deepseek/deepseek-chat-v3) model and performs really well...

Tools / functions · Web search
164K · in $0.2 · out $0.77

o1 Pro

Mar 2025

A version of o1 with more compute for better responses. Provides consistently better answers for complex tasks.

Vision · Reasoning · Tools / functions
200K · in $150 · out $600

Mistral: Mistral Small 3.1 24B

Mar 2025

Mistral Small 3.1 24B Instruct is an upgraded variant of Mistral Small 3 (2501), featuring 24 billion parameters with advanced multimodal capabilities. It provides state-of-the-art performance in text-based reasoning and...

Vision · Web search
128K · in $0.35 · out $0.56

Google: Gemma 3 4B

Mar 2025

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...

Vision · Web search
131K · in $0.05 · out $0.1

Google: Gemma 3 12B

Mar 2025

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...

Vision · Tools / functions · Web search
131K · in $0.05 · out $0.15

Cohere: Command A

Mar 2025

Command A is an open-weights 111B parameter model with a 256k context window focused on delivering great performance across agentic, multilingual, and coding use cases. Compared to other leading proprietary...

Web search
256K · in $2.5 · out $10

Reka Flash 3

Mar 2025

Reka Flash 3 is a general-purpose, instruction-tuned large language model with 21 billion parameters, developed by Reka. It excels at general chat, coding tasks, instruction-following, and function calling. Featuring a...

Reasoning · Web search
66K · in $0.1 · out $0.2

Google: Gemma 3 27B

Mar 2025

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...

Vision · Tools / functions · Web search
131K · in $0.08 · out $0.16

GPT-4o Search Preview

Mar 2025

Latest snapshot of the GPT-4o model optimized for web search capabilities.

Web search
128K · in $2.5 · out $10

GPT-4o Mini Search Preview

deprecated
Mar 2025

Latest snapshot of the GPT-4o Mini model optimized for web search capabilities.

Web search
128K · in $0.15 · out $0.6

TheDrummer: Skyfall 36B V2

Mar 2025

Skyfall 36B v2 is an enhanced iteration of Mistral Small 2501, specifically fine-tuned for improved creativity, nuanced writing, role-playing, and coherent storytelling.

Web search
33K · in $0.55 · out $0.8

Perplexity: Sonar Reasoning Pro

Mar 2025

Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perplexity.ai/guides/pricing#detailed-pricing-breakdown-for-sonar-reasoning-pro-and-sonar-pro). Sonar Reasoning Pro is a premier reasoning model powered by DeepSeek R1 with Chain of Thou...

Vision · Reasoning · Web search
128K · in $2 · out $8

Perplexity: Sonar Pro

Mar 2025

Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perplexity.ai/guides/pricing#detailed-pricing-breakdown-for-sonar-reasoning-pro-and-sonar-pro). For enterprises seeking more advanced capabilities, the Sonar Pro API can handle in-depth,...

Vision · Web search
200K · in $3 · out $15

Perplexity: Sonar Deep Research

Mar 2025

Sonar Deep Research is a research-focused model designed for multi-step retrieval, synthesis, and reasoning across complex topics. It autonomously searches, reads, and evaluates sources, refining its approach as it gathers...

Reasoning · Web search
128K · in $2 · out $8

Mistral: Saba

Feb 2025

Mistral Saba is a 24B-parameter language model specifically designed for the Middle East and South Asia, delivering accurate and contextually relevant responses while maintaining efficient performance. Trained on curated regional...

Tools / functions · Web search
33K · in $0.2 · out $0.6

Gemini 2.0 Flash 001

Feb 2025

Stable version of Gemini 2.0 Flash, our fast and versatile multimodal model for scaling across diverse tasks, released in January of 2025. (Version: 2.0, Defaults: temperature=1, topP=0.95, topK=40, interfaces=[generateContent,countTokens,createCachedContent,batchGenerateContent])

Vision · Tools / functions
1.1M · in $0.1 · out $0.4

AionLabs: Aion-RP 1.0 (8B)

Feb 2025

Aion-RP-Llama-3.1-8B ranks the highest in the character evaluation portion of the RPBench-Auto benchmark, a roleplaying-specific variant of Arena-Hard-Auto, where LLMs evaluate each other’s responses. It is a fine-tuned base model...

Web search
33K · in $0.8 · out $1.6

AionLabs: Aion-1.0-Mini

Feb 2025

Aion-1.0-Mini 32B parameter model is a distilled version of the DeepSeek-R1 model, designed for strong performance in reasoning domains such as mathematics, coding, and logic. It is a modified variant...

Reasoning · Web search
131K · in $0.7 · out $1.4

AionLabs: Aion-1.0

Feb 2025

Aion-1.0 is a multi-model system designed for high performance across various tasks, including reasoning and coding. It is built on DeepSeek-R1, augmented with additional models and techniques such as Tree...

Reasoning · Web search
131K · in $4 · out $8

Qwen: Qwen2.5 VL 72B Instruct

Feb 2025

Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capable of analyzing texts, charts, icons, graphics, and layouts within images.

Vision · Web search
131K · in $0.8 · out $1

o3 Mini

Jan 2025

Latest o3-mini model snapshot. High intelligence at the same cost and latency targets of o1-mini. Excels at science, math, and coding tasks.

Reasoning · Tools / functions
200K · in $1.1 · out $4.4

Mistral: Mistral Small 3

Jan 2025

Mistral Small 3 is a 24B-parameter language model optimized for low-latency performance across common AI tasks. Released under the Apache 2.0 license, it features both pre-trained and instruction-tuned versions designed...

Web search
33K · in $0.05 · out $0.08

Perplexity: Sonar

Jan 2025

Sonar is lightweight, affordable, fast, and simple to use — now featuring citations and the ability to customize sources. It is designed for companies seeking to integrate lightweight question-and-answer features...

Vision · Web search
127K · in $1 · out $1

DeepSeek: R1 Distill Llama 70B

Jan 2025

DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). The model combines advanced distillation techniques to achieve high performance acr...

Reasoning · Web search
128K · in $0.8 · out $0.8

DeepSeek: R1

Jan 2025

DeepSeek R1 is here: Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass....

Reasoning · Tools / functions · Web search
164K · in $0.7 · out $2.5

V1 8K Vision (Preview)

Jan 2025

Legacy vision model with 8K context. Preview variant - use moonshot-v1-vision for production.

Vision
8K · in $0.2 · out $2

V1 32K Vision (Preview)

Jan 2025

Legacy vision model with 32K context. Preview variant - use moonshot-v1-vision for production.

Vision
33K · in $1 · out $3

V1 128K Vision (Preview)

Jan 2025

Legacy vision model with 128K context. Preview variant - use moonshot-v1-vision for production.

Vision
131K · in $2 · out $5

MiniMax: MiniMax-01

Jan 2025

MiniMax-01 combines MiniMax-Text-01 for text generation and MiniMax-VL-01 for image understanding. It has 456 billion parameters, with 45.9 billion parameters activated per inference, and can handle a context...

Vision · Web search
1M · in $0.2 · out $1.1

Microsoft: Phi 4

Jan 2025

Phi-4 is designed to perform well in complex reasoning tasks and can operate efficiently in situations with limited memory or where quick responses are needed. At 14 billion...

Web search
16K · in $0.07 · out $0.14

Sao10K: Llama 3.1 70B Hanami x1

Jan 2025

This is [Sao10K](/sao10k)'s experiment over [Euryale v2.2](/sao10k/l3.1-euryale-70b).

Web search
16K · in $3 · out $3

DeepSeek: DeepSeek V3

Dec 2024

DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction following and coding abilities of the previous versions. Pre-trained on nearly 15 trillion tokens, the reported evaluations...

Tools / functions · Web search
131K · in $0.2 · out $0.8

Sao10K: Llama 3.3 Euryale 70B

Dec 2024

Euryale L3.3 70B is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). It is the successor of [Euryale L3 70B v2.2](/models/sao10k/l3-euryale-70b).

Web search
131K · in $0.65 · out $0.75

o1

Dec 2024

Previous full o-series reasoning model.

Vision · Reasoning · Tools / functions
200K · in $15 · out $60

Cohere: Command R7B (12-2024)

Dec 2024

Command R7B (12-2024) is a small, fast update of the Command R+ model, delivered in December 2024. It excels at RAG, tool use, agents, and similar tasks requiring complex reasoning...

Web search
128K · in $0.04 · out $0.15

Meta: Llama 3.3 70B Instruct (free) · 🎁

Dec 2024

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out). The Llama 3.3 instruction tuned text only model...

Tools / functions · Web search
131K · in - · out -

[Meta] Llama 3.3 · 70B Versatile

Dec 2024

Meta Llama 3.3 (70B params) with GQA. Strong reasoning, coding, multilingual. 131K context, 32K max output. ~280 t/s on Groq.

Tools / functions
131K · in $0.59 · out $0.79

Amazon: Nova Pro 1.0

Dec 2024

Amazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providing a combination of accuracy, speed, and cost for a wide range of tasks. As of December...

Vision · Tools / functions · Web search
300K · in $0.8 · out $3.2

Amazon: Nova Micro 1.0

Dec 2024

Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in the Amazon Nova family of models at a very low cost. With a context length...

Tools / functions · Web search
128K · in $0.04 · out $0.14

Amazon: Nova Lite 1.0

Dec 2024

Amazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon that focuses on fast processing of image, video, and text inputs to generate text output. Amazon Nova Lite...

Vision · Tools / functions · Web search
300K · in $0.06 · out $0.24

Mistral Large 2407

Nov 2024

This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407). It's a proprietary weights-available model and excels at reasoning, code, JSON, chat, and more. Read the launch announcement [here](https://mistral.ai/news/mistral-large-2407/)....

Tools / functions · Web search
131K · in $2 · out $6

Qwen2.5 Coder 32B Instruct

Nov 2024

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). Qwen2.5-Coder brings the following improvements upon CodeQwen1.5: - Significant improvements in **code generation**, **code reasoning**...

Web search
128K · in $0.66 · out $1

TheDrummer: UnslopNemo 12B

Nov 2024

UnslopNemo v4.1 is the latest addition from the creator of Rocinante, designed for adventure writing and role-play scenarios.

Tools / functions · Web search
33K · in $0.4 · out $0.4

Magnum v4 72B

Oct 2024

This is a series of models designed to replicate the prose quality of the Claude 3 models, specifically [Sonnet](https://openrouter.ai/anthropic/claude-3.5-sonnet) and [Opus](https://openrouter.ai/anthropic/claude-3-opus). The model is fine-tuned on top of [Qwen2.5 72B](https://o...

Web search
33K · in $3 · out $5

Qwen: Qwen2.5 7B Instruct

Oct 2024

Qwen2.5 7B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: - Significantly more knowledge and has greatly improved capabilities in coding and...

Tools / functions · Web search
131K · in $0.04 · out $0.1

Inflection: Inflection 3 Productivity

Oct 2024

Inflection 3 Productivity is optimized for following instructions. It is better for tasks requiring JSON output or precise adherence to provided guidelines. It has access to recent news. For emotional...

Web search
8K · in $2.5 · out $10

Inflection: Inflection 3 Pi

Oct 2024

Inflection 3 Pi powers Inflection's [Pi](https://pi.ai) chatbot, including backstory, emotional intelligence, productivity, and safety. It has access to recent news, and excels in scenarios like customer support and roleplay. Pi...

Web search
8K · in $2.5 · out $10

TheDrummer: Rocinante 12B

Sep 2024

Rocinante 12B is designed for engaging storytelling and rich prose. Early testers have reported: - Expanded vocabulary with unique and expressive word choices - Enhanced creativity for vivid narratives -...

Web search
33K · in $0.25 · out $0.5

Meta: Llama 3.2 3B Instruct (free) · 🎁

Sep 2024

Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization. Designed with the latest transformer architecture, it...

Web search
131K · in - · out -

Meta: Llama 3.2 1B Instruct

Sep 2024

Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently performing natural language tasks, such as summarization, dialogue, and multilingual text analysis. Its smaller size allows it to operate...

Web search
131K · in $0.03 · out $0.2

Meta: Llama 3.2 11B Vision Instruct

Sep 2024

Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks combining visual and textual data. It excels in tasks such as image captioning and...

Vision · Web search
131K · in $0.35 · out $0.35

Qwen2.5 72B Instruct

Sep 2024

Qwen2.5 72B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: - Significantly more knowledge and has greatly improved capabilities in coding and...

Tools / functions · Web search
131K · in $0.36 · out $0.4

Cohere: Command R+ (08-2024)

Aug 2024

command-r-plus-08-2024 is an update of the [Command R+](/models/cohere/command-r-plus) with roughly 50% higher throughput and 25% lower latencies as compared to the previous Command R+ version, while keeping the hardware footprint...

Tools / functions · Web search
128K · in $2.5 · out $10

Cohere: Command R (08-2024)

Aug 2024

command-r-08-2024 is an update of the [Command R](/models/cohere/command-r) with improved performance for multilingual retrieval-augmented generation (RAG) and tool use. More broadly, it is better at math, code and reasoning and...

Tools / functions · Web search
128K · in $0.15 · out $0.6

Sao10K: Llama 3.1 Euryale 70B v2.2

Aug 2024

Euryale L3.1 70B v2.2 is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). It is the successor of [Euryale L3 70B v2.1](/models/sao10k/l3-euryale-70b).

Tools / functions · Web search
131K · in $0.85 · out $0.85

Nous: Hermes 3 70B Instruct

Aug 2024

Hermes 3 is a generalist language model with many improvements over [Hermes 2](/models/nousresearch/nous-hermes-2-mistral-7b-dpo), including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements acr...

Web search
131K · in $0.7 · out $0.7

Nous: Hermes 3 405B Instruct (free) · 🎁

Aug 2024

Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the...

Web search
131K · in - · out -

Sao10K: Llama 3 8B Lunaris

Aug 2024

Lunaris 8B is a versatile generalist and roleplaying model based on Llama 3. It's a strategic merge of multiple models, designed to balance creativity with improved logic and general knowledge....

Web search
8K · in $0.04 · out $0.05

Meta: Llama 3.1 8B Instruct

Jul 2024

Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 8B instruct-tuned version is fast and efficient. It has demonstrated strong performance compared to...

Tools / functions · Web search
131K · in $0.02 · out $0.03

Meta: Llama 3.1 70B Instruct

Jul 2024

Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 70B instruct-tuned version is optimized for high quality dialogue use cases. It has demonstrated strong...

Tools / functions · Web search
131K · in $0.4 · out $0.4

[Meta] Llama 3.1 · 8B Instant

Jul 2024

Meta Llama 3.1 (8B params). Fast, cost-effective for high-volume tasks. 131K context and max output. ~560 t/s on Groq.

Tools / functions
131K · in $0.05 · out $0.08

Mistral: Mistral Nemo

Jul 2024

A 12B parameter model with a 128k token context length built by Mistral in collaboration with NVIDIA. The model is multilingual, supporting English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese,...

Tools / functions · Web search
131K · in $0.02 · out $0.03

open-mistral-nemo-2407

Jul 2024

Our best multilingual open source model released July 2024.

Tools / functions
131K · in $0.15 · out $0.15

open-mistral-nemo

Jul 2024

Our best multilingual open source model released July 2024.

Tools / functions
131K · in $0.15 · out $0.15

GPT-4o Mini

Jul 2024

Affordable model for fast, lightweight tasks. GPT-4o Mini is cheaper and more capable than GPT-3.5 Turbo.

Vision · Tools / functions
128K · in $0.15 · out $0.6

Google: Gemma 2 27B

Jul 2024

Gemma 2 27B by Google is an open model built from the same research and technology used to create the [Gemini models](/models?q=gemini). Gemma models are well-suited for a variety of...

Web search
8K · in $0.65 · out $0.65

GPT-4o

deprecated
May 2024

Original gpt-4o snapshot from May 13, 2024.

Vision · Tools / functions
128K · in $5 · out $15

Meta: Llama 3 8B Instruct

Apr 2024

Meta's latest class of model (Llama 3) launched with a variety of sizes & flavors. This 8B instruct-tuned version was optimized for high quality dialogue use cases. It has demonstrated strong...

Web search
8K · in $0.14 · out $0.14

Mistral: Mixtral 8x22B Instruct

Apr 2024

Mistral's official instruct fine-tuned version of [Mixtral 8x22B](/models/mistralai/mixtral-8x22b). It uses 39B active parameters out of 141B, offering unparalleled cost efficiency for its size. Its strengths include: - strong math, coding,...

Tools / functions · Web search
66K · in $2 · out $6

WizardLM-2 8x22B

Apr 2024

WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates highly competitive performance compared to leading proprietary models, and it consistently outperforms all existing state-of-the-art open-source models. It is...

Web search
66K · in $0.62 · out $0.62

GPT-4 Turbo

deprecated
Apr 2024

GPT-4 Turbo with Vision model. Vision requests can now use JSON mode and function calling. gpt-4-turbo currently

Vision · Tools / functions
128K · in $10 · out $30

Anthropic: Claude 3 Haiku

Mar 2024

Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant responsiveness. Quick and accurate targeted performance. See the launch announcement and benchmark results [here](https://www.anthropic.com/news/claude-3-haiku) #multimodal

Vision · Tools / functions · Web search
200K · in $0.25 · out $1.25

Mistral Large

Feb 2024

This is Mistral AI's flagship model, Mistral Large 2 (version `mistral-large-2407`). It's a proprietary weights-available model and excels at reasoning, code, JSON, chat, and more. Read the launch announcement [here](https://mistral.ai/news/mistral-large-2407/)....

Tools / functions · Web search
128K · in $2 · out $6

V1 8K

Feb 2024

Legacy V1 model with 8K context. Deprecated - use Kimi K2 Instruct instead.

Tools / functions
8K · in $0.2 · out $2

V1 32K

Feb 2024

Legacy V1 model with 32K context. Deprecated - use Kimi K2 Instruct instead.

Tools / functions
33K · in $1 · out $3

V1 128K

Feb 2024

Legacy V1 model with 128K context. Deprecated - use Kimi K2 Instruct instead.

Tools / functions
131K · in $2 · out $5

OpenAI: GPT-4 Turbo Preview

Jan 2024

The preview GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Training data: up to Dec 2023. **Note:** heavily rate limited by OpenAI while...

Tools / functions · Web search
128K · in $10 · out $30

OpenAI: GPT-3.5 Turbo (older v0613)

Jan 2024

GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks. Training data up to Sep 2021.

Tools / functions · Web search
4K · in $1 · out $2

3.5-Turbo

Jan 2024

The latest GPT-3.5 Turbo model with higher accuracy at responding in requested formats.

Tools / functions
16K · in $0.5 · out $1.5

3.5-Turbo

deprecated
Jan 2024

The latest GPT-3.5 Turbo model with higher accuracy at responding in requested formats.

Tools / functions
16K · in $0.5 · out $1.5

mistral-medium

Dec 2023

Official mistral-medium-latest Mistral AI model

Vision · Tools / functions
262K · in $0.4 · out $2

Auto Router

Nov 2023

Your prompt will be processed by a meta-model and routed to one of dozens of models (see below), optimizing for the best possible output. To see which model was used,...

Vision · Reasoning · Tools / functions · Web search · Image output
2M · in - · out -

3.5-Turbo

deprecated
Nov 2023

GPT-3.5 Turbo model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more.

Tools / functions
16K · in $1 · out $2

OpenAI: GPT-3.5 Turbo Instruct

Sep 2023

This model is a variant of GPT-3.5 Turbo tuned for instructional prompts and omitting chat-related optimizations. Training data: up to Sep 2021.

Web search
4K · in $1.5 · out $2

OpenAI: GPT-3.5 Turbo 16k

Aug 2023

This model offers four times the context length of gpt-3.5-turbo, allowing it to support approximately 20 pages of text in a single request at a higher cost. Training data: up...

Tools / functions · Web search
16K · in $3 · out $4

Mancer: Weaver (alpha)

Aug 2023

An attempt to recreate Claude-style verbosity, but don't expect the same level of coherence or memory. Meant for use in roleplay/narrative situations.

Web search
8K · in $0.75 · out $1

ReMM SLERP 13B

Jul 2023

A recreation trial of the original MythoMax-L2-B13 but with updated models. #merge

Web search
6K · in $0.45 · out $0.65

MythoMax 13B

Jul 2023

One of the highest performing and most popular fine-tunes of Llama 2 13B, with rich descriptions and roleplay. #merge

Web search
4K · in $0.06 · out $0.06

GPT-4

deprecated
Jun 2023

Snapshot of gpt-4 from June 13th 2023 with improved function calling support. Data up to Sep 2021.

Tools / functions
8K · in $30 · out $60

GPT-4

Jun 2023

Snapshot of gpt-4 from June 13th 2023 with improved function calling support. Data up to Sep 2021.

Tools / functions
8K · in $30 · out $60

[?] Qwen Plus [latest]

-

Balanced quality, speed, and cost with hybrid thinking. 1M context.

Reasoning · Tools / functions
1M · in $0.4 · out $1.2

Labs Leanstral 1 5 1

-

A mid & post-trained version of mistral small 4 for Lean (260618 SFT)

Vision · Tools / functions
262K · in - · out -

labs-leanstral-1-5

-

A mid & post-trained version of mistral small 4 for Lean (260618 SFT)

Vision · Tools / functions
262K · in - · out -

mistral-code-agent-latest

-

Official devstral-2512 Mistral AI model

Tools / functions
262K · in - · out -

mistral-code-fim-latest

-

Our cutting-edge language model for coding released August 2025.

Tools / functions
256K · in - · out -

mistral-code-latest

-

Our cutting-edge language model for coding released August 2025.

Tools / functions
256K · in - · out -

mistral-tiny-latest

-

Our best multilingual open source model released July 2024.

Tools / functions
131K · in - · out -

mistral-vibe-cli-fast

-

Mistral Small 4.

Vision · Tools / functions
262K · in - · out -

mistral-vibe-cli-with-tools

-

Official mistral-medium-latest Mistral AI model

Vision · Tools / functions
262K · in - · out -

Open Mistral Nemo

-

Our best multilingual open source model released July 2024.

Tools / functions
131K · in - · out -

Qvq Max

-

Alibaba model (not yet curated).

Tools / functions
131K · in - · out -

Qwen Coder Plus

-

Alibaba model (not yet curated).

Tools / functions
131K · in - · out -

Qwen Flash

-

Fast and very low cost with hybrid thinking. 1M context.

Reasoning · Tools / functions
1M · in $0.05 · out $0.4

Qwen Max

-

Best quality of the stable commercial line. 32K context.

Tools / functions
33K · in $1.6 · out $6.4

Qwen Turbo

-

Fastest and cheapest for simple tasks. 1M context.

Tools / functions
1M · in $0.05 · out $0.2

Qwen Vl Max

-

Alibaba model (not yet curated).

Tools / functions
131K · in - · out -

Qwen Vl Plus

-

Alibaba model (not yet curated).

Tools / functions
131K · in - · out -

Qwen3 235b A22b Instruct 2507

-

Alibaba model (not yet curated).

Tools / functions
131K · in - · out -

Qwen3 Coder 480b A35b Instruct

-

Alibaba model (not yet curated).

Tools / functions
131K · in - · out -

Qwen3 Max Preview

-

Alibaba model (not yet curated).

Tools / functions
131K · in - · out -

Qwen3 Vl Flash 2025 10 15

-

Alibaba model (not yet curated).

Tools / functions
131K · in - · out -

Qwen3.5 Flash 2026 02 23

-

Alibaba model (not yet curated).

Tools / functions
131K · in - · out -

Qwq Plus 2025 03 05

-

Alibaba model (not yet curated).

Tools / functions
131K · in - · out -
Showing 412 of 412 models · prices in USD per 1M tokens · refreshed every 30 minutes
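
Since every price in the index is quoted in USD per 1M tokens, a rough per-request estimate is just a proportional scaling of the two rates. The sketch below is illustrative only: the function name `request_cost_usd` and its parameters are hypothetical, the example rates are the o3 row above (in $2, out $8), and it assumes flat list pricing with no caching discounts, tiered rates, or per-search fees. Rows that show "-" have no published price, which the sketch represents as `None`.

```python
def request_cost_usd(input_tokens, output_tokens, in_price_per_1m, out_price_per_1m):
    """Estimate one request's cost from per-1M-token list prices (USD).

    Returns None when either price is unpublished (shown as "-" in the index).
    """
    if in_price_per_1m is None or out_price_per_1m is None:
        return None
    # Prices are per 1,000,000 tokens, so scale the token counts accordingly.
    total = input_tokens * in_price_per_1m + output_tokens * out_price_per_1m
    return total / 1_000_000


# Example: 10,000 input tokens and 2,000 output tokens at the o3 rates listed above.
cost = request_cost_usd(10_000, 2_000, 2.0, 8.0)
print(f"${cost:.4f}")  # prints "$0.0360"
```

Output tokens usually dominate the bill for long generations because the "out" rate is several times the "in" rate on most rows, so comparing models on input price alone can be misleading.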

Run any of these in Big-AGI.

Connect your own keys, run models side by side, then compare and merge the answers. Keys and chats stay in your browser.

© 2026 Token Fabrics · Built with passion in San Diego