Use Together AI Models in Big-AGI.

Bring your own key: Together AI's API rates, no markup. Keys and chats stay in your browser. Run Together AI in parallel with other models, then compare and merge the answers.
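
As a rough illustration of what paying the provider's rates directly means in practice, the sketch below estimates the cost of one request from a model's listed input and output prices. It assumes the listed prices are USD per one million tokens, which this page does not state explicitly. `ModelRate` and `estimate_cost_usd` are hypothetical names used only for this example; the GLM 5.2 figures are the ones shown in the listing.

```python
from dataclasses import dataclass

# Assumption: listed prices are USD per 1,000,000 tokens.
TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class ModelRate:
    """Listed prices for one model (hypothetical helper type)."""
    name: str
    input_usd_per_mtok: float
    output_usd_per_mtok: float


def estimate_cost_usd(rate: ModelRate, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from token counts and listed prices."""
    total = (input_tokens * rate.input_usd_per_mtok
             + output_tokens * rate.output_usd_per_mtok)
    return total / TOKENS_PER_MILLION


# GLM 5.2 as listed on this page: input $1.4, output $4.4.
glm_5_2 = ModelRate("GLM 5.2", 1.4, 4.4)

# A request with 200,000 prompt tokens and 50,000 completion tokens.
cost = estimate_cost_usd(glm_5_2, input_tokens=200_000, output_tokens=50_000)
print(f"${cost:.2f}")  # prints $0.50
```

Actual charges depend on Together AI's own billing and token counting, so treat the result as an estimate only.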

All supported Together AI models

Model · Context · Input · Output · Released

Inkling FP4

NEW

Thinking Machines chat model. https://huggingface.co/api/models/thinkingmachines/Inkling-NVFP4

524K

$1

$4.05

Jul 2026

Qwen3.5 2B Lora

NEW

Qwen chat model.

262K

Jun 2026

Qwen3.6 35B A3B Lora

NEW

Qwen chat model.

262K

Jun 2026

GLM 5.2

NEW

Zai Org chat model. https://huggingface.co/api/models/togethercomputer/GLM-5.2-FP4-0617-skipedge

262K

$1.4

$4.4

Jun 2026

NVIDIA Nemotron 3 Ultra 550B A55B NVFP4

NVIDIA chat model.

512K

$0.6

$3.6

Jun 2026

Llama 4 Maverick 17B 128E Instruct Nvfp4

Meta chat model. https://huggingface.co/api/models/RedHatAI/Llama-4-Maverick-17B-128E-Instruct-NVFP4

1M

Jun 2026

GLM 4.7 FP4

Zai Org chat model.

203K

Jun 2026

Kimi K2.7 Code

Moonshot AI chat model. https://huggingface.co/api/models/togethercomputer/Kimi-K2.7-Code-FP4

262K

$0.95

$4

Jun 2026

Qwen3.7 Plus

Qwen chat model.

1M

$0.32

$1.28

Jun 2026

MiniMax M3

MiniMaxAI chat model.

524K

$0.3

$1.2

May 2026

Qwen3.7 Max

Qwen chat model.

1M

$1.25

$3.75

May 2026

Llama 4 Scout 17B 16E Instruct Fp8 Lora

Meta chat model.

10M

May 2026

Gemma 4 31B It Lora

Google chat model. https://huggingface.co/api/models/google/gemma-4-31B-it

262K

May 2026

Gemma 3 27B It Lora

Google chat model.

May 2026

Mixtral 8x7B Instruct V0.1 FP8 Lora

Mistral AI chat model.

33K

May 2026

Gemma 3 270M It Lora

Google chat model.

33K

May 2026

Llama 3.3 70B Instruct FP8 Lora

Meta chat model.

131K

May 2026

Nemotron 3 Nano Omni 30B A3b Reasoning Fp8

Nvidia chat model.

131K

Apr 2026

Deepseek V4 Pro

Deepseek chat model.

512K

$1.74

$3.48

Apr 2026

Qwen3.6 35B A3b Fp8

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3.6-35B-A3B-FP8

262K

Apr 2026

Gemma 4 E2B-it

Google chat model. https://huggingface.co/api/models/google/gemma-4-E2B-it

131K

Apr 2026

Kimi K2.6 Fp4

Moonshot AI chat model. https://huggingface.co/api/models/togethercomputer/Kimi-K2.6-FP4

262K

$1.2

$4.5

Apr 2026

Gemma 4 E4B-it

Google chat model. https://huggingface.co/google/gemma-4-E4B-it

131K

Apr 2026

GLM 5.1 FP4

Zai Org chat model. https://huggingface.co/api/models/togethercomputer/GLM-5.1-FP4

203K

$1.4

$4.4

Apr 2026

Nvidia Nemotron 3 Super 120B A12b Bf16

Nvidia chat model. https://huggingface.co/api/models/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16

262K

Apr 2026

Gemma 4 26B A4b It

Google chat model. https://huggingface.co/api/models/google/gemma-4-26B-A4B-it

262K

Apr 2026

Pearl-ai Gemma-4-31B-it-pearl

pearl.ai chat model. https://huggingface.co/pearl-ai/Gemma-4-31B-it-pearl

262K

$0.28

$0.86

Apr 2026

Qwen3.6 Plus

Qwen chat model.

1M

$0.5

$3

Apr 2026

Holo3 35B A3b

Hcompany chat model. https://huggingface.co/api/models/Hcompany/Holo3-35B-A3B

262K

Mar 2026

Deepseek V3.1 NVFP4

DeepSeek chat model. https://huggingface.co/deepseek-ai/DeepSeek-V3.1

131K

$0.6

$1.7

Mar 2026

Qwen3 30B A3B Instruct 2507 Lora

Qwen chat model.

262K

Mar 2026

MiniMax M2.7 FP4

MiniMaxAI chat model. https://huggingface.co/api/models/togethercomputer/M2.5plus-fp4

197K

$0.3

$1.2

Mar 2026

Qwen3 8B Lora

Qwen chat model.

41K

Mar 2026

Qwen3.5 122B A10b Fp8

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3.5-122B-A10B-FP8

262K

Mar 2026

Deepseek OCR 2

Deepseek chat model. https://huggingface.co/api/models/deepseek-ai/DeepSeek-OCR-2

8K

Mar 2026

Nvidia Nemotron 3 Super 120B A12b Fp8

Nvidia chat model. https://huggingface.co/api/models/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8

262K

Mar 2026

Qwen3.5 9B FP8

Qwen chat model. https://huggingface.co/api/models/togethercomputer/Qwen3.5-9B-FP8-MLP

262K

$0.17

$0.25

Mar 2026

Qwen3.5 9B Fp8

Qwen chat model. https://huggingface.co/api/models/togethercomputer/Qwen3.5-9B-FP8-MLP

262K

Mar 2026

Glm 4.7 Fp8

Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.7-FP8

203K

Mar 2026

Qwen3.5 35B A3b

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3.5-35B-A3B

262K

Feb 2026

Qwen3.5 397B A17b

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3.5-397B-A17B

262K

$0.6

$3.6

Feb 2026

MiniMax M2.5 FP4

MiniMaxAI chat model.

8K

Feb 2026

GLM 5 Fp4

Zai Org chat model. https://huggingface.co/api/models/togethercomputer/GLM-5-FP4

203K

Feb 2026

GLM 5 Fp4

Zai Org chat model. https://huggingface.co/api/models/togethercomputer/GLM-5-FP4

203K

$1

$3.2

Feb 2026

GLM OCR

Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-OCR

131K

Feb 2026

GLM 4.7 FP8

Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.7-fp8

203K

$0.45

$2

Dec 2025

Nvidia Nemotron 3 Nano 30B A3b Bf16

Nvidia chat model. https://huggingface.co/api/models/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16

262K

Dec 2025

EssentialAI Rnj-1 Instruct

Essential AI chat model. https://huggingface.co/api/models/togethercomputer/EssentialAI-RNJ-1-Instruct

33K

Dec 2025

Ministral 3 14B Instruct 2512

Mistralai chat model. https://huggingface.co/api/models/mistralai/Ministral-3-14B-Instruct-2512

262K

$0.2

$0.2

Dec 2025

Trinity Mini

Arcee AI chat model. https://huggingface.co/api/models/togethercomputer/arcee-trinity-mini-rc

128K

$0.05

$0.15

Dec 2025

Cogito v2.1 671B

Deepcogito chat model. https://huggingface.co/api/models/togethercomputer/cogito-671b-v2.1-exp-chkp-2

164K

$1.25

$1.25

Nov 2025

Qwen3-VL-235B-A22B-Instruct-FP8

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-VL-235B-A22B-Instruct-FP8

262K

Nov 2025

Medgemma 27B Text It

Google chat model. https://huggingface.co/api/models/google/medgemma-27b-text-it

131K

Oct 2025

Qwen3-VL-32B-Instruct

Qwen chat model. https://huggingface.co/Qwen/Qwen3-VL-32B-Instruct

262K

$0.5

$1.5

Oct 2025

MiniMax M2

MiniMaxAI chat model. https://huggingface.co/MiniMaxAI/MiniMax-M2

197K

Oct 2025

Qwen3-VL-8B-Instruct

Qwen chat model. https://huggingface.co/Qwen/Qwen3-VL-8B-Instruct

262K

$0.18

$0.68

Oct 2025

GLM 4.6 Fp8

Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.6-FP8

203K

$0.6

$2.2

Sep 2025

Gemma 3 270M It

Google chat model. https://huggingface.co/api/models/google/gemma-3-270m-it

33K

Sep 2025

Qwen3 Next 80B A3b Instruct Fp8

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Next-80B-A3B-Instruct-FP8

Sep 2025

Nvidia Nemotron Nano 9B V2

Nvidia chat model. https://huggingface.co/api/models/nvidia/NVIDIA-Nemotron-Nano-9B-v2

131K

$0.06

$0.25

Sep 2025

Qwen3 Next 80B A3b Instruct

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Next-80B-A3B-Instruct

262K

$0.15

$1.5

Sep 2025

Qwen3 Next 80B A3b Thinking

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Next-80B-A3B-Thinking

262K

$0.15

$1.5

Sep 2025

GLM 4.5V

Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.5V

66K

Aug 2025

Qwen3 4B Instruct 2507

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-4B-Instruct-2507

262K

Aug 2025

OpenAI GPT-OSS 20B

OpenAI chat model. https://huggingface.co/api/models/openai/gpt-oss-20b

131K

$0.05

$0.2

Aug 2025

OpenAI GPT-OSS 120B

OpenAI chat model. https://huggingface.co/openai/gpt-oss-120b

131K

$0.15

$0.6

Aug 2025

Qwen3 Coder 30B A3b Instruct

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Coder-30B-A3B-Instruct

262K

Jul 2025

Glm 4.5 Air Fp8

Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.5-Air-FP8

131K

$0.2

$1.1

Jul 2025

Qwen3 235B A22b Instruct 2507 Fp8

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-235B-A22B-Instruct-2507-FP8

262K

Jul 2025

Qwen3 Coder 480B A35B Instruct Fp8

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8

262K

$2

$2

Jul 2025

Qwen3 235B A22B Instruct 2507 FP8 Throughput

deprecated

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-235B-A22B-Instruct-2507-FP8

262K

$0.2

$0.6

Jul 2025

Sarvam M

Sarvamai chat model. https://huggingface.co/api/models/sarvamai/sarvam-m

33K

Jul 2025

Meta Llama 3.1 8B Instruct Awq Int4

Meta chat model. https://huggingface.co/api/models/togethercomputer/meta-llama-3.1-8B-Instruct-AWQ-INT4

131K

Jul 2025

Minimax M1 80K

MiniMaxAI chat model. https://huggingface.co/api/models/togethercomputer/MiniMax-M1-80k

1M

Jun 2025

Minimax M1 40K

MiniMaxAI chat model. https://huggingface.co/api/models/togethercomputer/MiniMax-M1-40k

1M

Jun 2025

Llama 4 Scout (17Bx16E)

Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-4-Scout-17B-16E

262K

Jun 2025

Magistral Small 2506

Mistralai chat model. https://huggingface.co/api/models/mistralai/Magistral-Small-2506

41K

Jun 2025

Gemma 2B It

Google chat model. https://huggingface.co/api/models/google/gemma-2b-it

8K

Jun 2025

Gemma 2 9B It

Google chat model. https://huggingface.co/api/models/google/gemma-2-9b-it

8K

Jun 2025

Qwen3 1.7B

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-1.7B

41K

Jun 2025

Qwen3 0.6B

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-0.6B

41K

Jun 2025

DeepSeek R1 0528 NVFP4

Deepseek chat model. https://huggingface.co/api/models/deepseek-ai/DeepSeek-R1-0528

164K

$3

$7

May 2025

Molmo 7B D 0924

Allenai chat model. https://huggingface.co/api/models/allenai/Molmo-7B-D-0924

4K

May 2025

Mixtral 8X22b Instruct V0.1

Mistralai chat model. https://huggingface.co/api/models/mistralai/Mixtral-8x22B-Instruct-v0.1

66K

May 2025

Devstral Small 2505

Mistralai chat model. https://huggingface.co/api/models/togethercomputer/Devstral-Small-2505

131K

May 2025

Mistral 7B v0.1

Mistralai chat model. https://huggingface.co/api/models/mistralai/Mistral-7B-v0.1

33K

May 2025

Gemma 3N E4B Instruct

Google chat model. https://huggingface.co/google/gemma-3n-E4B-it

33K

$0.06

$0.12

May 2025

Deepcoder 14B Preview

Togethercomputer chat model. https://huggingface.co/api/models/togethercomputer/DeepCoder-14B-Preview

131K

May 2025

Qwen3 32B

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-32B

41K

Apr 2025

Qwen3 8B

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-8B

41K

Apr 2025

Qwen3 30B A3b

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-30B-A3B

41K

Apr 2025

Qwen3 14B

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-14B

2K

Apr 2025

Arize AI Qwen 2 1.5B Instruct

Togethercomputer chat model. https://huggingface.co/api/models/togethercomputer/arize-ai-qwen-2-1.5b-instruct

33K

$0.1

$0.1

Apr 2025

Llama 3.1 405B

Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.1-405B

131K

Apr 2025

Llama 3.1 70B

Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.1-70B

131K

Apr 2025

Qwen2.5 32B

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-32B

131K

Apr 2025

Qwen2.5 14B

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-14B

131K

Apr 2025

Llama 3.2 1B

Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.2-1B

131K

Apr 2025

Qwen2.5 1.5B Instruct

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-1.5B-Instruct

33K

Apr 2025

Qwen2.5 7B Instruct

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-7B-Instruct

33K

Apr 2025

Qwen2.5 7B

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-7B

131K

Apr 2025

Qwen2.5 72B

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-72B

131K

Apr 2025

Qwen2.5 1.5B

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-1.5B

131K

Apr 2025

Qwen2.5 32B Instruct

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-32B-Instruct

33K

Apr 2025

Qwen2.5 3B Instruct

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-3B-Instruct

33K

Apr 2025

Qwen2 72B Instruct

Togethercomputer chat model. https://huggingface.co/api/models/togethercomputer/Qwen2-72B-Instruct

33K

$0.9

$0.9

Apr 2025

Cogito V1 Preview Llama 70B

deepcogito chat model.

131K

Apr 2025

Cogito V1 Preview Llama 8B

deepcogito chat model.

131K

Apr 2025

Cogito V1 Preview Qwen 32B

deepcogito chat model.

131K

Apr 2025

Cogito V1 Preview Llama 70B Turbo

deepcogito chat model.

131K

Apr 2025

Cogito V1 Preview Qwen 14B

deepcogito chat model.

131K

Apr 2025

Llama 4 Scout Instruct (17Bx16E)

Meta chat model. https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E-Instruct

1M

$0.18

$0.59

Apr 2025

Gemma 3 1b it

Google chat model.

33K

Apr 2025

DeepSeek R1 Distill Qwen 7B

Deepseek chat model. https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B

131K

Apr 2025

meta-llama/Llama-2-7b-chat-hf

Meta chat model. https://huggingface.co/meta-llama/Llama-2-7b-chat-hf

4K

Apr 2025

nim/nvidia/llama-3.3-nemotron-super-49b-v1

Nvidia chat model.

16K

Mar 2025

nim/meta/llama-3.1-70b-instruct

Meta chat model.

16K

Mar 2025

nim/nv-mistralai/mistral-nemo-12b-instruct

NVIDIA chat model.

16K

Mar 2025

nim/mistralai/mixtral-8x7b-instruct-v01

mistralai chat model.

16K

Mar 2025

Gemma 3 4b it

Google chat model.

66K

Mar 2025

nim/meta/llama-3.1-8b-instruct

Meta chat model.

16K

Mar 2025

nim/meta/llama-3.3-70b-instruct

Meta chat model.

16K

Mar 2025

Gemma 3 27B It

Google chat model. https://huggingface.co/api/models/google/gemma-3-27b-it

66K

Mar 2025

nim/nvidia/llama-3.1-nemotron-70b-instruct

NVIDIA chat model.

16K

Mar 2025

nim/meta/llama-3.2-11b-vision-instruct

Meta chat model.

16K

Mar 2025

nim/meta/llama-3.2-90b-vision-instruct

Meta chat model.

16K

Mar 2025

nim/mistralai/mixtral-8x22b-instruct-v01

Mistral chat model.

16K

Mar 2025

Meta Llama 3.1 8B Instruct Turbo

Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct

131K

$0.18

$0.18

Mar 2025

Qwen QwQ-32B

Qwen chat model. https://huggingface.co/Qwen/QwQ-32B

131K

$1.2

Mar 2025

Qwen2.5-VL (72B) Instruct

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-VL-72B-Instruct

33K

$1.95

Feb 2025

Mistral Small (24B) Instruct 25.01

mistralai chat model. https://huggingface.co/mistralai/Mistral-Small-Instruct-2501

33K

$0.1

$0.3

Jan 2025

DeepSeek R1 Distill Qwen 1.5B

DeepSeek chat model. https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

131K

$0.18

Jan 2025

DeepSeek R1 Distill Qwen 14B

DeepSeek chat model. https://huggingface.co/api/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B

131K

$1.6

Jan 2025

DeepSeek R1 Distill Llama 70B

DeepSeek chat model. https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B

131K

Jan 2025

Qwen2-VL (72B) Instruct

Qwen chat model. https://huggingface.co/Qwen/Qwen2-VL-72B-Instruct

33K

$1.2

Jan 2025

Qwen 2.5 14B Instruct

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-14B-Instruct

33K

$0.8

Dec 2024

Meta Llama 3.3 70B Instruct

meta-llama chat model. https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct

131K

Dec 2024

Meta Llama 3.1 405B Instruct

Meta chat model. https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct

$3.5

Dec 2024

Meta Llama 3.3 70B Instruct Turbo

Meta chat model. https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct

131K

$1.04

Dec 2024

Qwen2.5 72B Instruct

Qwen chat model. https://huggingface.co/Qwen/Qwen2.5-72B-Instruct

33K

$1.2

Dec 2024

Qwen 2.5 Coder 32B Instruct

Qwen chat model. https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct

16K

$0.8

Nov 2024

Llama 3.1 Nemotron 70B Instruct HF

nvidia chat model. https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Instruct-HF

33K

$0.88

Nov 2024

Qwen2.5 72B Instruct Turbo

Qwen chat model. https://huggingface.co/Qwen/Qwen2.5-72B-Instruct

131K

$1.2

Oct 2024

Qwen2.5 7B Instruct Turbo

Qwen chat model. https://huggingface.co/Qwen/Qwen2.5-7B-Instruct

33K

$0.3

Oct 2024

Meta Llama 3.2 3B Instruct

Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.2-3B-Instruct

131K

$0.06

Sep 2024

Meta Llama 3.2 1B Instruct

Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.2-1B-Instruct

131K

$0.06

Sep 2024

Meta Llama 3.1 70B Instruct Turbo

Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct

131K

$0.88

Jul 2024

Gemma-2 Instruct (27B)

Google chat model. https://huggingface.co/google/gemma-2b-it

$0.8

Jul 2024

Qwen 2 Instruct (1.5B)

Qwen chat model. https://huggingface.co/Qwen/Qwen2-72B-Instruct

33K

$0.02

Jun 2024

Mistral (7B) Instruct v0.3

mistralai chat model. https://huggingface.co/api/models/mistralai/Mistral-7B-Instruct-v0.3

33K

$0.2

May 2024

Meta Llama 3 8B Instruct

Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct

$0.2

Apr 2024

Meta Llama 3 8B Instruct Reference

Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct

$0.2

Apr 2024

Deepseek Coder 33B Instruct

Deepseek chat model. https://huggingface.co/api/models/deepseek-ai/deepseek-coder-33b-instruct

16K

$0.8

Feb 2024

Nous Hermes 2 Mixtral 8X7B Dpo

Nousresearch chat model. https://huggingface.co/api/models/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO

33K

$0.6

Jan 2024

Mixtral-8x7B Instruct v0.1

mistralai chat model. https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1

33K

$0.6

Dec 2023

Mistral (7B) Instruct v0.1

mistralai chat model. https://huggingface.co/api/models/mistralai/Mistral-7B-Instruct-v0.1

33K

$0.2

Sep 2023

LFM2.5-8B-A1B

LiquidAI chat model. https://huggingface.co/api/models/LiquidAI/LFM2.5-8B-A1B

128K

$0.03

$0.12

Qwen3.5 35B A3B Lora

Qwen chat model.

262K

Ternary Bonsai 27B

deprecated

Prism ML chat model. https://huggingface.co/api/models/prism-ml/Ternary-Bonsai-27B-AWQ-4bit

262K

Gemma 4 12B It

Google chat model. https://huggingface.co/google/gemma-4-12B-it

262K

Meta Llama 3 70B Instruct Turbo

Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct

$0.88

Qwen3 Coder Next Fp8

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Coder-Next-FP8

262K

$0.5

$1.2

LFM2-24B-A2B

deprecated

Togethercomputer chat model.

33K

$0.03

$0.12

Kimi K2.5 Fp4

Togethercomputer chat model. https://huggingface.co/api/models/togethercomputer/Kimi-K2.5-fp4

262K

$0.5

$2.8

Meta Llama 3 8B Instruct Lite

deprecated

Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct

$0.14

Inkling FP4

NEW

Jul 2026

Thinking Machines chat model. https://huggingface.co/api/models/thinkingmachines/Inkling-NVFP4

524K · in $1 · out $4.05

Qwen3.5 2B Lora

NEW

Jun 2026

Qwen chat model.

262K · in - · out -

Qwen3.6 35B A3B Lora

NEW

Jun 2026

Qwen chat model.

262K · in - · out -

GLM 5.2

NEW

Jun 2026

Zai Org chat model. https://huggingface.co/api/models/togethercomputer/GLM-5.2-FP4-0617-skipedge

262K · in $1.4 · out $4.4

NVIDIA Nemotron 3 Ultra 550B A55B NVFP4

Jun 2026

NVIDIA chat model.

512K · in $0.6 · out $3.6

Llama 4 Maverick 17B 128E Instruct Nvfp4

Jun 2026

Meta chat model. https://huggingface.co/api/models/RedHatAI/Llama-4-Maverick-17B-128E-Instruct-NVFP4

1M · in - · out -

GLM 4.7 FP4

Jun 2026

Zai Org chat model.

203K · in - · out -

Kimi K2.7 Code

Jun 2026

Moonshot AI chat model. https://huggingface.co/api/models/togethercomputer/Kimi-K2.7-Code-FP4

262K · in $0.95 · out $4

Qwen3.7 Plus

Jun 2026

Qwen chat model.

1M · in $0.32 · out $1.28

MiniMax M3

May 2026

MiniMaxAI chat model.

524K · in $0.3 · out $1.2

Qwen3.7 Max

May 2026

Qwen chat model.

1M · in $1.25 · out $3.75

Llama 4 Scout 17B 16E Instruct Fp8 Lora

May 2026

Meta chat model.

10M · in - · out -

Gemma 4 31B It Lora

May 2026

Google chat model. https://huggingface.co/api/models/google/gemma-4-31B-it

262K · in - · out -

Gemma 3 27B It Lora

May 2026

Google chat model.

- · in - · out -

Mixtral 8x7B Instruct V0.1 FP8 Lora

May 2026

Mistral AI chat model.

33K · in - · out -

Gemma 3 270M It Lora

May 2026

Google chat model.

33K · in - · out -

Llama 3.3 70B Instruct FP8 Lora

May 2026

Meta chat model.

131K · in - · out -

Nemotron 3 Nano Omni 30B A3b Reasoning Fp8

Apr 2026

Nvidia chat model.

131K · in - · out -

Deepseek V4 Pro

Apr 2026

Deepseek chat model.

512K · in $1.74 · out $3.48

Qwen3.6 35B A3b Fp8

Apr 2026

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3.6-35B-A3B-FP8

262K · in - · out -

Gemma 4 E2B-it

Apr 2026

Google chat model. https://huggingface.co/api/models/google/gemma-4-E2B-it

131K · in - · out -

Kimi K2.6 Fp4

Apr 2026

Moonshot AI chat model. https://huggingface.co/api/models/togethercomputer/Kimi-K2.6-FP4

262K · in $1.2 · out $4.5

Gemma 4 E4B-it

Apr 2026

Google chat model. https://huggingface.co/google/gemma-4-E4B-it

131K · in - · out -

GLM 5.1 FP4

Apr 2026

Zai Org chat model. https://huggingface.co/api/models/togethercomputer/GLM-5.1-FP4

203K · in $1.4 · out $4.4

Nvidia Nemotron 3 Super 120B A12b Bf16

Apr 2026

Nvidia chat model. https://huggingface.co/api/models/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16

262K · in - · out -

Gemma 4 26B A4b It

Apr 2026

Google chat model. https://huggingface.co/api/models/google/gemma-4-26B-A4B-it

262K · in - · out -

Pearl-ai Gemma-4-31B-it-pearl

Apr 2026

pearl.ai chat model. https://huggingface.co/pearl-ai/Gemma-4-31B-it-pearl

262K · in $0.28 · out $0.86

Qwen3.6 Plus

Apr 2026

Qwen chat model.

1M · in $0.5 · out $3

Holo3 35B A3b

Mar 2026

Hcompany chat model. https://huggingface.co/api/models/Hcompany/Holo3-35B-A3B

262K · in - · out -

Deepseek V3.1 NVFP4

Mar 2026

DeepSeek chat model. https://huggingface.co/deepseek-ai/DeepSeek-V3.1

131K · in $0.6 · out $1.7

Qwen3 30B A3B Instruct 2507 Lora

Mar 2026

Qwen chat model.

262K · in - · out -

MiniMax M2.7 FP4

Mar 2026

MiniMaxAI chat model. https://huggingface.co/api/models/togethercomputer/M2.5plus-fp4

197K · in $0.3 · out $1.2

Qwen3 8B Lora

Mar 2026

Qwen chat model.

41K · in - · out -

Qwen3.5 122B A10b Fp8

Mar 2026

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3.5-122B-A10B-FP8

262K · in - · out -

Deepseek OCR 2

Mar 2026

Deepseek chat model. https://huggingface.co/api/models/deepseek-ai/DeepSeek-OCR-2

8K · in - · out -

Nvidia Nemotron 3 Super 120B A12b Fp8

Mar 2026

Nvidia chat model. https://huggingface.co/api/models/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8

262K · in - · out -

Qwen3.5 9B FP8

Mar 2026

Qwen chat model. https://huggingface.co/api/models/togethercomputer/Qwen3.5-9B-FP8-MLP

262K · in $0.17 · out $0.25

Qwen3.5 9B Fp8

Mar 2026

Qwen chat model. https://huggingface.co/api/models/togethercomputer/Qwen3.5-9B-FP8-MLP

262K · in - · out -

Glm 4.7 Fp8

Mar 2026

Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.7-FP8

203K · in - · out -

Qwen3.5 35B A3b

Feb 2026

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3.5-35B-A3B

262K · in - · out -

Qwen3.5 397B A17b

Feb 2026

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3.5-397B-A17B

262K · in $0.6 · out $3.6

MiniMax M2.5 FP4

Feb 2026

MiniMaxAI chat model.

8K · in - · out -

GLM 5 Fp4

Feb 2026

Zai Org chat model. https://huggingface.co/api/models/togethercomputer/GLM-5-FP4

203K · in - · out -

GLM 5 Fp4

Feb 2026

Zai Org chat model. https://huggingface.co/api/models/togethercomputer/GLM-5-FP4

203K · in $1 · out $3.2

GLM OCR

Feb 2026

Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-OCR

131K · in - · out -

GLM 4.7 FP8

Dec 2025

Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.7-fp8

203K · in $0.45 · out $2

Nvidia Nemotron 3 Nano 30B A3b Bf16

Dec 2025

Nvidia chat model. https://huggingface.co/api/models/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16

262K · in - · out -

EssentialAI Rnj-1 Instruct

Dec 2025

Essential AI chat model. https://huggingface.co/api/models/togethercomputer/EssentialAI-RNJ-1-Instruct

33K · in - · out -

Ministral 3 14B Instruct 2512

Dec 2025

Mistralai chat model. https://huggingface.co/api/models/mistralai/Ministral-3-14B-Instruct-2512

262K · in $0.2 · out $0.2

Trinity Mini

Dec 2025

Arcee AI chat model. https://huggingface.co/api/models/togethercomputer/arcee-trinity-mini-rc

128K · in $0.05 · out $0.15

Cogito v2.1 671B

Nov 2025

Deepcogito chat model. https://huggingface.co/api/models/togethercomputer/cogito-671b-v2.1-exp-chkp-2

164K · in $1.25 · out $1.25

Qwen3-VL-235B-A22B-Instruct-FP8

Nov 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-VL-235B-A22B-Instruct-FP8

262K · in - · out -

Medgemma 27B Text It

Oct 2025

Google chat model. https://huggingface.co/api/models/google/medgemma-27b-text-it

131K · in - · out -

Qwen3-VL-32B-Instruct

Oct 2025

Qwen chat model. https://huggingface.co/Qwen/Qwen3-VL-32B-Instruct

262K · in $0.5 · out $1.5

MiniMax M2

Oct 2025

MiniMaxAI chat model. https://huggingface.co/MiniMaxAI/MiniMax-M2

197K · in - · out -

Qwen3-VL-8B-Instruct

Oct 2025

Qwen chat model. https://huggingface.co/Qwen/Qwen3-VL-8B-Instruct

262K · in $0.18 · out $0.68

GLM 4.6 Fp8

Sep 2025

Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.6-FP8

203K · in $0.6 · out $2.2

Gemma 3 270M It

Sep 2025

Google chat model. https://huggingface.co/api/models/google/gemma-3-270m-it

33K · in - · out -

Qwen3 Next 80B A3b Instruct Fp8

Sep 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Next-80B-A3B-Instruct-FP8

- · in - · out -

Nvidia Nemotron Nano 9B V2

Sep 2025

Nvidia chat model. https://huggingface.co/api/models/nvidia/NVIDIA-Nemotron-Nano-9B-v2

131K · in $0.06 · out $0.25

Qwen3 Next 80B A3b Instruct

Sep 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Next-80B-A3B-Instruct

262K · in $0.15 · out $1.5

Qwen3 Next 80B A3b Thinking

Sep 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Next-80B-A3B-Thinking

262K · in $0.15 · out $1.5

GLM 4.5V

Aug 2025

Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.5V

66K · in - · out -

Qwen3 4B Instruct 2507

Aug 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-4B-Instruct-2507

262K · in - · out -

OpenAI GPT-OSS 20B

Aug 2025

OpenAI chat model. https://huggingface.co/api/models/openai/gpt-oss-20b

131K · in $0.05 · out $0.2

OpenAI GPT-OSS 120B

Aug 2025

OpenAI chat model. https://huggingface.co/openai/gpt-oss-120b

131K · in $0.15 · out $0.6

Qwen3 Coder 30B A3b Instruct

Jul 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Coder-30B-A3B-Instruct

262K · in - · out -

Glm 4.5 Air Fp8

Jul 2025

Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.5-Air-FP8

131K · in $0.2 · out $1.1

Qwen3 235B A22b Instruct 2507 Fp8

Jul 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-235B-A22B-Instruct-2507-FP8

262K · in - · out -

Qwen3 Coder 480B A35B Instruct Fp8

Jul 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8

262K · in $2 · out $2

Qwen3 235B A22B Instruct 2507 FP8 Throughput

deprecated

Jul 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-235B-A22B-Instruct-2507-FP8

262K · in $0.2 · out $0.6

Sarvam M

Jul 2025

Sarvamai chat model. https://huggingface.co/api/models/sarvamai/sarvam-m

33K · in - · out -

Meta Llama 3.1 8B Instruct Awq Int4

Jul 2025

Meta chat model. https://huggingface.co/api/models/togethercomputer/meta-llama-3.1-8B-Instruct-AWQ-INT4

131K · in - · out -

Minimax M1 80K

Jun 2025

MiniMaxAI chat model. https://huggingface.co/api/models/togethercomputer/MiniMax-M1-80k

1M · in - · out -

Minimax M1 40K

Jun 2025

MiniMaxAI chat model. https://huggingface.co/api/models/togethercomputer/MiniMax-M1-40k

1M · in - · out -

Llama 4 Scout (17Bx16E)

Jun 2025

Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-4-Scout-17B-16E

262K · in - · out -

Magistral Small 2506

Jun 2025

Mistralai chat model. https://huggingface.co/api/models/mistralai/Magistral-Small-2506

41K · in - · out -

Gemma 2B It

Jun 2025

Google chat model. https://huggingface.co/api/models/google/gemma-2b-it

8K · in - · out -

Gemma 2 9B It

Jun 2025

Google chat model. https://huggingface.co/api/models/google/gemma-2-9b-it

8K · in - · out -

Qwen3 1.7B

Jun 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-1.7B

41K · in - · out -

Qwen3 0.6B

Jun 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-0.6B

41K · in - · out -

DeepSeek R1 0528 NVFP4

May 2025

Deepseek chat model. https://huggingface.co/api/models/deepseek-ai/DeepSeek-R1-0528

164K · in $3 · out $7

Molmo 7B D 0924

May 2025

Allenai chat model. https://huggingface.co/api/models/allenai/Molmo-7B-D-0924

4K · in - · out -

Mixtral 8X22b Instruct V0.1

May 2025

Mistralai chat model. https://huggingface.co/api/models/mistralai/Mixtral-8x22B-Instruct-v0.1

66K · in - · out -

Devstral Small 2505

May 2025

Mistralai chat model. https://huggingface.co/api/models/togethercomputer/Devstral-Small-2505

131K · in - · out -

Mistral 7B v0.1

May 2025

Mistralai chat model. https://huggingface.co/api/models/mistralai/Mistral-7B-v0.1

33K · in - · out -

Gemma 3N E4B Instruct

May 2025

Google chat model. https://huggingface.co/google/gemma-3n-E4B-it

33K · in $0.06 · out $0.12

Deepcoder 14B Preview

May 2025

Togethercomputer chat model. https://huggingface.co/api/models/togethercomputer/DeepCoder-14B-Preview

131K · in - · out -

Qwen3 32B

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-32B

41K · in - · out -

Qwen3 8B

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-8B

41K · in - · out -

Qwen3 30B A3b

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-30B-A3B

41K · in - · out -

Qwen3 14B

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-14B

2K · in - · out -

Arize AI Qwen 2 1.5B Instruct

Apr 2025

Togethercomputer chat model. https://huggingface.co/api/models/togethercomputer/arize-ai-qwen-2-1.5b-instruct

33K · in $0.1 · out $0.1

Llama 3.1 405B

Apr 2025

Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.1-405B

131K · in - · out -

Llama 3.1 70B

Apr 2025

Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.1-70B

131K · in - · out -

Qwen2.5 32B

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-32B

131K · in - · out -

Qwen2.5 14B

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-14B

131K · in - · out -

Llama 3.2 1B

Apr 2025

Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.2-1B

131K · in - · out -

Qwen2.5 1.5B Instruct

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-1.5B-Instruct

33K · in - · out -

Qwen2.5 7B Instruct

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-7B-Instruct

33K · in - · out -

Qwen2.5 7B

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-7B

131K · in - · out -

Qwen2.5 72B

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-72B

131K · in - · out -

Qwen2.5 1.5B

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-1.5B

131K · in - · out -

Qwen2.5 32B Instruct

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-32B-Instruct

33K · in - · out -

Qwen2.5 3B Instruct

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-3B-Instruct

33K · in - · out -

Qwen2 72B Instruct

Apr 2025

Togethercomputer chat model. https://huggingface.co/api/models/togethercomputer/Qwen2-72B-Instruct

33K · in $0.9 · out $0.9

Cogito V1 Preview Llama 70B

Apr 2025

deepcogito chat model.

131K · in - · out -

Cogito V1 Preview Llama 8B

Apr 2025

deepcogito chat model.

131K · in - · out -

Cogito V1 Preview Qwen 32B

Apr 2025

Deep Cogito chat model.

131K · in - · out -

Cogito V1 Preview Llama 70B Turbo

Apr 2025

Deep Cogito chat model.

131K · in - · out -

Cogito V1 Preview Qwen 14B

Apr 2025

Deep Cogito chat model.

131K · in - · out -

Llama 4 Scout Instruct (17Bx16E)

Apr 2025

Meta chat model. https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E-Instruct

1M · in $0.18 · out $0.59

Gemma 3 1b it

Apr 2025

Google chat model.

33K · in - · out -

DeepSeek R1 Distill Qwen 7B

Apr 2025

DeepSeek chat model. https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B

131K · in - · out -

meta-llama/Llama-2-7b-chat-hf

Apr 2025

Meta chat model. https://huggingface.co/meta-llama/Llama-2-7b-chat-hf

4K · in - · out -

nim/nvidia/llama-3.3-nemotron-super-49b-v1

Mar 2025

NVIDIA chat model.

16K · in - · out -

nim/meta/llama-3.1-70b-instruct

Mar 2025

Meta chat model.

16K · in - · out -

nim/nv-mistralai/mistral-nemo-12b-instruct

Mar 2025

NVIDIA chat model.

16K · in - · out -

nim/mistralai/mixtral-8x7b-instruct-v01

Mar 2025

Mistral AI chat model.

16K · in - · out -

Gemma 3 4b it

Mar 2025

Google chat model.

66K · in - · out -

nim/meta/llama-3.1-8b-instruct

Mar 2025

Meta chat model.

16K · in - · out -

nim/meta/llama-3.3-70b-instruct

Mar 2025

Meta chat model.

16K · in - · out -

Gemma 3 27B It

Mar 2025

Google chat model. https://huggingface.co/api/models/google/gemma-3-27b-it

66K · in - · out -

nim/nvidia/llama-3.1-nemotron-70b-instruct

Mar 2025

NVIDIA chat model.

16K · in - · out -

nim/meta/llama-3.2-11b-vision-instruct

Mar 2025

Meta chat model.

16K · in - · out -

nim/meta/llama-3.2-90b-vision-instruct

Mar 2025

Meta chat model.

16K · in - · out -

nim/mistralai/mixtral-8x22b-instruct-v01

Mar 2025

Mistral AI chat model.

16K · in - · out -

Meta Llama 3.1 8B Instruct Turbo

Mar 2025

Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct

131K · in $0.18 · out $0.18

Qwen QwQ-32B

Mar 2025

Qwen chat model. https://huggingface.co/Qwen/QwQ-32B

131K · in $1.2 · out $1.2

Qwen2.5-VL (72B) Instruct

Feb 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-VL-72B-Instruct

33K · in $1.95 · out $8

Mistral Small (24B) Instruct 25.01

Jan 2025

Mistral AI chat model. https://huggingface.co/mistralai/Mistral-Small-Instruct-2501

33K · in $0.1 · out $0.3

DeepSeek R1 Distill Qwen 1.5B

Jan 2025

DeepSeek chat model. https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

131K · in $0.18 · out $0.18

DeepSeek R1 Distill Qwen 14B

Jan 2025

DeepSeek chat model. https://huggingface.co/api/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B

131K · in $1.6 · out $1.6

DeepSeek R1 Distill Llama 70B

Jan 2025

DeepSeek chat model. https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B

131K · in $2 · out $2

Qwen2-VL (72B) Instruct

Jan 2025

Qwen chat model. https://huggingface.co/Qwen/Qwen2-VL-72B-Instruct

33K · in $1.2 · out $1.2

Qwen 2.5 14B Instruct

Dec 2024

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-14B-Instruct

33K · in $0.8 · out $0.8

Meta Llama 3.3 70B Instruct

Dec 2024

Meta chat model. https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct

131K · in - · out -

Meta Llama 3.1 405B Instruct

Dec 2024

Meta chat model. https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct

4K · in $3.5 · out $3.5

Meta Llama 3.3 70B Instruct Turbo

Dec 2024

Meta chat model. https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct

131K · in $1.04 · out $1.04

Qwen2.5 72B Instruct

Dec 2024

Qwen chat model. https://huggingface.co/Qwen/Qwen2.5-72B-Instruct

33K · in $1.2 · out $1.2

Qwen 2.5 Coder 32B Instruct

Nov 2024

Qwen chat model. https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct

16K · in $0.8 · out $0.8

Llama 3.1 Nemotron 70B Instruct HF

Nov 2024

NVIDIA chat model. https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Instruct-HF

33K · in $0.88 · out $0.88

Qwen2.5 72B Instruct Turbo

Oct 2024

Qwen chat model. https://huggingface.co/Qwen/Qwen2.5-72B-Instruct

131K · in $1.2 · out $1.2

Qwen2.5 7B Instruct Turbo

Oct 2024

Qwen chat model. https://huggingface.co/Qwen/Qwen2.5-7B-Instruct

33K · in $0.3 · out $0.3

Meta Llama 3.2 3B Instruct

Sep 2024

Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.2-3B-Instruct

131K · in $0.06 · out $0.06

Meta Llama 3.2 1B Instruct

Sep 2024

Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.2-1B-Instruct

131K · in $0.06 · out $0.06

Meta Llama 3.1 70B Instruct Turbo

Jul 2024

Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct

131K · in $0.88 · out $0.88

Gemma-2 Instruct (27B)

Jul 2024

Google chat model. https://huggingface.co/google/gemma-2-27b-it

8K · in $0.8 · out $0.8

Qwen 2 Instruct (1.5B)

Jun 2024

Qwen chat model. https://huggingface.co/Qwen/Qwen2-1.5B-Instruct

33K · in $0.02 · out $0.02

Mistral (7B) Instruct v0.3

May 2024

Mistral AI chat model. https://huggingface.co/api/models/mistralai/Mistral-7B-Instruct-v0.3

33K · in $0.2 · out $0.2

Meta Llama 3 8B Instruct

Apr 2024

Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct

8K · in $0.2 · out $0.2

Meta Llama 3 8B Instruct Reference

Apr 2024

Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct

8K · in $0.2 · out $0.2

Deepseek Coder 33B Instruct

Feb 2024

DeepSeek chat model. https://huggingface.co/api/models/deepseek-ai/deepseek-coder-33b-instruct

16K · in $0.8 · out $0.8

Nous Hermes 2 Mixtral 8X7B Dpo

Jan 2024

Nous Research chat model. https://huggingface.co/api/models/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO

33K · in $0.6 · out $0.6

Mixtral-8x7B Instruct v0.1

Dec 2023

Mistral AI chat model. https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1

33K · in $0.6 · out $0.6

Mistral (7B) Instruct v0.1

Sep 2023

Mistral AI chat model. https://huggingface.co/api/models/mistralai/Mistral-7B-Instruct-v0.1

33K · in $0.2 · out $0.2

LFM2.5-8B-A1B

LiquidAI chat model. https://huggingface.co/api/models/LiquidAI/LFM2.5-8B-A1B

128K · in $0.03 · out $0.12

Qwen3.5 35B A3B Lora

Qwen chat model.

262K · in - · out -

Ternary Bonsai 27B

deprecated

Prism ML chat model. https://huggingface.co/api/models/prism-ml/Ternary-Bonsai-27B-AWQ-4bit

262K · in - · out -

Gemma 4 12B It

Google chat model. https://huggingface.co/google/gemma-4-12B-it

262K · in - · out -

Meta Llama 3 70B Instruct Turbo

Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct

8K · in $0.88 · out $0.88

Qwen3 Coder Next Fp8

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Coder-Next-FP8

262K · in $0.5 · out $1.2

LFM2-24B-A2B

deprecated

Togethercomputer chat model.

33K · in $0.03 · out $0.12

Kimi K2.5 Fp4

Togethercomputer chat model. https://huggingface.co/api/models/togethercomputer/Kimi-K2.5-fp4

262K · in $0.5 · out $2.8

Meta Llama 3 8B Instruct Lite

deprecated

Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct

8K · in $0.14 · out $0.14

165 models · sorted by release date · prices in USD per 1M tokens · refreshed every 30 minutes

Compare every model across vendors →
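
As a rough illustration of how the compact row format used above (context · in · out) could be derived from raw catalog values, here is a minimal Python sketch. The function names are hypothetical, and the rounding rule (divide by 1,000 and round to the nearest integer) is an assumption inferred from rows such as 33K and 66K, not Big-AGI's actual code.

```python
from typing import Optional


def format_context(tokens: int) -> str:
    """Render a context window in tokens as a short label such as '131K' or '1M'."""
    if tokens >= 1_000_000:
        return f"{round(tokens / 1_000_000)}M"
    return f"{round(tokens / 1_000)}K"


def format_price(usd_per_mtok: Optional[float]) -> str:
    """Render a USD price per 1M tokens, or '-' when no price is published."""
    if usd_per_mtok is None:
        return "-"
    return f"${usd_per_mtok:g}"


def format_row(context_tokens: int,
               input_price: Optional[float],
               output_price: Optional[float]) -> str:
    """Join context, input price, and output price in the catalog's row style."""
    return (f"{format_context(context_tokens)}"
            f" · in {format_price(input_price)}"
            f" · out {format_price(output_price)}")


# Assumed raw values: a 1,048,576-token context at $0.18 in / $0.59 out.
print(format_row(1_048_576, 0.18, 0.59))
# Assumed raw values: a 131,072-token context with no published price.
print(format_row(131_072, None, None))
```

Under these assumptions, a 32,768-token context renders as 33K and a 65,536-token context as 66K, which is consistent with the labels in the list.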

Get started in 3 steps

1. Create an API key at the Together AI console.

2. Paste it into Big-AGI's model settings.

3. Start chatting, or Beam it against other models and fuse the answers.

Running Together AI in Big-AGI

Add your Together AI key and pick from more than 160 open models hosted on Together's serverless infrastructure, priced per token with no subscription. Big-AGI adds no markup and is not a billing intermediary: you are billed directly by Together AI.

Your key, your billing. Usage is billed by Together AI to your account.
Fully dynamic catalog. Pricing, context length, and model labels come straight from Together's live API, so new Llama, DeepSeek, Qwen, and Kimi models appear on the next catalog refresh after Together ships them, with no hand-maintained list.
Vision figured out for you. Together's API doesn't publish which models accept images, so Big-AGI infers vision support from known multimodal families and model naming, and shows an image-upload option for those models.
Serious serving speed. Together runs its own optimized inference stack, so open models respond with latency comparable to hosted closed models.
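
To make the vision inference described above more concrete, here is a minimal Python sketch of a name-based heuristic. It is not Big-AGI's actual rule set: the function name, the `KNOWN_FAMILIES` tuple, and the `NAME_HINT` pattern are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical family markers: model lines assumed here to be multimodal.
KNOWN_FAMILIES = ("llama-4",)

# Hypothetical naming hints: "vision" or "vl" as a standalone token in the id.
NAME_HINT = re.compile(r"(^|[-_/ .])(vision|vl)([-_/ .(]|$)", re.IGNORECASE)


def infer_vision_support(model_id: str) -> bool:
    """Guess whether a model accepts images, using only its identifier."""
    lowered = model_id.lower()
    # Rule 1: the id belongs to a family assumed to be multimodal.
    if any(family in lowered for family in KNOWN_FAMILIES):
        return True
    # Rule 2: the id carries a vision-related token.
    return NAME_HINT.search(model_id) is not None


for model_id in (
    "Qwen/Qwen2.5-VL-72B-Instruct",
    "nim/meta/llama-3.2-90b-vision-instruct",
    "meta-llama/Llama-4-Scout-17B-16E-Instruct",
    "Qwen/Qwen2.5-7B-Instruct",
):
    print(model_id, infer_vision_support(model_id))
```

A heuristic like this can misclassify models whose names do not follow the expected pattern, which is the trade-off of inferring a capability the API does not publish.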

Why Big-AGI instead of the playground?

Together's playground is a quick way to try one model. Big-AGI turns your key into a full workspace, with persistent chats, personas, and attachments layered over the raw API. You can run a Together model in Beam next to Claude, GPT, and Gemini, which the playground was not built to do, while the parameters, the key, and the chats stay under your control.

Your keys and your data

When your key is stored client-side and Together allows it, turning on Direct Connection lets the browser talk to Together AI directly, skipping the Big-AGI server. Your keys stay in your browser. Chats are stored locally first, and sync only if you turn it on. The AI Inspector shows the exact request, the token counts, and a cost estimate.
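
A cost estimate of this kind can be reproduced by hand from the token counts and the per-1M-token prices in the catalog. The Python sketch below shows the arithmetic only; the `Pricing` class and `estimate_cost_usd` function are hypothetical names, and this is an assumption about how such an estimate could be computed rather than the AI Inspector's actual logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pricing:
    """Prices in USD per 1M tokens; None means no price is published."""
    input_usd_per_mtok: Optional[float]
    output_usd_per_mtok: Optional[float]


def estimate_cost_usd(pricing: Pricing,
                      input_tokens: int,
                      output_tokens: int) -> Optional[float]:
    """Return the estimated request cost in USD, or None if a price is missing."""
    if pricing.input_usd_per_mtok is None or pricing.output_usd_per_mtok is None:
        return None
    total = (input_tokens * pricing.input_usd_per_mtok
             + output_tokens * pricing.output_usd_per_mtok)
    return total / 1_000_000


# Example with the listed rates of $0.18 in and $0.59 out per 1M tokens,
# for an assumed request of 12,000 input and 800 output tokens.
cost = estimate_cost_usd(Pricing(0.18, 0.59), 12_000, 800)
print(f"{cost:.6f}")  # prints 0.002632
```

Models listed with "in - · out -" have no published price, so the sketch returns None for them instead of guessing.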

Together AI in Beam

Fan a prompt out across a few of Together's open models, or set one against Claude, GPT, and Gemini. Fusions then combine, cross-check, and synthesize the parallel answers instead of just picking the best one. Parallel runs use more tokens than a single chat.
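
The fan-out-then-fuse pattern can be sketched in a few lines of Python. This is an illustration of the general idea, not Beam's implementation: `fan_out` and `build_fusion_prompt` are hypothetical helpers, and the model callables are local stand-ins for real API clients.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# A model is represented as any callable that maps a prompt to an answer.
ModelFn = Callable[[str], str]


def fan_out(prompt: str, models: Dict[str, ModelFn]) -> Dict[str, str]:
    """Send the same prompt to every model in parallel and collect the answers."""
    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        futures = {name: pool.submit(fn, prompt) for name, fn in models.items()}
        return {name: future.result() for name, future in futures.items()}


def build_fusion_prompt(prompt: str, answers: Dict[str, str]) -> str:
    """Assemble a follow-up prompt asking one model to merge the parallel answers."""
    parts = [f"Question: {prompt}", "Candidate answers:"]
    for name, answer in answers.items():
        parts.append(f"[{name}] {answer}")
    parts.append("Combine and cross-check these answers into one response.")
    return "\n".join(parts)


# Stand-in models for demonstration; real ones would call a provider API.
stub_models: Dict[str, ModelFn] = {
    "model-a": lambda p: f"A says: {p}",
    "model-b": lambda p: f"B says: {p}",
}
answers = fan_out("What is a context window?", stub_models)
print(build_fusion_prompt("What is a context window?", answers))
```

Each model receives the full prompt, and the fusion step adds one more request containing every answer, which is why parallel runs use more tokens than a single chat.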

Bring your Together AI key. Keep control.

Your key, your data, your choice of model. Big-AGI is open source and self-hostable, so you can check exactly how Together AI is called.
