Use Together AI Models in Big-AGI.

Bring your own Together AI key and use Together AI at its own API rates, with no markup. Keys and chats stay in your browser. Run Together AI in parallel with other models, then compare and merge the answers.
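Since usage is billed at the provider's own rates, a request's cost follows directly from the Input and Output prices in the model table on this page. The snippet below is a minimal sketch of that arithmetic, not Big-AGI code. It assumes the listed prices are USD per 1M tokens (this page does not state the unit), and `PRICES` and `estimate_cost` are hypothetical names introduced only for illustration; the two price pairs are copied from the table rows.

```python
from decimal import Decimal

# (input price, output price) copied from the model table on this page.
# ASSUMPTION: prices are USD per 1M tokens; the page does not state the unit.
PRICES = {
    "Qwen3.5 397B A17b": (Decimal("0.6"), Decimal("3.6")),
    "Trinity Mini": (Decimal("0.05"), Decimal("0.15")),
}

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)  # assumed pricing unit


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # A 10,000-token prompt with a 2,000-token reply on each model.
    for name in PRICES:
        print(name, estimate_cost(name, 10_000, 2_000))
```

Models whose Input and Output columns show `-` have no listed price, so they are left out of `PRICES` rather than treated as free.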

Qwen3.5 397B A17b
Qwen3.6 35B A3B Lora
Qwen3.5 2B Lora

All supported Together AI models

| Model | Description | Context | Input | Output | Released |
| --- | --- | --- | --- | --- | --- |
| Qwen3.5 397B A17b (NEW) | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3.5-397B-A17B | 262K | $0.6 | $3.6 | Jun 2026 |
| Qwen3.6 35B A3B Lora (NEW) | Qwen chat model. | 262K | - | - | Jun 2026 |
| Qwen3.5 2B Lora (NEW) | Qwen chat model. | 262K | - | - | Jun 2026 |
| NVIDIA Nemotron 3 Ultra 550B A55B NVFP4 (NEW) | NVIDIA chat model. | 512K | $0.6 | $3.6 | Jun 2026 |
| GLM 5.1 FP4 (NEW) | Zai Org chat model. https://huggingface.co/api/models/togethercomputer/GLM-5.1-FP4 | 203K | $1.4 | $4.4 | Jun 2026 |
| GLM 5 Fp4 (NEW) | Zai Org chat model. https://huggingface.co/api/models/togethercomputer/GLM-5-FP4 | 203K | $1 | $3.2 | Jun 2026 |
| Qwen3.7 Plus (NEW) | Qwen chat model. | 1M | $0.32 | $1.28 | Jun 2026 |
| GLM 4.7 FP8 (NEW) | Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.7-fp8 | 203K | $0.45 | $2 | Jun 2026 |
| GLM 4.7 FP4 (NEW) | Zai Org chat model. | 203K | - | - | Jun 2026 |
| Llama 4 Maverick 17B 128E Instruct Nvfp4 (NEW, Vision) | Meta chat model. https://huggingface.co/api/models/RedHatAI/Llama-4-Maverick-17B-128E-Instruct-NVFP4 | 1M | - | - | Jun 2026 |
| Qwen3.7 Max | Qwen chat model. | 1M | $1.25 | $3.75 | May 2026 |
| Llama 4 Scout 17B 16E Instruct Fp8 Lora (Vision) | Meta chat model. | 10M | - | - | May 2026 |
| Gemma 4 31B It Lora | Google chat model. https://huggingface.co/api/models/google/gemma-4-31B-it | 262K | - | - | May 2026 |
| Trinity Mini | Arcee AI chat model. https://huggingface.co/api/models/togethercomputer/arcee-trinity-mini-rc | 128K | $0.05 | $0.15 | May 2026 |
| Gemma 3 27B It Lora | Google chat model. | - | - | - | May 2026 |
| Pearl-ai Gemma-4-31B-it-pearl | pearl.ai chat model. https://huggingface.co/pearl-ai/Gemma-4-31B-it-pearl | 262K | $0.28 | $0.86 | May 2026 |
| Mixtral 8x7B Instruct V0.1 FP8 Lora | Mistral AI chat model. | 33K | - | - | May 2026 |
| Gemma 3 270M It Lora | Google chat model. | 33K | - | - | May 2026 |
| Llama 3.3 70B Instruct FP8 Lora | Meta chat model. | 131K | - | - | May 2026 |
| Qwen3.6 Plus | Qwen chat model. | 1M | $0.5 | $3 | Apr 2026 |
| Nemotron 3 Nano Omni 30B A3b Reasoning Fp8 | Nvidia chat model. | 131K | - | - | Apr 2026 |
| Qwen3.6 35B A3b Fp8 | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3.6-35B-A3B-FP8 | 262K | - | - | Apr 2026 |
| Gemma 4 E2B-it | Google chat model. https://huggingface.co/api/models/google/gemma-4-E2B-it | 131K | - | - | Apr 2026 |
| Gemma 4 26B A4b It | Google chat model. https://huggingface.co/api/models/google/gemma-4-26B-A4B-it | 262K | - | - | Apr 2026 |
| Gemma 4 E4B-it | Google chat model. https://huggingface.co/google/gemma-4-E4B-it | 131K | - | - | Apr 2026 |
| Nvidia Nemotron 3 Super 120B A12b Bf16 | Nvidia chat model. https://huggingface.co/api/models/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 | 262K | - | - | Apr 2026 |
| Holo3 35B A3b | Hcompany chat model. https://huggingface.co/api/models/Hcompany/Holo3-35B-A3B | 262K | - | - | Mar 2026 |
| OpenAI GPT-OSS 20B | OpenAI chat model. https://huggingface.co/api/models/openai/gpt-oss-20b | 131K | $0.05 | $0.2 | Mar 2026 |
| Deepseek V3.1 NVFP4 | DeepSeek chat model. https://huggingface.co/deepseek-ai/DeepSeek-V3.1 | 131K | $0.6 | $1.7 | Mar 2026 |
| Qwen3 30B A3B Instruct 2507 Lora | Qwen chat model. | 262K | - | - | Mar 2026 |
| Qwen3 8B Lora | Qwen chat model. | 41K | - | - | Mar 2026 |
| Qwen3.5 122B A10b Fp8 | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3.5-122B-A10B-FP8 | 262K | - | - | Mar 2026 |
| GLM OCR | Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-OCR | 131K | - | - | Mar 2026 |
| Deepseek OCR 2 | Deepseek chat model. https://huggingface.co/api/models/deepseek-ai/DeepSeek-OCR-2 | 8K | - | - | Mar 2026 |
| Nvidia Nemotron 3 Super 120B A12b Fp8 | Nvidia chat model. https://huggingface.co/api/models/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8 | 262K | - | - | Mar 2026 |
| Qwen3.5 35B A3b | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3.5-35B-A3B | 262K | - | - | Mar 2026 |
| Qwen3.5 9B Fp8 | Qwen chat model. https://huggingface.co/api/models/togethercomputer/Qwen3.5-9B-FP8-MLP | 262K | - | - | Mar 2026 |
| Glm 4.7 Fp8 | Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.7-FP8 | 203K | - | - | Mar 2026 |
| MiniMax M2.5 FP4 | MiniMaxAI chat model. | 8K | - | - | Feb 2026 |
| GLM 5 Fp4 | Zai Org chat model. https://huggingface.co/api/models/togethercomputer/GLM-5-FP4 | 203K | - | - | Feb 2026 |
| Nvidia Nemotron 3 Nano 30B A3b Bf16 | Nvidia chat model. https://huggingface.co/api/models/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 | 262K | - | - | Dec 2025 |
| EssentialAI Rnj-1 Instruct | Essential AI chat model. https://huggingface.co/api/models/togethercomputer/EssentialAI-RNJ-1-Instruct | 33K | - | - | Dec 2025 |
| Ministral 3 14B Instruct 2512 | Mistralai chat model. https://huggingface.co/api/models/mistralai/Ministral-3-14B-Instruct-2512 | 262K | $0.2 | $0.2 | Dec 2025 |
| Qwen3-VL-235B-A22B-Instruct-FP8 (Vision) | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-VL-235B-A22B-Instruct-FP8 | 262K | - | - | Nov 2025 |
| MiniMax M2 | MiniMaxAI chat model. https://huggingface.co/MiniMaxAI/MiniMax-M2 | 197K | - | - | Oct 2025 |
| Qwen3-VL-32B-Instruct (Vision) | Qwen chat model. https://huggingface.co/Qwen/Qwen3-VL-32B-Instruct | 262K | $0.5 | $1.5 | Oct 2025 |
| Medgemma 27B Text It | Google chat model. https://huggingface.co/api/models/google/medgemma-27b-text-it | 131K | - | - | Oct 2025 |
| GLM 4.6 Fp8 | Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.6-FP8 | 203K | $0.6 | $2.2 | Oct 2025 |
| Gemma 3 270M It | Google chat model. https://huggingface.co/api/models/google/gemma-3-270m-it | 33K | - | - | Sep 2025 |
| Qwen3 Next 80B A3b Instruct Fp8 | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Next-80B-A3B-Instruct-FP8 | - | - | - | Sep 2025 |
| Nvidia Nemotron Nano 9B V2 | Nvidia chat model. https://huggingface.co/api/models/nvidia/NVIDIA-Nemotron-Nano-9B-v2 | 131K | $0.06 | $0.25 | Sep 2025 |
| GLM 4.5V | Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.5V | 66K | - | - | Sep 2025 |
| Qwen3 Next 80B A3b Thinking | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Next-80B-A3B-Thinking | 262K | $0.15 | $1.5 | Sep 2025 |
| Qwen3 4B Instruct 2507 | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-4B-Instruct-2507 | 262K | - | - | Aug 2025 |
| OpenAI GPT-OSS 120B | OpenAI chat model. https://huggingface.co/openai/gpt-oss-120b | 131K | $0.15 | $0.6 | Aug 2025 |
| Qwen3 Coder 30B A3b Instruct | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Coder-30B-A3B-Instruct | 262K | - | - | Aug 2025 |
| Glm 4.5 Air Fp8 | Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.5-Air-FP8 | 131K | $0.2 | $1.1 | Jul 2025 |
| Qwen3 235B A22b Instruct 2507 Fp8 | Together AI chat model. https://huggingface.co/api/models/Qwen/Qwen3-235B-A22B-Instruct-2507-FP8 | 262K | - | - | Jul 2025 |
| Qwen3 Coder 480B A35B Instruct Fp8 | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8 | 262K | $2 | $2 | Jul 2025 |
| Qwen3 235B A22B Instruct 2507 FP8 Throughput | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-235B-A22B-Instruct-2507-FP8 | 262K | $0.2 | $0.6 | Jul 2025 |
| Qwen3 32B | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-32B | 41K | - | - | Jul 2025 |
| DeepSeek R1 0528 NVFP4 | Deepseek chat model. https://huggingface.co/api/models/deepseek-ai/DeepSeek-R1-0528 | 164K | $3 | $7 | Jul 2025 |
| Sarvam M | Sarvamai chat model. https://huggingface.co/api/models/sarvamai/sarvam-m | 33K | - | - | Jul 2025 |
| Meta Llama 3.1 8B Instruct Awq Int4 | Meta chat model. https://huggingface.co/api/models/togethercomputer/meta-llama-3.1-8B-Instruct-AWQ-INT4 | 131K | - | - | Jul 2025 |
| Minimax M1 80K | MiniMaxAI chat model. https://huggingface.co/api/models/togethercomputer/MiniMax-M1-80k | 1M | - | - | Jun 2025 |
| Gemma 3N E4B Instruct | Google chat model. https://huggingface.co/google/gemma-3n-E4B-it | 33K | $0.06 | $0.12 | Jun 2025 |
| Minimax M1 40K | MiniMaxAI chat model. https://huggingface.co/api/models/togethercomputer/MiniMax-M1-40k | 1M | - | - | Jun 2025 |
| Llama 4 Scout (17Bx16E) (Vision) | Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-4-Scout-17B-16E | 262K | - | - | Jun 2025 |
| Magistral Small 2506 | Mistralai chat model. https://huggingface.co/api/models/mistralai/Magistral-Small-2506 | 41K | - | - | Jun 2025 |
| Gemma 2 9B It | Google chat model. https://huggingface.co/api/models/google/gemma-2-9b-it | 8K | - | - | Jun 2025 |
| Gemma 2B It | Google chat model. https://huggingface.co/api/models/google/gemma-2b-it | 8K | - | - | Jun 2025 |
| Qwen3 1.7B | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-1.7B | 41K | - | - | Jun 2025 |
| Qwen3 30B A3b | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-30B-A3B | 41K | - | - | Jun 2025 |
| Qwen3 0.6B | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-0.6B | 41K | - | - | Jun 2025 |
| Qwen3 14B | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-14B | 2K | - | - | May 2025 |
| Qwen3 8B | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-8B | 41K | - | - | May 2025 |
| Molmo 7B D 0924 | Allenai chat model. https://huggingface.co/api/models/allenai/Molmo-7B-D-0924 | 4K | - | - | May 2025 |
| Mixtral 8X22b Instruct V0.1 | Mistralai chat model. https://huggingface.co/api/models/mistralai/Mixtral-8x22B-Instruct-v0.1 | 66K | - | - | May 2025 |
| Devstral Small 2505 | Mistralai chat model. https://huggingface.co/api/models/togethercomputer/Devstral-Small-2505 | 131K | - | - | May 2025 |
| Mistral 7B v0.1 | Mistralai chat model. https://huggingface.co/api/models/mistralai/Mistral-7B-v0.1 | 33K | - | - | May 2025 |
| Deepcoder 14B Preview | Togethercomputer chat model. https://huggingface.co/api/models/togethercomputer/DeepCoder-14B-Preview | 131K | - | - | May 2025 |
| Arize AI Qwen 2 1.5B Instruct | Togethercomputer chat model. https://huggingface.co/api/models/togethercomputer/arize-ai-qwen-2-1.5b-instruct | 33K | $0.1 | $0.1 | Apr 2025 |
| Llama 3.1 405B | Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.1-405B | 131K | - | - | Apr 2025 |
| Llama 3.1 70B | Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.1-70B | 131K | - | - | Apr 2025 |
| Llama 3.2 1B | Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.2-1B | 131K | - | - | Apr 2025 |
| Qwen2.5 32B | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-32B | 131K | - | - | Apr 2025 |
| Qwen2.5 72B | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-72B | 131K | - | - | Apr 2025 |
| Qwen2.5 14B | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-14B | 131K | - | - | Apr 2025 |
| Qwen2.5 32B Instruct | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-32B-Instruct | 33K | - | - | Apr 2025 |
| Qwen2.5 3B Instruct | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-3B-Instruct | 33K | - | - | Apr 2025 |
| Qwen2.5 7B Instruct | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-7B-Instruct | 33K | - | - | Apr 2025 |
| Qwen2.5 1.5B Instruct | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-1.5B-Instruct | 33K | - | - | Apr 2025 |
| Qwen2.5 1.5B | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-1.5B | 131K | - | - | Apr 2025 |
| Qwen2.5 7B | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-7B | 131K | - | - | Apr 2025 |
| Meta Llama 3.3 70B Instruct | meta-llama chat model. https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct | 131K | - | - | Apr 2025 |
| Qwen2 72B Instruct | Togethercomputer chat model. https://huggingface.co/api/models/togethercomputer/Qwen2-72B-Instruct | 33K | $0.9 | $0.9 | Apr 2025 |
| Cogito V1 Preview Llama 70B Turbo | deepcogito chat model. | 131K | - | - | Apr 2025 |
| Cogito V1 Preview Llama 8B | deepcogito chat model. | 131K | - | - | Apr 2025 |
| Cogito V1 Preview Qwen 32B | deepcogito chat model. | 131K | - | - | Apr 2025 |
| Cogito V1 Preview Qwen 14B | deepcogito chat model. | 131K | - | - | Apr 2025 |
| Cogito V1 Preview Llama 70B | deepcogito chat model. | 131K | - | - | Apr 2025 |
| Llama 4 Scout Instruct (17Bx16E) (Vision) | Meta chat model. https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E-Instruct | 1M | $0.18 | $0.59 | Apr 2025 |
| Gemma 3 4b it | Google chat model. | 66K | - | - | Apr 2025 |
| Gemma 3 1b it | Google chat model. | 33K | - | - | Apr 2025 |
| DeepSeek R1 Distill Qwen 7B | Deepseek chat model. https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | 131K | - | - | Apr 2025 |
| meta-llama/Llama-2-7b-chat-hf | Meta chat model. https://huggingface.co/meta-llama/Llama-2-7b-chat-hf | 4K | - | - | Apr 2025 |
| Gemma 3 27B It | Google chat model. https://huggingface.co/api/models/google/gemma-3-27b-it | 66K | - | - | Mar 2025 |
| Qwen2.5-VL (72B) Instruct (Vision) | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-VL-72B-Instruct | 33K | $1.95 | $8 | Mar 2025 |
| nim/nvidia/llama-3.3-nemotron-super-49b-v1 | Nvidia chat model. | 16K | - | - | Mar 2025 |
| nim/nv-mistralai/mistral-nemo-12b-instruct | NVIDIA chat model. | 16K | - | - | Mar 2025 |
| nim/mistralai/mixtral-8x7b-instruct-v01 | mistralai chat model. | 16K | - | - | Mar 2025 |
| nim/meta/llama-3.1-70b-instruct | Llama chat model. | 16K | - | - | Mar 2025 |
| nim/meta/llama-3.1-8b-instruct | Meta chat model. | 16K | - | - | Mar 2025 |
| nim/meta/llama-3.3-70b-instruct | Meta chat model. | 16K | - | - | Mar 2025 |
| nim/nvidia/llama-3.1-nemotron-70b-instruct | NVIDIA chat model. | 16K | - | - | Mar 2025 |
| nim/meta/llama-3.2-11b-vision-instruct (Vision) | Nvidia chat model. | 16K | - | - | Mar 2025 |
| nim/meta/llama-3.2-90b-vision-instruct (Vision) | Meta chat model. | 16K | - | - | Mar 2025 |
| nim/mistralai/mixtral-8x22b-instruct-v01 | Mistral chat model. | 16K | - | - | Mar 2025 |
| Meta Llama 3.1 8B Instruct Turbo | Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct | 131K | $0.18 | $0.18 | Mar 2025 |
| Qwen QwQ-32B | Qwen chat model. https://huggingface.co/Qwen/QwQ-32B | 131K | $1.2 | $1.2 | Mar 2025 |
| Mistral Small (24B) Instruct 25.01 | mistralai chat model. https://huggingface.co/mistralai/Mistral-Small-Instruct-2501 | 33K | $0.1 | $0.3 | Jan 2025 |
| DeepSeek R1 Distill Qwen 1.5B | DeepSeek chat model. https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | 131K | $0.18 | $0.18 | Jan 2025 |
| DeepSeek R1 Distill Qwen 14B | DeepSeek chat model. https://huggingface.co/api/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | 131K | $1.6 | $1.6 | Jan 2025 |
| DeepSeek R1 Distill Llama 70B | DeepSeek chat model. https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B | 131K | $2 | $2 | Jan 2025 |
| Qwen2-VL (72B) Instruct (Vision) | Qwen chat model. https://huggingface.co/Qwen/Qwen2-VL-72B-Instruct | 33K | $1.2 | $1.2 | Jan 2025 |
| Qwen 2.5 14B Instruct | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-14B-Instruct | 33K | $0.8 | $0.8 | Dec 2024 |
| Meta Llama 3.3 70B Instruct Turbo | Meta chat model. https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct | 131K | $1.04 | $1.04 | Dec 2024 |
| Qwen2.5 72B Instruct | Qwen chat model. https://huggingface.co/Qwen/Qwen2.5-72B-Instruct | 33K | $1.2 | $1.2 | Dec 2024 |
| Meta Llama 3.2 1B Instruct | Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.2-1B-Instruct | 131K | $0.06 | $0.06 | Dec 2024 |
| Meta Llama 3.2 3B Instruct | Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.2-3B-Instruct | 131K | $0.06 | $0.06 | Dec 2024 |
| Meta Llama 3.1 405B Instruct | Meta chat model. https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct | 4K | $3.5 | $3.5 | Dec 2024 |
| Qwen 2.5 Coder 32B Instruct | Qwen chat model. https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct | 16K | $0.8 | $0.8 | Nov 2024 |
| Llama 3.1 Nemotron 70B Instruct HF | nvidia chat model. https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Instruct-HF | 33K | $0.88 | $0.88 | Nov 2024 |
| Qwen2.5 7B Instruct Turbo | Qwen chat model. https://huggingface.co/Qwen/Qwen2.5-7B-Instruct | 33K | $0.3 | $0.3 | Oct 2024 |
| Qwen2.5 72B Instruct Turbo | Qwen chat model. https://huggingface.co/Qwen/Qwen2.5-72B-Instruct | 131K | $1.2 | $1.2 | Oct 2024 |
| Meta Llama 3.1 70B Instruct Turbo | Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct | 131K | $0.88 | $0.88 | Jul 2024 |
| Qwen 2 Instruct (1.5B) | Qwen chat model. https://huggingface.co/Qwen/Qwen2-72B-Instruct | 33K | $0.02 | $0.02 | Jun 2024 |
| Mistral (7B) Instruct v0.3 | mistralai chat model. https://huggingface.co/api/models/mistralai/Mistral-7B-Instruct-v0.3 | 33K | $0.2 | $0.2 | May 2024 |
| Meta Llama 3 8B Instruct Reference | Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct | 8K | $0.2 | $0.2 | Apr 2024 |
| Meta Llama 3 8B Instruct | Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct | 8K | $0.2 | $0.2 | Apr 2024 |
| Gemma-2 Instruct (27B) | Google chat model. https://huggingface.co/google/gemma-2b-it | 8K | $0.8 | $0.8 | Feb 2024 |
| Deepseek Coder 33B Instruct | Deepseek chat model. https://huggingface.co/api/models/deepseek-ai/deepseek-coder-33b-instruct | 16K | $0.8 | $0.8 | Feb 2024 |
| Nous Hermes 2 Mixtral 8X7B Dpo | Nousresearch chat model. https://huggingface.co/api/models/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO | 33K | $0.6 | $0.6 | Jan 2024 |
| Mixtral-8x7B Instruct v0.1 | mistralai chat model. https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1 | 33K | $0.6 | $0.6 | Dec 2023 |
| Mistral (7B) Instruct v0.1 | mistralai chat model. https://huggingface.co/api/models/mistralai/Mistral-7B-Instruct-v0.1 | 33K | $0.2 | $0.2 | Sep 2023 |
| Deepseek V4 Pro | Deepseek chat model. | 512K | $1.74 | $3.48 | - |
| GLM 5.2 | Zai Org chat model. https://huggingface.co/api/models/togethercomputer/GLM-5.2-FP4-0617-skipedge | 262K | $1.4 | $4.4 | - |
| MiniMax M3 | MiniMaxAI chat model. | 524K | $0.3 | $1.2 | - |
| Kimi K2.7 Code | Moonshot AI chat model. https://huggingface.co/api/models/togethercomputer/Kimi-K2.7-Code-FP4 | 262K | $0.95 | $4 | - |
| Kimi K2.6 Fp4 | Moonshot AI chat model. https://huggingface.co/api/models/togethercomputer/Kimi-K2.6-FP4 | 262K | $1.2 | $4.5 | - |
| Qwen3 Coder Next Fp8 | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Coder-Next-FP8 | 262K | $0.5 | $1.2 | - |
| Cogito v2.1 671B | Deepcogito chat model. https://huggingface.co/api/models/togethercomputer/cogito-671b-v2.1-exp-chkp-2 | 164K | $1.25 | $1.25 | - |
| Qwen3 Next 80B A3b Instruct | Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Next-80B-A3B-Instruct | 262K | $0.15 | $1.5 | - |
| LFM2-24B-A2B | Togethercomputer chat model. | 33K | $0.03 | $0.12 | - |
| Meta Llama 3 8B Instruct Lite | Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct | 8K | $0.14 | $0.14 | - |
| Qwen3.5 35B A3B Lora | Qwen chat model. | 262K | - | - | - |
| Meta Llama 3 70B Instruct Turbo | Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct | 8K | $0.88 | $0.88 | - |
| MiniMax M2.7 FP4 | MiniMaxAI chat model. https://huggingface.co/api/models/togethercomputer/M2.5plus-fp4 | 197K | $0.3 | $1.2 | - |
| Qwen3.5 9B FP8 | Qwen chat model. https://huggingface.co/api/models/togethercomputer/Qwen3.5-9B-FP8-MLP | 262K | $0.17 | $0.25 | - |
| Qwen3-VL-8B-Instruct (Vision) | Qwen chat model. https://huggingface.co/Qwen/Qwen3-VL-8B-Instruct | 262K | $0.18 | $0.68 | - |
| Kimi K2.5 Fp4 | Togethercomputer chat model. https://huggingface.co/api/models/togethercomputer/Kimi-K2.5-fp4 | 262K | $0.5 | $2.8 | - |

Qwen3.5 397B A17b

NEW
Jun 2026

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3.5-397B-A17B

262K · in $0.6 · out $3.6

Qwen3.6 35B A3B Lora

NEW
Jun 2026

Qwen chat model.

262K · in - · out -

Qwen3.5 2B Lora

NEW
Jun 2026

Qwen chat model.

262K · in - · out -

NVIDIA Nemotron 3 Ultra 550B A55B NVFP4

NEW
Jun 2026

NVIDIA chat model.

512K · in $0.6 · out $3.6

GLM 5.1 FP4

NEW
Jun 2026

Zai Org chat model. https://huggingface.co/api/models/togethercomputer/GLM-5.1-FP4

203K · in $1.4 · out $4.4

GLM 5 Fp4

NEW
Jun 2026

Zai Org chat model. https://huggingface.co/api/models/togethercomputer/GLM-5-FP4

203K · in $1 · out $3.2

Qwen3.7 Plus

NEW
Jun 2026

Qwen chat model.

1M · in $0.32 · out $1.28

GLM 4.7 FP8

NEW
Jun 2026

Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.7-fp8

203K · in $0.45 · out $2

GLM 4.7 FP4

NEW
Jun 2026

Zai Org chat model.

203K · in - · out -

Llama 4 Maverick 17B 128E Instruct Nvfp4

NEW
Jun 2026

Meta chat model. https://huggingface.co/api/models/RedHatAI/Llama-4-Maverick-17B-128E-Instruct-NVFP4

Vision
1M · in - · out -

Qwen3.7 Max

May 2026

Qwen chat model.

1M · in $1.25 · out $3.75

Llama 4 Scout 17B 16E Instruct Fp8 Lora

May 2026

Meta chat model.

Vision
10M · in - · out -

Gemma 4 31B It Lora

May 2026

Google chat model. https://huggingface.co/api/models/google/gemma-4-31B-it

262K · in - · out -

Trinity Mini

May 2026

Arcee AI chat model. https://huggingface.co/api/models/togethercomputer/arcee-trinity-mini-rc

128K · in $0.05 · out $0.15

Gemma 3 27B It Lora

May 2026

Google chat model.

- · in - · out -

Pearl-ai Gemma-4-31B-it-pearl

May 2026

pearl.ai chat model. https://huggingface.co/pearl-ai/Gemma-4-31B-it-pearl

262K · in $0.28 · out $0.86

Mixtral 8x7B Instruct V0.1 FP8 Lora

May 2026

Mistral AI chat model.

33K · in - · out -

Gemma 3 270M It Lora

May 2026

Google chat model.

33K · in - · out -

Llama 3.3 70B Instruct FP8 Lora

May 2026

Meta chat model.

131K · in - · out -

Qwen3.6 Plus

Apr 2026

Qwen chat model.

1M · in $0.5 · out $3

Nemotron 3 Nano Omni 30B A3b Reasoning Fp8

Apr 2026

Nvidia chat model.

131K · in - · out -

Qwen3.6 35B A3b Fp8

Apr 2026

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3.6-35B-A3B-FP8

262K · in - · out -

Gemma 4 E2B-it

Apr 2026

Google chat model. https://huggingface.co/api/models/google/gemma-4-E2B-it

131K · in - · out -

Gemma 4 26B A4b It

Apr 2026

Google chat model. https://huggingface.co/api/models/google/gemma-4-26B-A4B-it

262K · in - · out -

Gemma 4 E4B-it

Apr 2026

Google chat model. https://huggingface.co/google/gemma-4-E4B-it

131K · in - · out -

Nvidia Nemotron 3 Super 120B A12b Bf16

Apr 2026

Nvidia chat model. https://huggingface.co/api/models/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16

262K · in - · out -

Holo3 35B A3b

Mar 2026

Hcompany chat model. https://huggingface.co/api/models/Hcompany/Holo3-35B-A3B

262K · in - · out -

OpenAI GPT-OSS 20B

Mar 2026

OpenAI chat model. https://huggingface.co/api/models/openai/gpt-oss-20b

131K · in $0.05 · out $0.2

Deepseek V3.1 NVFP4

Mar 2026

DeepSeek chat model. https://huggingface.co/deepseek-ai/DeepSeek-V3.1

131K · in $0.6 · out $1.7

Qwen3 30B A3B Instruct 2507 Lora

Mar 2026

Qwen chat model.

262K · in - · out -

Qwen3 8B Lora

Mar 2026

Qwen chat model.

41K · in - · out -

Qwen3.5 122B A10b Fp8

Mar 2026

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3.5-122B-A10B-FP8

262K · in - · out -

GLM OCR

Mar 2026

Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-OCR

131K · in - · out -

Deepseek OCR 2

Mar 2026

Deepseek chat model. https://huggingface.co/api/models/deepseek-ai/DeepSeek-OCR-2

8K · in - · out -

Nvidia Nemotron 3 Super 120B A12b Fp8

Mar 2026

Nvidia chat model. https://huggingface.co/api/models/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8

262K · in - · out -

Qwen3.5 35B A3b

Mar 2026

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3.5-35B-A3B

262K · in - · out -

Qwen3.5 9B Fp8

Mar 2026

Qwen chat model. https://huggingface.co/api/models/togethercomputer/Qwen3.5-9B-FP8-MLP

262K · in - · out -

Glm 4.7 Fp8

Mar 2026

Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.7-FP8

203K · in - · out -

MiniMax M2.5 FP4

Feb 2026

MiniMaxAI chat model.

8K · in - · out -

GLM 5 Fp4

Feb 2026

Zai Org chat model. https://huggingface.co/api/models/togethercomputer/GLM-5-FP4

203K · in - · out -

Nvidia Nemotron 3 Nano 30B A3b Bf16

Dec 2025

Nvidia chat model. https://huggingface.co/api/models/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16

262K · in - · out -

EssentialAI Rnj-1 Instruct

Dec 2025

Essential AI chat model. https://huggingface.co/api/models/togethercomputer/EssentialAI-RNJ-1-Instruct

33K · in - · out -

Ministral 3 14B Instruct 2512

Dec 2025

Mistralai chat model. https://huggingface.co/api/models/mistralai/Ministral-3-14B-Instruct-2512

262K · in $0.2 · out $0.2

Qwen3-VL-235B-A22B-Instruct-FP8

Nov 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-VL-235B-A22B-Instruct-FP8

Vision
262K · in - · out -

MiniMax M2

Oct 2025

MiniMaxAI chat model. https://huggingface.co/MiniMaxAI/MiniMax-M2

197K · in - · out -

Qwen3-VL-32B-Instruct

Oct 2025

Qwen chat model. https://huggingface.co/Qwen/Qwen3-VL-32B-Instruct

Vision
262K · in $0.5 · out $1.5

Medgemma 27B Text It

Oct 2025

Google chat model. https://huggingface.co/api/models/google/medgemma-27b-text-it

131K · in - · out -

GLM 4.6 Fp8

Oct 2025

Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.6-FP8

203K · in $0.6 · out $2.2

Gemma 3 270M It

Sep 2025

Google chat model. https://huggingface.co/api/models/google/gemma-3-270m-it

33K · in - · out -

Qwen3 Next 80B A3b Instruct Fp8

Sep 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Next-80B-A3B-Instruct-FP8

- · in - · out -

Nvidia Nemotron Nano 9B V2

Sep 2025

Nvidia chat model. https://huggingface.co/api/models/nvidia/NVIDIA-Nemotron-Nano-9B-v2

131K · in $0.06 · out $0.25

GLM 4.5V

Sep 2025

Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.5V

66K · in - · out -

Qwen3 Next 80B A3b Thinking

Sep 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Next-80B-A3B-Thinking

262K · in $0.15 · out $1.5

Qwen3 4B Instruct 2507

Aug 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-4B-Instruct-2507

262K · in - · out -

OpenAI GPT-OSS 120B

Aug 2025

OpenAI chat model. https://huggingface.co/openai/gpt-oss-120b

131K · in $0.15 · out $0.6

Qwen3 Coder 30B A3b Instruct

Aug 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Coder-30B-A3B-Instruct

262K · in - · out -

Glm 4.5 Air Fp8

Jul 2025

Zai Org chat model. https://huggingface.co/api/models/zai-org/GLM-4.5-Air-FP8

131K · in $0.2 · out $1.1

Qwen3 235B A22b Instruct 2507 Fp8

Jul 2025

Together AI chat model. https://huggingface.co/api/models/Qwen/Qwen3-235B-A22B-Instruct-2507-FP8

262K · in - · out -

Qwen3 Coder 480B A35B Instruct Fp8

Jul 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8

262K · in $2 · out $2

Qwen3 235B A22B Instruct 2507 FP8 Throughput

Jul 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-235B-A22B-Instruct-2507-FP8

262K · in $0.2 · out $0.6

Qwen3 32B

Jul 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-32B

41K · in - · out -

DeepSeek R1 0528 NVFP4

Jul 2025

Deepseek chat model. https://huggingface.co/api/models/deepseek-ai/DeepSeek-R1-0528

164K · in $3 · out $7

Sarvam M

Jul 2025

Sarvamai chat model. https://huggingface.co/api/models/sarvamai/sarvam-m

33K · in - · out -

Meta Llama 3.1 8B Instruct Awq Int4

Jul 2025

Meta chat model. https://huggingface.co/api/models/togethercomputer/meta-llama-3.1-8B-Instruct-AWQ-INT4

131K · in - · out -

Minimax M1 80K

Jun 2025

MiniMaxAI chat model. https://huggingface.co/api/models/togethercomputer/MiniMax-M1-80k

1M · in - · out -

Gemma 3N E4B Instruct

Jun 2025

Google chat model. https://huggingface.co/google/gemma-3n-E4B-it

33K · in $0.06 · out $0.12

Minimax M1 40K

Jun 2025

MiniMaxAI chat model. https://huggingface.co/api/models/togethercomputer/MiniMax-M1-40k

1M · in - · out -

Llama 4 Scout (17Bx16E)

Jun 2025

Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-4-Scout-17B-16E

Vision
262K · in - · out -

Magistral Small 2506

Jun 2025

Mistralai chat model. https://huggingface.co/api/models/mistralai/Magistral-Small-2506

41K · in - · out -

Gemma 2 9B It

Jun 2025

Google chat model. https://huggingface.co/api/models/google/gemma-2-9b-it

8K · in - · out -

Gemma 2B It

Jun 2025

Google chat model. https://huggingface.co/api/models/google/gemma-2b-it

8K · in - · out -

Qwen3 1.7B

Jun 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-1.7B

41K · in - · out -

Qwen3 30B A3b

Jun 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-30B-A3B

41K · in - · out -

Qwen3 0.6B

Jun 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-0.6B

41K · in - · out -

Qwen3 14B

May 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-14B

2K · in - · out -

Qwen3 8B

May 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-8B

41K · in - · out -

Molmo 7B D 0924

May 2025

Allenai chat model. https://huggingface.co/api/models/allenai/Molmo-7B-D-0924

4K · in - · out -

Mixtral 8X22b Instruct V0.1

May 2025

Mistralai chat model. https://huggingface.co/api/models/mistralai/Mixtral-8x22B-Instruct-v0.1

66K · in - · out -

Devstral Small 2505

May 2025

Mistralai chat model. https://huggingface.co/api/models/togethercomputer/Devstral-Small-2505

131K · in - · out -

Mistral 7B v0.1

May 2025

Mistralai chat model. https://huggingface.co/api/models/mistralai/Mistral-7B-v0.1

33K · in - · out -

Deepcoder 14B Preview

May 2025

Togethercomputer chat model. https://huggingface.co/api/models/togethercomputer/DeepCoder-14B-Preview

131K · in - · out -

Arize AI Qwen 2 1.5B Instruct

Apr 2025

Togethercomputer chat model. https://huggingface.co/api/models/togethercomputer/arize-ai-qwen-2-1.5b-instruct

33K · in $0.1 · out $0.1

Llama 3.1 405B

Apr 2025

Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.1-405B

131K · in - · out -

Llama 3.1 70B

Apr 2025

Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.1-70B

131K · in - · out -

Llama 3.2 1B

Apr 2025

Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.2-1B

131K · in - · out -

Qwen2.5 32B

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-32B

131K · in - · out -

Qwen2.5 72B

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-72B

131K · in - · out -

Qwen2.5 14B

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-14B

131K · in - · out -

Qwen2.5 32B Instruct

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-32B-Instruct

33K · in - · out -

Qwen2.5 3B Instruct

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-3B-Instruct

33K · in - · out -

Qwen2.5 7B Instruct

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-7B-Instruct

33K · in - · out -

Qwen2.5 1.5B Instruct

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-1.5B-Instruct

33K · in - · out -

Qwen2.5 1.5B

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-1.5B

131K · in - · out -

Qwen2.5 7B

Apr 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-7B

131K · in - · out -

Meta Llama 3.3 70B Instruct

Apr 2025

meta-llama chat model. https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct

131K · in - · out -

Qwen2 72B Instruct

Apr 2025

Togethercomputer chat model. https://huggingface.co/api/models/togethercomputer/Qwen2-72B-Instruct

33K · in $0.9 · out $0.9

Cogito V1 Preview Llama 70B Turbo

Apr 2025

deepcogito chat model.

131K · in - · out -

Cogito V1 Preview Llama 8B

Apr 2025

deepcogito chat model.

131K · in - · out -

Cogito V1 Preview Qwen 32B

Apr 2025

deepcogito chat model.

131K · in - · out -

Cogito V1 Preview Qwen 14B

Apr 2025

deepcogito chat model.

131K · in - · out -

Cogito V1 Preview Llama 70B

Apr 2025

deepcogito chat model.

131K · in - · out -

Llama 4 Scout Instruct (17Bx16E)

Apr 2025

Meta chat model. https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E-Instruct

Vision
1M · in $0.18 · out $0.59

Gemma 3 4b it

Apr 2025

Google chat model.

66K · in - · out -

Gemma 3 1b it

Apr 2025

Google chat model.

33K · in - · out -

DeepSeek R1 Distill Qwen 7B

Apr 2025

Deepseek chat model. https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B

131K · in - · out -

meta-llama/Llama-2-7b-chat-hf

Apr 2025

Meta chat model. https://huggingface.co/meta-llama/Llama-2-7b-chat-hf

4K · in - · out -

Gemma 3 27B It

Mar 2025

Google chat model. https://huggingface.co/api/models/google/gemma-3-27b-it

66K · in - · out -

Qwen2.5-VL (72B) Instruct

Mar 2025

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-VL-72B-Instruct

Vision
33K · in $1.95 · out $8

nim/nvidia/llama-3.3-nemotron-super-49b-v1

Mar 2025

NVIDIA chat model.

16K · in - · out -

nim/nv-mistralai/mistral-nemo-12b-instruct

Mar 2025

NVIDIA chat model.

16K · in - · out -

nim/mistralai/mixtral-8x7b-instruct-v01

Mar 2025

mistralai chat model.

16K · in - · out -

nim/meta/llama-3.1-70b-instruct

Mar 2025

Llama chat model.

16K · in - · out -

nim/meta/llama-3.1-8b-instruct

Mar 2025

Meta chat model.

16K · in - · out -

nim/meta/llama-3.3-70b-instruct

Mar 2025

Meta chat model.

16K · in - · out -

nim/nvidia/llama-3.1-nemotron-70b-instruct

Mar 2025

NVIDIA chat model.

16K · in - · out -

nim/meta/llama-3.2-11b-vision-instruct

Mar 2025

NVIDIA chat model.

Vision
16K · in - · out -

nim/meta/llama-3.2-90b-vision-instruct

Mar 2025

Meta chat model.

Vision
16K · in - · out -

nim/mistralai/mixtral-8x22b-instruct-v01

Mar 2025

Mistral chat model.

16K · in - · out -

Meta Llama 3.1 8B Instruct Turbo

Mar 2025

Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct

131K · in $0.18 · out $0.18

Qwen QwQ-32B

Mar 2025

Qwen chat model. https://huggingface.co/Qwen/QwQ-32B

131K · in $1.2 · out $1.2

Mistral Small (24B) Instruct 25.01

Jan 2025

mistralai chat model. https://huggingface.co/mistralai/Mistral-Small-Instruct-2501

33K · in $0.1 · out $0.3

DeepSeek R1 Distill Qwen 1.5B

Jan 2025

DeepSeek chat model. https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

131K · in $0.18 · out $0.18

DeepSeek R1 Distill Qwen 14B

Jan 2025

DeepSeek chat model. https://huggingface.co/api/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B

131K · in $1.6 · out $1.6

DeepSeek R1 Distill Llama 70B

Jan 2025

DeepSeek chat model. https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B

131K · in $2 · out $2

Qwen2-VL (72B) Instruct

Jan 2025

Qwen chat model. https://huggingface.co/Qwen/Qwen2-VL-72B-Instruct

Vision
33K · in $1.2 · out $1.2

Qwen 2.5 14B Instruct

Dec 2024

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen2.5-14B-Instruct

33K · in $0.8 · out $0.8

Meta Llama 3.3 70B Instruct Turbo

Dec 2024

Meta chat model. https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct

131K · in $1.04 · out $1.04

Qwen2.5 72B Instruct

Dec 2024

Qwen chat model. https://huggingface.co/Qwen/Qwen2.5-72B-Instruct

33K · in $1.2 · out $1.2

Meta Llama 3.2 1B Instruct

Dec 2024

Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.2-1B-Instruct

131K · in $0.06 · out $0.06

Meta Llama 3.2 3B Instruct

Dec 2024

Meta chat model. https://huggingface.co/api/models/meta-llama/Llama-3.2-3B-Instruct

131K · in $0.06 · out $0.06

Meta Llama 3.1 405B Instruct

Dec 2024

Meta chat model. https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct

4K · in $3.5 · out $3.5

Qwen 2.5 Coder 32B Instruct

Nov 2024

Qwen chat model. https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct

16K · in $0.8 · out $0.8

Llama 3.1 Nemotron 70B Instruct HF

Nov 2024

NVIDIA chat model. https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Instruct-HF

33K · in $0.88 · out $0.88

Qwen2.5 7B Instruct Turbo

Oct 2024

Qwen chat model. https://huggingface.co/Qwen/Qwen2.5-7B-Instruct

33K · in $0.3 · out $0.3

Qwen2.5 72B Instruct Turbo

Oct 2024

Qwen chat model. https://huggingface.co/Qwen/Qwen2.5-72B-Instruct

131K · in $1.2 · out $1.2

Meta Llama 3.1 70B Instruct Turbo

Jul 2024

Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct

131K · in $0.88 · out $0.88

Qwen 2 Instruct (1.5B)

Jun 2024

Qwen chat model. https://huggingface.co/Qwen/Qwen2-72B-Instruct

33K · in $0.02 · out $0.02

Mistral (7B) Instruct v0.3

May 2024

mistralai chat model. https://huggingface.co/api/models/mistralai/Mistral-7B-Instruct-v0.3

33K · in $0.2 · out $0.2

Meta Llama 3 8B Instruct Reference

Apr 2024

Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct

8K · in $0.2 · out $0.2

Meta Llama 3 8B Instruct

Apr 2024

Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct

8K · in $0.2 · out $0.2

Gemma-2 Instruct (27B)

Feb 2024

Google chat model. https://huggingface.co/google/gemma-2b-it

8K · in $0.8 · out $0.8

DeepSeek Coder 33B Instruct

Feb 2024

DeepSeek chat model. https://huggingface.co/api/models/deepseek-ai/deepseek-coder-33b-instruct

16K · in $0.8 · out $0.8

Nous Hermes 2 Mixtral 8X7B Dpo

Jan 2024

Nousresearch chat model. https://huggingface.co/api/models/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO

33K · in $0.6 · out $0.6

Mixtral-8x7B Instruct v0.1

Dec 2023

mistralai chat model. https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1

33K · in $0.6 · out $0.6

Mistral (7B) Instruct v0.1

Sep 2023

mistralai chat model. https://huggingface.co/api/models/mistralai/Mistral-7B-Instruct-v0.1

33K · in $0.2 · out $0.2

DeepSeek V4 Pro

-

DeepSeek chat model.

512K · in $1.74 · out $3.48

GLM 5.2

-

Zai Org chat model. https://huggingface.co/api/models/togethercomputer/GLM-5.2-FP4-0617-skipedge

262K · in $1.4 · out $4.4

MiniMax M3

-

MiniMaxAI chat model.

524K · in $0.3 · out $1.2

Kimi K2.7 Code

-

Moonshot AI chat model. https://huggingface.co/api/models/togethercomputer/Kimi-K2.7-Code-FP4

262K · in $0.95 · out $4

Kimi K2.6 Fp4

-

Moonshot AI chat model. https://huggingface.co/api/models/togethercomputer/Kimi-K2.6-FP4

262K · in $1.2 · out $4.5

Qwen3 Coder Next Fp8

-

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Coder-Next-FP8

262K · in $0.5 · out $1.2

Cogito v2.1 671B

-

Deepcogito chat model. https://huggingface.co/api/models/togethercomputer/cogito-671b-v2.1-exp-chkp-2

164K · in $1.25 · out $1.25

Qwen3 Next 80B A3b Instruct

-

Qwen chat model. https://huggingface.co/api/models/Qwen/Qwen3-Next-80B-A3B-Instruct

262K · in $0.15 · out $1.5

LFM2-24B-A2B

-

Togethercomputer chat model.

33K · in $0.03 · out $0.12

Meta Llama 3 8B Instruct Lite

-

Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct

8K · in $0.14 · out $0.14

Qwen3.5 35B A3B Lora

-

Qwen chat model.

262K · in - · out -

Meta Llama 3 70B Instruct Turbo

-

Meta chat model. https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct

8K · in $0.88 · out $0.88

MiniMax M2.7 FP4

-

MiniMaxAI chat model. https://huggingface.co/api/models/togethercomputer/M2.5plus-fp4

197K · in $0.3 · out $1.2

Qwen3.5 9B FP8

-

Qwen chat model. https://huggingface.co/api/models/togethercomputer/Qwen3.5-9B-FP8-MLP

262K · in $0.17 · out $0.25

Qwen3-VL-8B-Instruct

-

Qwen chat model. https://huggingface.co/Qwen/Qwen3-VL-8B-Instruct

Vision
262K · in $0.18 · out $0.68

Kimi K2.5 Fp4

-

Togethercomputer chat model. https://huggingface.co/api/models/togethercomputer/Kimi-K2.5-fp4

262K · in $0.5 · out $2.8
161 models · sorted by release date · prices in USD per 1M tokens · refreshed every 30 minutes

Compare every model across vendors →

Get started in 3 steps

1. Create an API key at the Together AI console.
2. Paste it into Big-AGI's model settings.
3. Start chatting, or Beam it against other models and fuse the answers.

Running Together AI in Big-AGI

Add your Together AI key and pick from well over a hundred open models on Together's serverless infrastructure, priced per token with no subscription. Big-AGI adds no markup and no intermediary: the billing relationship runs directly between you and Together AI.

  • Your key, your billing. Usage is billed by Together AI to your account.
  • Fully dynamic catalog. Pricing, context length, and model labels all come straight from Together's live API, so new models (Llama, DeepSeek, Qwen, and Kimi included) show up shortly after Together ships them, with no hand-maintained list required.
  • Vision figured out for you. Together's API doesn't publish which models accept images, so Big-AGI infers vision support from known multimodal families and naming, and those models correctly show an image-upload option.
  • Serious serving speed. Together runs its own optimized inference stack, so open models answer at closed-model latencies.
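
The vision inference described in the list above could look something like the sketch below. It is only an illustration: the name patterns and the function `looks_like_vision_model` are hypothetical and are not taken from Big-AGI's source, whose actual rules may differ.

```python
import re

# Hypothetical name patterns (assumptions for illustration only).
_VISION_PATTERNS = (
    re.compile(r"vision", re.IGNORECASE),                      # e.g. "llama-3.2-11b-vision-instruct"
    re.compile(r"(^|[-_/\s])vl([-_/\s(]|$)", re.IGNORECASE),   # e.g. "Qwen2.5-VL (72B) Instruct"
    re.compile(r"llama[-_\s]?4", re.IGNORECASE),               # e.g. "Llama 4 Scout Instruct"
)


def looks_like_vision_model(model_name: str) -> bool:
    """Guess from the model name alone whether a model accepts images."""
    return any(pattern.search(model_name) for pattern in _VISION_PATTERNS)


for name in ("Qwen3-VL-8B-Instruct", "Qwen2.5 7B Instruct"):
    print(name, looks_like_vision_model(name))
```

A name-based heuristic like this can misclassify models whose names do not follow a family's usual pattern, so treat its answer as a best guess.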

Why Big-AGI instead of the playground?

Together's playground is a quick way to try one model. Big-AGI turns your key into a full workspace: persistent chats, personas, and attachments, layered over the raw API. Run a Together model in Beam next to Claude, GPT, and Gemini, something the playground was never built to do, while the parameters, the key, and the chats stay under your control.

Your keys and your data

With Direct Connection turned on, the browser talks to Together AI directly and skips the Big-AGI server, provided your key is stored client-side and Together allows it. Your keys stay in your browser. Chats are stored locally first, and sync only if you turn it on. The AI Inspector shows the exact request, the token counts, and a cost estimate.
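
Because catalog prices are quoted in USD per 1M tokens, a cost estimate of this kind comes down to simple arithmetic. The sketch below shows one way to compute it; the function name `estimate_cost_usd` and the token counts are made up for illustration, and this is not necessarily how the AI Inspector calculates its figure.

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the USD cost of one request from per-1M-token prices."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Llama 4 Scout Instruct is listed at $0.18 in and $0.59 out per 1M tokens.
# The token counts here are hypothetical.
cost = estimate_cost_usd(input_tokens=12_000, output_tokens=1_500,
                         input_price_per_m=0.18, output_price_per_m=0.59)
print(f"${cost:.6f}")  # prints "$0.003045"
```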

Together AI in Beam

Fan a prompt out across a few of Together's open models, or set one against Claude, GPT, and Gemini. Fusions then combine, cross-check, and synthesize the parallel answers instead of just picking the best one. Parallel runs use more tokens than a single chat.
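
The fan-out step can be pictured as sending the same prompt to several models at once and collecting the answers by model name. The sketch below uses stand-in callables instead of real API clients; `fan_out` and the fake models are hypothetical names, and this is not Big-AGI's Beam implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def fan_out(prompt: str, models: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Send one prompt to every model in parallel and return the answers by name."""
    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        futures = {name: pool.submit(call, prompt) for name, call in models.items()}
        return {name: future.result() for name, future in futures.items()}


# Stand-in callables; a real setup would wrap API clients here.
fake_models = {
    "model-a": lambda p: f"A says: {p.upper()}",
    "model-b": lambda p: f"B says: {p[::-1]}",
}
answers = fan_out("hello", fake_models)
print(answers)
```

Every model receives the full prompt, which is why input token usage grows with the number of models in the run.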

Bring your Together AI key. Keep control.

Your key, your data, your choice of model. Big-AGI is open source and self-hostable, so you can check exactly how Together AI is called.

© 2026 Token Fabrics · Built with passion in San Diego