Bring your own Fireworks AI key and use Fireworks AI at its own API rates, with no markup. Keys and chats stay in your browser. Run Fireworks AI in parallel with other models, then compare and merge the answers.
Glm 5p2
NEWfireworks `HF_BASE_MODEL` type.
1M
-
-
Jun 2026
Deepseek V4 Pro
fireworks `HF_BASE_MODEL` type.
1M
-
-
Apr 2026
Kimi K2p6 (Vision)
fireworks `HF_BASE_MODEL` type.
262K
-
-
Apr 2026
Glm 5p1
fireworks `HF_BASE_MODEL` type.
203K
-
-
Mar 2026
Kimi K2p5 (Vision)
fireworks `HF_BASE_MODEL` type.
262K
-
-
Jan 2026
Gpt Oss 120b
fireworks `HF_BASE_MODEL` type.
131K
-
-
Aug 2025
Glm 5p2
NEWfireworks `HF_BASE_MODEL` type.
Deepseek V4 Pro
fireworks `HF_BASE_MODEL` type.
Kimi K2p6 (Vision)
fireworks `HF_BASE_MODEL` type.
Glm 5p1
fireworks `HF_BASE_MODEL` type.
Kimi K2p5 (Vision)
fireworks `HF_BASE_MODEL` type.
Gpt Oss 120b
fireworks `HF_BASE_MODEL` type.
1
Create an API key at the Fireworks AI console.
2
Paste it into Big-AGI's model settings.
3
Start chatting, or Beam it against other models and fuse the answers.
Add your Fireworks API key over its OpenAI-compatible endpoint and reach the open-model catalog at Fireworks' own rates. Big-AGI adds no markup and no intermediary: the billing relationship runs directly between you and Fireworks.
Fireworks' playground is built for one prompt at a time. Big-AGI turns the same key into a persistent workspace: chats that stick around, personas, and file and image attachments, all layered on top of the raw API. It's also the only place a Fireworks model runs next to Claude, GPT, and Gemini in Beam, with the parameters and the key still yours to control.
Turn on Direct Connection and the browser talks to Fireworks directly, skipping the Big-AGI server, when your key is client-side and Fireworks allows it. Your keys stay in your browser. Chats are stored locally first, and sync only if you turn it on. The AI Inspector shows the exact request, the token counts, and a cost estimate, so you always know what you're billed for.
Put a Fireworks model into a Beam alongside frontier labs, or run a few open models side by side at Fireworks' speed. Fusions then combine, cross-check, and synthesize the parallel answers instead of just picking the best one. Parallel runs use more tokens than a single chat.
Your key, your data, your choice of model. Big-AGI is open source and self-hostable, so you can check exactly how Fireworks AI is called.
BIG-AGI
Resources
© 2026 Token Fabrics·Built with passion in San Diego