Use Z.ai Models in Big-AGI.

Bring your own Z.ai key and use Z.ai at its own API rates, with no markup. Keys and chats stay in your browser. Run Z.ai in parallel with other models, then compare and merge the answers.

GLM-5.2 (1M)
GLM-5.1
GLM-5V Turbo

All supported Z.ai models

| Model | Capabilities | Description | Context | Input | Output | Released |
|---|---|---|---|---|---|---|
| GLM-5.2 (1M) (NEW) | Reasoning, Tools / functions | Z.ai 1M-context flagship (744B MoE, 40B activated). Agentic coding with reasoning_effort control (high/max). 1M context, 128K output. | 1M | $1.4 | $4.4 | Jun 2026 |
| GLM-5.1 | Reasoning, Tools / functions | Z.ai flagship (744B MoE, 40B activated). Post-training upgrade over GLM-5 with stronger coding and long-horizon task autonomy. 200K context, thinking mode. | 205K | $1.4 | $4.4 | Apr 2026 |
| GLM-5V Turbo | Vision, Reasoning, Tools / functions | First multimodal GLM-5 model. Vision-based coding agent with image/video/file inputs. 200K context, 128K output, thinking mode. | 205K | $1.2 | $4 | Apr 2026 |
| GLM-5 Turbo | Reasoning, Tools / functions | Speed-optimized GLM-5 variant for agent workflows. Enhanced tool invocation and long-chain execution. 200K context, thinking mode. | 205K | $1.2 | $4 | Mar 2026 |
| GLM-5 | Reasoning, Tools / functions | Z.ai flagship foundation model (744B MoE, 40B activated). Designed for Agentic Engineering with SOTA coding and agent capabilities. 200K context, thinking mode. | 205K | $1 | $3.2 | Feb 2026 |
| GLM-OCR (Vision, OCR) | Vision | Specialized OCR model for text extraction from images and documents. | 131K | $0.03 | $0.03 | Feb 2026 |
| GLM-4.7 Flash (Free) | Reasoning, Tools / functions | Free GLM-4.7 variant. Same model as FlashX but with limited concurrency (1 concurrent request) and lower priority. | 131K | - | - | Jan 2026 |
| GLM-4.7 FlashX | Reasoning, Tools / functions | Fast GLM-4.7 variant with priority routing and higher concurrency. Same model as Flash, better infrastructure. | 131K | $0.07 | $0.4 | Jan 2026 |
| GLM-4.7 | Reasoning, Tools / functions | Latest-gen GLM model with 128K context. Thinking mode activated by default. | 131K | $0.6 | $2.2 | Dec 2025 |
| AutoGLM Phone | Vision | Mobile phone automation agent. Understands phone screens via multimodal perception and executes automated operations. | 131K | - | - | Dec 2025 |
| GLM-4.6 V | Vision, Reasoning, Tools / functions | Vision-enabled GLM-4.6 model. Supports image/video/file inputs, 32K output, hybrid thinking. | 131K | $0.3 | $0.9 | Dec 2025 |
| GLM-4.6 V Flash (Free) | Vision, Reasoning, Tools / functions | Free vision GLM-4.6. Same model as FlashX but with limited concurrency (1 concurrent request). Image/video/file inputs, 32K output. | 131K | - | - | Dec 2025 |
| GLM-4.6 V FlashX | Vision, Reasoning, Tools / functions | Fast vision GLM-4.6 with priority routing and higher concurrency. Image/video/file inputs, 32K output. | 131K | $0.04 | $0.4 | Dec 2025 |
| GLM-4.6 | Reasoning, Tools / functions | GLM-4.6 model with 128K context/output. Hybrid thinking: auto-determines whether to engage deep reasoning. | 131K | $0.6 | $2.2 | Sep 2025 |
| GLM-4.5 V | Vision, Reasoning, Tools / functions | Vision-enabled GLM-4.5 model. 96K context, 16K output, interleaved thinking. | 98K | $0.6 | $1.8 | Aug 2025 |
| GLM-4.5 Air | Reasoning, Tools / functions | Lightweight GLM-4.5 variant. Interleaved thinking. | 98K | $0.2 | $1.1 | Jul 2025 |
| GLM-4.5 | Reasoning, Tools / functions | Prior-gen GLM-4.5 model with 96K context/output. Interleaved thinking. | 98K | $0.6 | $2.2 | Jul 2025 |
| GLM-4.5 X | Reasoning, Tools / functions | Extended GLM-4.5 model. Interleaved thinking. | 98K | $2.2 | $8.9 | Jul 2025 |
| GLM-4.5 AirX | Reasoning, Tools / functions | Extended lightweight GLM-4.5 variant. Interleaved thinking. | 98K | $1.1 | $4.5 | Jul 2025 |
| GLM-4.5 Flash (Free) | Reasoning, Tools / functions | Free GLM-4.5 variant with limited concurrency. Prior-gen, superseded by GLM-4.7 Flash. | 98K | - | - | Jul 2025 |
| GLM-4 32B (0414) 128K | Tools / functions | GLM-4 32B model with 128K context, 16K output. | 131K | $0.1 | $0.1 | Apr 2025 |

21 models · sorted by release date · prices in USD per 1M tokens · refreshed every 30 minutes
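
As a rough illustration of how per-1M-token prices translate into the cost of one request, here is a minimal Python sketch. The function name `estimate_cost` and the token counts are hypothetical. The prices are copied from the listing above and may be stale, since the page refreshes them every 30 minutes. The sketch assumes cost is simply linear in input and output tokens, which may not match how Z.ai actually bills.

```python
from decimal import Decimal

# Prices copied from the listing above, in USD per 1M tokens: (input, output).
# Assumption: these are a snapshot and may no longer be current.
PRICES_PER_MILLION = {
    "GLM-5.1": (Decimal("1.4"), Decimal("4.4")),
    "GLM-5 Turbo": (Decimal("1.2"), Decimal("4")),
    "GLM-4.7": (Decimal("0.6"), Decimal("2.2")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes a purely linear price: tokens * price / 1,000,000.
    """
    in_price, out_price = PRICES_PER_MILLION[model]
    total = in_price * input_tokens + out_price * output_tokens
    return total / Decimal(1_000_000)


if __name__ == "__main__":
    # 10,000 input tokens and 2,000 output tokens on GLM-5.1.
    print(estimate_cost("GLM-5.1", 10_000, 2_000))
```

Under these assumptions, a GLM-5.1 request with 10,000 input tokens and 2,000 output tokens comes to about $0.0228.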

Running Z.ai GLM in Big-AGI

Add your Z.ai API key and run the GLM models at Z.ai's own API rates. Big-AGI adds no markup and keeps your keys and chats in your browser, not on its servers.

  • Your key, your billing. Usage is billed by Z.ai to your account. Big-AGI does not meter or charge for model usage.
  • Direct Connection. When enabled, the browser calls Z.ai directly and bypasses the Big-AGI server, provided your key is stored client-side and Z.ai allows it.
  • Strong at code and agents. GLM is tuned for coding and tool use, which makes it a capable, cost-effective pick for daily work.
  • Beam. Run GLM in parallel with Claude, GPT, and Gemini, then compare or merge the answers. Parallel runs use more tokens than a single chat.
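
To make the last point concrete, the sketch below estimates how much more a parallel run costs than a single chat. It is only an illustration under stated assumptions: `beam_cost` is a hypothetical helper, every model is assumed to receive the same prompt and produce the same number of output tokens, and any extra tokens spent on a merge step are not modeled. Only GLM prices from this page are used; prices for other vendors are not listed here.

```python
from decimal import Decimal

# Snapshot of prices from this page, in USD per 1M tokens: (input, output).
PRICES_PER_MILLION = {
    "GLM-5.1": (Decimal("1.4"), Decimal("4.4")),
    "GLM-5 Turbo": (Decimal("1.2"), Decimal("4")),
    "GLM-4.7": (Decimal("0.6"), Decimal("2.2")),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Linear cost estimate for one request to one model (assumption)."""
    in_price, out_price = PRICES_PER_MILLION[model]
    total = in_price * input_tokens + out_price * output_tokens
    return total / Decimal(1_000_000)


def beam_cost(models: list[str], input_tokens: int, output_tokens: int) -> Decimal:
    """Sum the cost of sending the same prompt to every model in parallel.

    Hypothetical helper: assumes identical token counts per model and
    ignores any tokens used to merge the answers afterwards.
    """
    return sum(
        (request_cost(m, input_tokens, output_tokens) for m in models),
        Decimal(0),
    )


if __name__ == "__main__":
    single = request_cost("GLM-5.1", 10_000, 2_000)
    parallel = beam_cost(["GLM-5.1", "GLM-5 Turbo", "GLM-4.7"], 10_000, 2_000)
    print(f"single chat: ${single}, three-model run: ${parallel}")
```

With 10,000 input and 2,000 output tokens per model, the three-model run comes to roughly $0.0532 against $0.0228 for GLM-5.1 alone.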

Bring your Z.ai key. Keep control.

Your key, your data, your choice of model. Big-AGI is open source and self-hostable, so you can check exactly how Z.ai is called.

© 2026 Token Fabrics · Built with passion in San Diego