AIX dispatch server - API features comparison

This is updated as of 2024-07-09, and includes the latest features and capabilities of the three major AI APIs: Anthropic, Gemini, and OpenAI. The comparison covers a wide range of features, including function calling, vision, system instructions, etc.

Feature Category	Specific Feature	Anthropic	Gemini	OpenAI
Message Structure
	Role types	user, assistant	user, model	user, assistant, system, tool
	Named participants	No	No	Yes
	Content array	Yes	Yes	Yes
Content Types and Multimodal Support
	Text generation	Yes	Yes	Yes
	Image understanding	Yes	Yes	Yes
	Audio processing	No	Yes	No
	Video processing	No	Yes	No
Image Handling
	Supported formats	JPEG, PNG, GIF, WebP	JPEG, PNG, WebP, HEIC, HEIF	PNG, JPEG, WebP, non-animated GIF
	Max image size	5MB per image	(20MB per prompt)	20MB per image
	Image detail level	N/A	N/A	Low, high, auto
	Image resolution	max: 1568x1568	min: 768x768, max: 3072x3072	min: 512x512, max: 2048 x 2048
	Token calculation for images	(width * height)/750; max 1,600	258 tokens	85 + 170 * {patches}
	Image retention	Deleted after processing	Not specified	Deleted after processing
Audio and Video Handling
	Audio formats	N/A	WAV, MP3, AIFF, AAC, OGG, FLAC	N/A
	Video formats	N/A	MP4, MPEG, MOV, AVI, MPG, WebM, WMV, 3GPP	N/A
System Instructions and Tool Use
	System instructions	Yes (array of text blocks)	Yes (parts array)	Yes (as system message)
Function/Tool Handling
	Parallel tool calls	No	No	Yes
	Tool Declaration	Defined in `tools` array	Defined in `tools` array	Defined in `tools` array
	FC name restrictions	Yes	Yes (max 63 chars)	Yes (max 64 chars)
	FC declaration	name, description, input_schema	name, description, parameters	name, description, parameters
	FC options structure	JSON Schema for input	Object with properties	JSON Schema for parameters
	FC Force invocation	Via `tool_choice` parameter	Via `toolConfig` parameter	Via `tool_choice` parameter
	FC Model invocation	Model generates a `tool_use` block with predicted parameters	Generates a `functionCall` part with predicted parameters	Generates a message.`tool_calls` item with predicted arguments
	FC Execution	Client-side	Client-side	Client-side
	FC Result injection	Client appends a `user` message with a `tool_result` content block	Client appends a `function` message with `functionResponse` part	Client sends a new `tool` message with `tool_call_id` and `content`
	Built-in Code execution	No	Yes	No
	Tool use with vision	Yes	Yes	Yes
Generation Configuration
	temperature	Yes	Yes	Yes
	max_tokens	Yes	Yes	Yes
	stop_sequences	Yes	Yes	Yes
	top_k	Yes	Yes	No
	top_p	Yes	Yes	Yes
	seed	No	No	Yes
	Multiple candidates	No	No	Yes (with 'n' parameter, breaks streaming?)
Streaming and Response Structure
	Streaming support	Yes	Yes	Yes
	Streaming initiation	stream=true	streamGenerateContent path	stream=true
	Streaming event types	Multiple specific types	Not specified	Single delta type
	Response container	content (array)	candidates (array)	choices (array)
Usage Metrics and Error Handling
	Token counts	Yes	Yes	Yes
	Detailed token breakdown	input, output	prompt, cached, candidates, total	prompt, completion, total
	Usage in stream	No	No	Optional
	Error handling in response	Not specified	Not specified	Yes (undocumented)
	Error handling in stream	Not specified	Not specified	Yes (undocumented)
Advanced Features
	JSON mode	Partial (via structured prompts)	Yes (responseMimeType)	Yes
	Output consistency techniques	Yes (multiple methods)	Not specified	Not specified
	Logprobs	No	No	Yes (disabled in schema)
	System fingerprint	No	No	Yes
	Semantic caching	No	Yes	No
	Assistant prefill	Yes	No	No
	Preferred formatting	XML tags, JSON	Not specified	Markdown
Safety and Compliance
	Safety settings in request	Stop sequences	Detailed category-based	Moderation API
	Safety feedback in response	Yes	Yes	Not specified