Pricing

Pay per token. No seats, no tiers, no minimum spend.

Model	Model ID	Context	Input /1M tokens	Output /1M tokens	Cached input /1M tokens
Qwen3 Coder 30B A3B (AWQ)	ps/qwen3-coder-30b-a3b	41K	$0.300	$1.20	—
Qwen3 30B A3B (AWQ)	ps/qwen3-30b-a3b	41K	$0.0900	$0.450	—
Qwen3.5 35B A3B (AWQ)	ps/qwen3.5-35b-a3b	41K	$0.140	$1.00	—
GLM 4.5 Air (AWQ)	ps/glm-4.5-air	131K	$0.125	$0.850	—
MiniMax M2.7 (AWQ)	ps/minimax-m2.7	197K	$0.255	$1.00	—
GPT-OSS 120B (MXFP4)	ps/gpt-oss-120b	66K	$0.0390	$0.180	—

Prices are in USD. The cached rate applies when prompt caching is active. A dash (—) means caching is not available for that model.

Billing details

How is billing calculated? You are billed in USD per request, based on the number of tokens in the prompt and the completion. There are no per-seat charges and no platform fee. An optional $4/month subscription is available for higher message limits.
Is there a monthly minimum? No. You only pay for what you use. Accounts with a zero balance can still make requests as long as credits remain.
Do credits expire? No. Credits never expire. Any balance you add to your account stays there until you use it.
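
To make the per-request calculation concrete, here is a minimal Python sketch that estimates the cost of one request from the rates in the table above. The function name `estimate_cost` and the `PRICES` dictionary are illustrative and not part of any official SDK. The sketch assumes the `ps/...` strings are the model identifiers, that prompt tokens are charged at the input rate and completion tokens at the output rate, and that no cached rate applies (no model currently lists one). How the billing system rounds amounts is not stated here, so the result is left unrounded.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the pricing table.
# The dictionary layout itself is an assumption made for this example.
PRICES = {
    "ps/qwen3-coder-30b-a3b": (Decimal("0.300"), Decimal("1.20")),
    "ps/qwen3-30b-a3b": (Decimal("0.0900"), Decimal("0.450")),
    "ps/qwen3.5-35b-a3b": (Decimal("0.140"), Decimal("1.00")),
    "ps/glm-4.5-air": (Decimal("0.125"), Decimal("0.850")),
    "ps/minimax-m2.7": (Decimal("0.255"), Decimal("1.00")),
    "ps/gpt-oss-120b": (Decimal("0.0390"), Decimal("0.180")),
}

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper, unrounded)."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_rate, output_rate = PRICES[model]  # raises KeyError for unknown models
    total = prompt_tokens * input_rate + completion_tokens * output_rate
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # 10,000 prompt tokens and 2,000 completion tokens on GPT-OSS 120B
    cost = estimate_cost("ps/gpt-oss-120b", 10_000, 2_000)
    print(f"Estimated cost: ${cost}")
```

`Decimal` is used instead of floats so that very small per-request amounts do not pick up binary rounding error when summed across many requests.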