Baseten Chat Completions
baseten_chat_completionsCreate a chat completion using OpenAI-compatible API.
How it works ↓Pricing
Per call
$0.01
Model
flat
Pay only for what you use. No subscriptions.
Inputs
top_logprobs
numberreasoning_effort
stringlogit_bias
objectseed
numberbad
stringskip_special_tokens
booleandocuments
stringpresence_penalty
numberecho
booleantop_p_min
numberearly_stopping
booleantools
stringlogprobs
booleantop_p
numberfrequency_penalty
numberresponse_format
objecttruncate_prompt_tokens
numberbest_of
numberstream
booleantop_k
numberdisaggregated_params
objecttemperature
numbertool_choice
stringmodel *
stringignore_eos
booleanchat_template
stringmax_tokens
numberadd_generation_prompt
booleann
numbermin_tokens
numbermin_p
numberspaces_between_special_tokens
booleanchat_template_args
objectstop
stringparallel_tool_calls
booleaninclude_stop_str_in_output
booleanmessages *
stringbad_token_ids
stringstream_options
objectuser
stringrepetition_penalty
numberlength_penalty
numberstop_token_ids
stringadd_special_tokens
booleanInput Parameters
Cost per run
Execution cost$0.01
About Baseten Chat Completions
Create a chat completion using OpenAI-compatible API.
Supported Models:
deepseek-ai/DeepSeek-V3-0324- DeepSeek V3 0324 (164k context) 🧠deepseek-ai/DeepSeek-V3.1- DeepSeek V3.1 (164k context) 🧠zai-org/GLM-4.6- GLM 4.6 (200k context) 🧠zai-org/GLM-4.7- GLM 4.7 (200k context) 🧠moonshotai/Kimi-K2-Instruct-0905- Kimi K2 0905 (128k context)moonshotai/Kimi-K2-Thinking- Kimi K2 Thinking (262k context) 🧠 always-onmoonshotai/Kimi-K2.5- Kimi K2.5 (262k context)openai/gpt-oss-120b- OpenAI GPT OSS 120B (128k context)
🧠 = Reasoning model. Use reasoning_effort param (low/medium/high) to control thinking depth. Response includes reasoning_content field with chain-of-thought.
Supports streaming, tool calling, structured outputs.
Frequently Asked Questions
Create a chat completion using OpenAI-compatible API. **Supported Models:** - `deepseek-ai/DeepSeek-V3-0324` - DeepSeek V3 0324 (164k context) 🧠 - `deepseek-ai/DeepSeek-V3.1` - DeepSeek V3.1 (164k context) 🧠 - `zai-org/GLM-4.6` - GLM 4.6 (200k context) 🧠 - `zai-org/GLM-4.7` - GLM 4.7 (200k context) 🧠 - `moonshotai/Kimi-K2-Instruct-0905` - Kimi K2 0905 (128k context) - `moonshotai/Kimi-K2-Thinking` - Kimi K2 Thinking (262k context) 🧠 always-on - `moonshotai/Kimi-K2.5` - Kimi K2.5 (262k context) - `openai/gpt-oss-120b` - OpenAI GPT OSS 120B (128k context) 🧠 = Reasoning model. Use `reasoning_effort` param (low/medium/high) to control thinking depth. Response includes `reasoning_content` field with chain-of-thought. Supports streaming, tool calling, structured outputs.
Baseten Chat Completions costs $0.01 per call on xpay. No subscription, no minimums. Pay only for the calls you make. New accounts get $5 in free credits.
Connect the Baseten Model APIs MCP endpoint to your client — Claude Code: claude mcp add --transport http baseten "https://baseten.mcp.xpay.sh/mcp?key=YOUR_XPAY_KEY"; Cursor/Windsurf/Cline/VS Code: same URL in mcp.json. The agent will see baseten_chat_completions as a callable tool with the input schema and run it directly. (Unified across all providers: https://mcp.xpay.sh/mcp?key=YOUR_XPAY_KEY, then xpay_run with toolPath baseten/baseten_chat_completions.)
Yes — that's exactly what xpay is for. You don't need a Baseten Model APIs account or API key. Sign up at xpay.tools (Google or email), get $5 free credit, and run Baseten Chat Completions immediately. Billing flows through your xpay balance.
Baseten Chat Completions accepts 44 input parameters: top_logprobs, reasoning_effort, logit_bias, seed, bad, skip_special_tokens…. See the input schema and runnable form on this page for details and to test live.

