Baseten Model APIs
OpenAI-compatible inference API for high-performance LLMs. Drop-in replacement for OpenAI SDK - just change base_url and api_key. **Supported Models:** | Model | Slug | Context | |-------|------|--------| | DeepSeek V3 0324 | `deepseek-ai/DeepSeek-V3-0324` | 164k | | DeepSeek V3.1 | `deepseek-ai/DeepSeek-V3.1` | 164k | | GLM 4.6 (Zhipu) | `zai-org/GLM-4.6` | 200k | | GLM 4.7 (Zhipu) | `zai-org/GLM-4.7` | 200k | | Kimi K2 0905 | `moonshotai/Kimi-K2-Instruct-0905` | 128k | | Kimi K2 Thinking | `moonshotai/Kimi-K2-Thinking` | 262k | | Kimi K2.5 | `moonshotai/Kimi-K2.5` | 262k | | OpenAI GPT OSS 120B | `openai/gpt-oss-120b` | 128k | **Features:** Chat completions, streaming, tool calling, structured outputs, reasoning modes. **Pricing:** ~$0.60/1M tokens (varies by model)
Baseten Model APIs
OpenAI-compatible inference API for high-performance LLMs. Drop-in replacement for OpenAI SDK - just change base_url and api_key.
Supported Models:
| Model | Slug | Context |
|---|---|---|
| DeepSeek V3 0324 | deepseek-ai/DeepSeek-V3-0324 | 164k |
| DeepSeek V3.1 | deepseek-ai/DeepSeek-V3.1 | 164k |
| GLM 4.6 (Zhipu) | zai-org/GLM-4.6 | 200k |
| GLM 4.7 (Zhipu) | zai-org/GLM-4.7 | 200k |
| Kimi K2 0905 | moonshotai/Kimi-K2-Instruct-0905 | 128k |
| Kimi K2 Thinking | moonshotai/Kimi-K2-Thinking | 262k |
| Kimi K2.5 | moonshotai/Kimi-K2.5 | 262k |
| OpenAI GPT OSS 120B | openai/gpt-oss-120b | 128k |
Features: Chat completions, streaming, tool calling, structured outputs, reasoning modes.
Pricing: ~$0.60/1M tokens (varies by model)
Available Tools
- baseten_chat_completions: Create a chat completion using OpenAI-compatible API.
Supported Models:
deepseek-ai/DeepSeek-V3-0324- DeepSeek V3 0324 (164k context) 🧠deepseek-ai/DeepSeek-V3.1- DeepSeek V3.1 (164k context) 🧠zai-org/GLM-4.6- GLM 4.6 (200k context) 🧠zai-org/GLM-4.7- GLM 4.7 (200k context) 🧠moonshotai/Kimi-K2-Instruct-0905- Kimi K2 0905 (128k context)moonshotai/Kimi-K2-Thinking- Kimi K2 Thinking (262k context) 🧠 always-onmoonshotai/Kimi-K2.5- Kimi K2.5 (262k context)openai/gpt-oss-120b- OpenAI GPT OSS 120B (128k context)
🧠 = Reasoning model. Use reasoning_effort param (low/medium/high) to control thinking depth. Response includes reasoning_content field with chain-of-thought.
Supports streaming, tool calling, structured outputs. — $0.01/call
Quick Start
{
"mcpServers": {
"baseten": {
"url": "https://baseten.mcp.xpay.sh/mcp?key=YOUR_XPAY_KEY"
}
}
}
Pricing
Pay per tool call from your XPay wallet. No subscriptions, no minimums.
Get an API key at xpay.tools.
Tools: 1 Category: General
When to Use
Use Baseten Model APIs tools when you need to openai-compatible inference api for high-performance llms. drop-in replacement for openai sdk - just change base_url and api_key.
supported models:
| model | slug | context |
|---|---|---|
| deepseek v3 0324 | deepseek-ai/deepseek-v3-0324 | 164k |
| deepseek v3.1 | deepseek-ai/deepseek-v3.1 | 164k |
| glm 4.6 (zhipu) | zai-org/glm-4.6 | 200k |
| glm 4.7 (zhipu) | zai-org/glm-4.7 | 200k |
| kimi k2 0905 | moonshotai/kimi-k2-instruct-0905 | 128k |
| kimi k2 thinking | moonshotai/kimi-k2-thinking | 262k |
| kimi k2.5 | moonshotai/kimi-k2.5 | 262k |
| openai gpt oss 120b | openai/gpt-oss-120b | 128k |
features: chat completions, streaming, tool calling, structured outputs, reasoning modes.
pricing: ~$0.60/1m tokens (varies by model). All tools are available through xpay✦'s single MCP connection.
MCP Connection
{
"mcpServers": {
"xpay": {
"url": "https://mcp.xpay.sh/mcp?key=YOUR_API_KEY"
}
}
}
For Claude Code:
claude mcp add --transport http xpay "https://mcp.xpay.sh/mcp?key=YOUR_API_KEY"
Available Tools
- Baseten Chat Completions — Create a chat completion using OpenAI-compatible API.
Supported Models:
deepseek-ai/DeepSeek-V3-0324- DeepSeek V3 0324 (164k context) 🧠deepseek-ai/DeepSeek-V3.1- DeepSeek V3.1 (164k context) 🧠zai-org/GLM-4.6- GLM 4.6 (200k context) 🧠zai-org/GLM-4.7- GLM 4.7 (200k context) 🧠moonshotai/Kimi-K2-Instruct-0905- Kimi K2 0905 (128k context)moonshotai/Kimi-K2-Thinking- Kimi K2 Thinking (262k context) 🧠 always-onmoonshotai/Kimi-K2.5- Kimi K2.5 (262k context)openai/gpt-oss-120b- OpenAI GPT OSS 120B (128k context)
🧠 = Reasoning model. Use reasoning_effort param (low/medium/high) to control thinking depth. Response includes reasoning_content field with chain-of-thought.
Supports streaming, tool calling, structured outputs. — $0.01/call — SKILL.md
How to Execute
xpay_discover— Search for tools:xpay_discover("baseten")xpay_details— Get input schema:xpay_details("baseten/TOOL_NAME")xpay_run— Execute:xpay_run("baseten/TOOL_NAME", { ...inputs })xpay_balance— Check credits
Links
- Provider page: https://xpay.tools/baseten/
- All providers: https://xpay.tools/explore
- Docs: https://docs.xpay.sh
Tools (1)
Install Skill
Details
Tools
1
Category
General
Total calls
0

