LLM API
OpenAI-compatible API for AI language models. Reasoning-capable models may return extended metadata in `model_output_metadata.reasoning` containing inference traces, without changing the request format.
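The reasoning trace, when present, sits alongside the normal response fields. A minimal sketch of reading it defensively, so the same code handles models with and without reasoning (the sample response body is illustrative, only the `model_output_metadata.reasoning` path comes from this documentation):

```python
# Hypothetical response illustrating where reasoning-capable models place
# their inference traces (per the docs: model_output_metadata.reasoning).
sample = {
    "choices": [{"message": {"role": "assistant", "content": "42"}}],
    "model_output_metadata": {"reasoning": "First, consider the question..."},
}

def extract_reasoning(response: dict):
    """Return the reasoning trace if the model emitted one, else None."""
    return response.get("model_output_metadata", {}).get("reasoning")

print(extract_reasoning(sample))
```

Using `.get()` at both levels means non-reasoning models, which omit the metadata entirely, simply yield `None`.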
Endpoints
| Method | Path | Description |
|---|---|---|
| GET | /v1/llm/models | List models |
| GET | /v1/models | List models |
| POST | /v1/llm/chat | Chat completion |
| POST | /v1/llm/chat/completions | Chat completion |
| POST | /v1/chat/completions | Chat completion |
GET /v1/llm/models
List models
List available models (WAYSCloud endpoint)
Response:
| Field | Type | Description |
|---|---|---|
| object | string | Values: `list` |
| data | array | List of model objects |
Example:
```bash
curl https://api.wayscloud.services/v1/llm/models \
  -H "X-API-Key: YOUR_API_KEY"
```

GET /v1/models
List models
List available models (OpenAI-compatible endpoint)
Returns a list of all available LLM models.
Response Example:
```json
{
  "object": "list",
  "data": [
    {"id": "mixtral-8x7b", "object": "model", "owned_by": "wayscloud"},
    {"id": "qwen3-80b-instruct", "object": "model", "owned_by": "wayscloud"}
  ]
}
```

Response:
| Field | Type | Description |
|---|---|---|
| object | string | Values: `list` |
| data | array | List of model objects |
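The response shape above is easy to work with directly; a short sketch that parses the documented example payload and collects the model IDs:

```python
import json

# Example payload in the shape documented for GET /v1/models.
raw = """{
  "object": "list",
  "data": [
    {"id": "mixtral-8x7b", "object": "model", "owned_by": "wayscloud"},
    {"id": "qwen3-80b-instruct", "object": "model", "owned_by": "wayscloud"}
  ]
}"""

payload = json.loads(raw)
model_ids = [m["id"] for m in payload["data"]]
print(model_ids)  # ['mixtral-8x7b', 'qwen3-80b-instruct']
```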
Example:
```bash
curl https://api.wayscloud.services/v1/models \
  -H "X-API-Key: YOUR_API_KEY"
```

POST /v1/llm/chat
Chat completion
WAYSCloud LLM chat completion endpoint
Supports both streaming and non-streaming responses.
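When `stream` is enabled, the response arrives as Server-Sent Events. A minimal consumer sketch, assuming the chunks follow the OpenAI `choices[].delta.content` convention this API mirrors (the canned lines below are illustrative, not captured from the server):

```python
import json

# Canned SSE lines mimicking OpenAI-style streaming chunks; the exact chunk
# shape from this server is an assumption based on OpenAI compatibility.
sse_lines = [
    b'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    b'',
    b'data: {"choices":[{"delta":{"content":"lo"}}]}',
    b'',
    b'data: [DONE]',
]

def collect(lines):
    """Accumulate delta content from 'data:' SSE lines until [DONE]."""
    parts = []
    for line in lines:
        line = line.decode().strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines between events
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

print(collect(sse_lines))  # Hello
```

In a real client, the same loop would iterate over the HTTP response body line by line instead of a list.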
Request Body:
| Field | Type | Description |
|---|---|---|
| model | string | Required. Model alias (e.g., `mixtral-8x7b`) |
| messages | array | Required. Conversation messages (objects with `role` and `content`) |
| stream | boolean | Enable SSE streaming |
| temperature | number | Sampling temperature |
| max_tokens | integer | Maximum number of tokens to generate |
| top_p | number | Nucleus sampling parameter |
| tools | array | Tool definitions for function calling. Accepted for OpenAI compatibility, not yet implemented. |
| tool_choice | string or object | Controls tool selection. Accepted for OpenAI compatibility, not yet implemented. |
| agent_id | string | Agent identifier for AI agents. Stored for logging only. |
| region | string | Preferred datacenter region for inference. Currently only 'oslo' is available. |
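The request above can be assembled with nothing but the standard library. A sketch that builds (but does not send) the POST, using only the endpoint, header, and body fields documented here, with a placeholder key:

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder, substitute a real key

def build_chat_request(model, messages, **params):
    """Assemble a POST /v1/llm/chat request without sending it."""
    body = {"model": model, "messages": messages, **params}
    return urllib.request.Request(
        "https://api.wayscloud.services/v1/llm/chat",
        data=json.dumps(body).encode(),
        headers={"X-API-Key": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request(
    "mixtral-8x7b",
    [{"role": "user", "content": "Hei!"}],
    temperature=0.7,
    max_tokens=100,
)
# To actually send it: urllib.request.urlopen(req) -- omitted here.
```

Optional parameters pass through `**params`, so adding `stream`, `top_p`, or `region` needs no changes to the helper.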
Response:
Example:
```bash
curl -X POST https://api.wayscloud.services/v1/llm/chat \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{...}'
```

POST /v1/llm/chat/completions
Chat completion
WAYSCloud LLM chat completion endpoint (with /completions suffix)
Alias for /v1/llm/chat for nginx compatibility. Supports both streaming and non-streaming responses.
Request Body:
| Field | Type | Description |
|---|---|---|
| model | string | Required. Model alias (e.g., `mixtral-8x7b`) |
| messages | array | Required. Conversation messages (objects with `role` and `content`) |
| stream | boolean | Enable SSE streaming |
| temperature | number | Sampling temperature |
| max_tokens | integer | Maximum number of tokens to generate |
| top_p | number | Nucleus sampling parameter |
| tools | array | Tool definitions for function calling. Accepted for OpenAI compatibility, not yet implemented. |
| tool_choice | string or object | Controls tool selection. Accepted for OpenAI compatibility, not yet implemented. |
| agent_id | string | Agent identifier for AI agents. Stored for logging only. |
| region | string | Preferred datacenter region for inference. Currently only 'oslo' is available. |
Response:
Example:
```bash
curl -X POST https://api.wayscloud.services/v1/llm/chat/completions \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{...}'
```

POST /v1/chat/completions
Chat completion
OpenAI-compatible chat completion endpoint
Drop-in replacement for OpenAI's /v1/chat/completions endpoint. Compatible with OpenAI SDK and all OpenAI-compatible clients.
Features:
- Non-streaming and streaming (SSE) responses
- Temperature and max_tokens control
- Agents framework support (tools, agent_id, tool_choice)
- Automatic token counting and billing
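Token counting feeds billing via an OpenAI-style `usage` block on non-streaming responses; the field names below follow the OpenAI convention and are an assumption for this API, not confirmed by the documentation above:

```python
# Hypothetical non-streaming response with an OpenAI-style usage block.
response = {
    "choices": [{"message": {"role": "assistant", "content": "Hi!"}}],
    "usage": {"prompt_tokens": 12, "completion_tokens": 3, "total_tokens": 15},
}

usage = response["usage"]
# total_tokens is conventionally the sum of prompt and completion tokens.
billed = usage["total_tokens"]
print(f"billed tokens: {billed}")  # billed tokens: 15
```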
Request Example:
```json
{
  "model": "mixtral-8x7b",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 100
}
```

Streaming Example:
```json
{
  "model": "qwen3-80b-instruct",
  "messages": [{"role": "user", "content": "Write a story"}],
  "stream": true,
  "max_tokens": 500
}
```

AI Agent Example:
```json
{
  "model": "deepseek-v3",
  "messages": [{"role": "user", "content": "Help me code"}],
  "agent_id": "ephemeral",
  "tools": [],
  "tool_choice": "auto"
}
```

Available Models:
- `mixtral-8x7b` - Fast, multilingual (good for Norwegian)
- `qwen3-80b-instruct` - Balanced performance
- `qwen3-80b-thinking` - Reasoning capabilities
- `deepseek-v3` - High quality, coding
- `deepseek-r1` - Advanced reasoning
- `llama-3.1-405b` - Largest model
Request Body:
| Field | Type | Description |
|---|---|---|
| model | string | Required. Model alias (e.g., `mixtral-8x7b`) |
| messages | array | Required. Conversation messages (objects with `role` and `content`) |
| stream | boolean | Enable SSE streaming |
| temperature | number | Sampling temperature |
| max_tokens | integer | Maximum number of tokens to generate |
| top_p | number | Nucleus sampling parameter |
| tools | array | Tool definitions for function calling. Accepted for OpenAI compatibility, not yet implemented. |
| tool_choice | string or object | Controls tool selection. Accepted for OpenAI compatibility, not yet implemented. |
| agent_id | string | Agent identifier for AI agents. Stored for logging only. |
| region | string | Preferred datacenter region for inference. Currently only 'oslo' is available. |
Response:
Example:
```bash
curl -X POST https://api.wayscloud.services/v1/chat/completions \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{...}'
```