
LLM API

OpenAI-compatible API for AI language models. Reasoning-capable models may return extended metadata, including inference traces, in model_output_metadata.reasoning. The request format is the same for all models.
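The page names the field model_output_metadata.reasoning but does not say where in the response it sits. As a minimal sketch under that uncertainty, the helper below searches the whole decoded response for it instead of assuming a location. The function name find_reasoning and the sample response are illustrative assumptions, not part of the documented API.

```python
from typing import Any, Optional


def find_reasoning(payload: Any) -> Optional[Any]:
    """Depth-first search for model_output_metadata.reasoning in decoded JSON."""
    if isinstance(payload, dict):
        meta = payload.get("model_output_metadata")
        if isinstance(meta, dict) and "reasoning" in meta:
            return meta["reasoning"]
        children = list(payload.values())
    elif isinstance(payload, list):
        children = payload
    else:
        return None  # scalars cannot contain the field
    for child in children:
        found = find_reasoning(child)
        if found is not None:
            return found
    return None


# Hypothetical response fragment, used only to exercise the helper.
sample = {
    "choices": [{"message": {"role": "assistant", "content": "4"}}],
    "model_output_metadata": {"reasoning": "2 + 2 = 4"},
}
print(find_reasoning(sample))
```

Because the search returns None when the field is absent, the same call can be used for models that do not emit reasoning traces.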

Endpoints

Method  Path                       Description
GET     /v1/llm/models             List models
GET     /v1/models                 List models
POST    /v1/llm/chat               Chat completion
POST    /v1/llm/chat/completions   Chat completion
POST    /v1/chat/completions       Chat completion

GET /v1/llm/models

List models

List available models (WAYSCloud endpoint)

Response:

Field   Type    Description
object  string  Values: list
data    array

Example:

bash
curl https://api.wayscloud.services/v1/llm/models \
  -H "X-API-Key: YOUR_API_KEY"

GET /v1/models

List models

List available models (OpenAI-compatible endpoint)

Returns a list of all available LLM models.

Response Example:

json
{
  "object": "list",
  "data": [
    {"id": "mixtral-8x7b", "object": "model", "owned_by": "wayscloud"},
    {"id": "qwen3-80b-instruct", "object": "model", "owned_by": "wayscloud"}
  ]
}

Response:

Field   Type    Description
object  string  Values: list
data    array

Example:

bash
curl https://api.wayscloud.services/v1/models \
  -H "X-API-Key: YOUR_API_KEY"
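The same call can be sketched in Python with only the standard library. The snippet builds the request without sending it and extracts model ids from a body shaped like the response example above. The helper names build_models_request and model_ids are illustrative assumptions.

```python
import json
import urllib.request
from typing import List

BASE_URL = "https://api.wayscloud.services"


def build_models_request(api_key: str, path: str = "/v1/models") -> urllib.request.Request:
    """Prepare (but do not send) a GET request for the model list."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={"X-API-Key": api_key},
        method="GET",
    )


def model_ids(body: str) -> List[str]:
    """Return the id of every entry in a list-models response body."""
    doc = json.loads(body)
    if doc.get("object") != "list":
        raise ValueError("expected a response with object == 'list'")
    return [entry["id"] for entry in doc.get("data", [])]


# Body copied from the response example in this section.
sample_body = """
{
  "object": "list",
  "data": [
    {"id": "mixtral-8x7b", "object": "model", "owned_by": "wayscloud"},
    {"id": "qwen3-80b-instruct", "object": "model", "owned_by": "wayscloud"}
  ]
}
"""
print(model_ids(sample_body))

# To call the live endpoint, pass the prepared request to urllib.request.urlopen:
# with urllib.request.urlopen(build_models_request("YOUR_API_KEY")) as resp:
#     print(model_ids(resp.read().decode("utf-8")))
```

Passing path="/v1/llm/models" targets the WAYSCloud variant of the endpoint, which the page documents with the same response fields.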

POST /v1/llm/chat

Chat completion

WAYSCloud LLM chat completion endpoint

Supports both streaming and non-streaming responses.

Request Body:

Field        Type     Description
model        string   Required. Model alias (e.g., 'mixtral-8x7b')
messages     array    Required.
stream       boolean  Enable SSE streaming
temperature  object
max_tokens   object
top_p        object
tools        object   Tool definitions for function calling. Accepted for OpenAI compatibility, not yet implemented.
tool_choice  object   Controls tool selection. Accepted for OpenAI compatibility, not yet implemented.
agent_id     object   Agent identifier for AI agents. Stored for logging only.
region       object   Preferred datacenter region for inference. Currently only 'oslo' is available.

Response:

Example:

bash
curl -X POST https://api.wayscloud.services/v1/llm/chat \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{...}'
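As a hedged sketch of how a request body for this endpoint might be assembled, the Python helper below enforces the two required fields from the table, adds optional fields only when they are set, and prepares the POST request without sending it. The function name build_chat_request and the message content are illustrative assumptions. The payload values are borrowed from the request example shown elsewhere on this page.

```python
import json
import urllib.request
from typing import Any, Dict, List

BASE_URL = "https://api.wayscloud.services"


def build_chat_request(
    api_key: str,
    model: str,
    messages: List[Dict[str, str]],
    path: str = "/v1/llm/chat",
    **options: Any,
) -> urllib.request.Request:
    """Prepare (but do not send) a chat completion request."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")
    payload: Dict[str, Any] = {"model": model, "messages": messages}
    # Optional fields (stream, temperature, max_tokens, top_p, ...) are
    # included only when the caller actually set them.
    payload.update({k: v for k, v in options.items() if v is not None})
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request(
    "YOUR_API_KEY",
    "mixtral-8x7b",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=100,
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To send it: urllib.request.urlopen(req) returns the HTTP response object.
```

The path argument lets the same helper target /v1/llm/chat/completions or /v1/chat/completions, which accept the same request body according to this page.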

POST /v1/llm/chat/completions

Chat completion

WAYSCloud LLM chat completion endpoint (with /completions suffix)

Alias for /v1/llm/chat for nginx compatibility. Supports both streaming and non-streaming responses.

Request Body:

Field        Type     Description
model        string   Required. Model alias (e.g., 'mixtral-8x7b')
messages     array    Required.
stream       boolean  Enable SSE streaming
temperature  object
max_tokens   object
top_p        object
tools        object   Tool definitions for function calling. Accepted for OpenAI compatibility, not yet implemented.
tool_choice  object   Controls tool selection. Accepted for OpenAI compatibility, not yet implemented.
agent_id     object   Agent identifier for AI agents. Stored for logging only.
region       object   Preferred datacenter region for inference. Currently only 'oslo' is available.

Response:

Example:

bash
curl -X POST https://api.wayscloud.services/v1/llm/chat/completions \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{...}'

POST /v1/chat/completions

Chat completion

OpenAI-compatible chat completion endpoint

Drop-in replacement for OpenAI's /v1/chat/completions endpoint. Compatible with OpenAI SDK and all OpenAI-compatible clients.

Features:

  • Non-streaming and streaming (SSE) responses
  • Temperature and max_tokens control
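  • Agents framework fields (tools, agent_id, tool_choice) are accepted; tools and tool_choice are not yet implemented, and agent_id is stored for logging only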
  • Automatic token counting and billing

Request Example:

json
{
  "model": "mixtral-8x7b",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 100
}
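This page does not document the response body. Since the endpoint is described as a drop-in replacement for OpenAI's, the sketch below assumes the OpenAI chat completion shape (choices[0].message.content plus a usage object). That shape, the helper name read_completion, and the sample body are all assumptions to verify against a real response.

```python
import json
from typing import Any, Dict, Tuple


def read_completion(body: str) -> Tuple[str, Dict[str, Any]]:
    """Return (assistant text, usage dict) from an OpenAI-style response body."""
    doc = json.loads(body)
    choices = doc.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    message = choices[0].get("message") or {}
    # usage is treated as optional so a missing block does not raise.
    return message.get("content") or "", doc.get("usage") or {}


# Hypothetical response body in the assumed OpenAI-style shape.
sample_body = json.dumps(
    {
        "object": "chat.completion",
        "model": "mixtral-8x7b",
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": "Hello! How can I help?"},
                "finish_reason": "stop",
            }
        ],
        "usage": {"prompt_tokens": 18, "completion_tokens": 7, "total_tokens": 25},
    }
)

text, usage = read_completion(sample_body)
print(text)
print(usage.get("total_tokens"))
```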

Streaming Example:

json
{
  "model": "qwen3-80b-instruct",
  "messages": [{"role": "user", "content": "Write a story"}],
  "stream": true,
  "max_tokens": 500
}
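With "stream": true the response arrives as server-sent events. The page does not show the event format, so the sketch below assumes the OpenAI convention: each event is a "data: " line carrying a JSON chunk with choices[].delta.content, and the stream ends with "data: [DONE]". The function name iter_stream_text and the sample lines are illustrative assumptions.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (assumed format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            piece = (choice.get("delta") or {}).get("content")
            if piece:
                yield piece


# Hypothetical stream, used only to exercise the parser.
sample_lines = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Once"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": " upon a time"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_stream_text(sample_lines)))
```

With a live connection, the lines iterable would come from the HTTP response, decoded to text one line at a time.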

AI Agent Example:

json
{
  "model": "deepseek-v3",
  "messages": [{"role": "user", "content": "Help me code"}],
  "agent_id": "ephemeral",
  "tools": [],
  "tool_choice": "auto"
}

Available Models:

  • mixtral-8x7b - Fast, multilingual (good for Norwegian)
  • qwen3-80b-instruct - Balanced performance
  • qwen3-80b-thinking - Reasoning capabilities
  • deepseek-v3 - High quality, coding
  • deepseek-r1 - Advanced reasoning
  • llama-3.1-405b - Largest model

Request Body:

Field        Type     Description
model        string   Required. Model alias (e.g., 'mixtral-8x7b')
messages     array    Required.
stream       boolean  Enable SSE streaming
temperature  object
max_tokens   object
top_p        object
tools        object   Tool definitions for function calling. Accepted for OpenAI compatibility, not yet implemented.
tool_choice  object   Controls tool selection. Accepted for OpenAI compatibility, not yet implemented.
agent_id     object   Agent identifier for AI agents. Stored for logging only.
region       object   Preferred datacenter region for inference. Currently only 'oslo' is available.
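Several fields in this table are accepted without affecting inference. As a minimal client-side sketch, the checker below reports missing required fields, fields the table marks as not yet implemented or logging-only, and regions other than 'oslo'. The function name review_payload and the wording of the notes are illustrative assumptions. The API itself may respond differently to such payloads.

```python
from typing import Any, Dict, List

REQUIRED_FIELDS = ("model", "messages")
NOT_IMPLEMENTED_FIELDS = ("tools", "tool_choice")  # accepted, not yet implemented
AVAILABLE_REGIONS = {"oslo"}  # per the table, the only region currently available


def review_payload(payload: Dict[str, Any]) -> List[str]:
    """Return human-readable notes about a chat completion payload."""
    notes: List[str] = []
    for field in REQUIRED_FIELDS:
        if not payload.get(field):
            notes.append(f"missing required field: {field}")
    for field in NOT_IMPLEMENTED_FIELDS:
        if field in payload:
            notes.append(f"{field} is accepted but not yet implemented")
    if "agent_id" in payload:
        notes.append("agent_id is stored for logging only")
    region = payload.get("region")
    if region is not None and region not in AVAILABLE_REGIONS:
        notes.append(f"region {region!r} is not available; only 'oslo' is")
    return notes


# The AI Agent Example payload from this section.
agent_payload = {
    "model": "deepseek-v3",
    "messages": [{"role": "user", "content": "Help me code"}],
    "agent_id": "ephemeral",
    "tools": [],
    "tool_choice": "auto",
}
for note in review_payload(agent_payload):
    print(note)
```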

Response:

Example:

bash
curl -X POST https://api.wayscloud.services/v1/chat/completions \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{...}'

WAYSCloud AS