LLM API

OpenAI-compatible access to AI language models. Reasoning-capable models may return extended metadata, including inference traces, in `model_output_metadata.reasoning`. The request format is the same for all models.

Endpoints

Method	Path	Description
`GET`	`/v1/llm/models`	List models
`GET`	`/v1/models`	List models
`POST`	`/v1/llm/chat`	Chat completion
`POST`	`/v1/llm/chat/completions`	Chat completion
`POST`	`/v1/chat/completions`	Chat completion

GET /v1/llm/models

List models

List available models (WAYSCloud endpoint)

Response:

Field	Type	Description
`object`	`string`	Values: `list`
`data`	`array`	Available models

Example:

bash

curl https://api.wayscloud.services/v1/llm/models \
  -H "X-API-Key: YOUR_API_KEY"

GET /v1/models

List models

List available models (OpenAI-compatible endpoint)

Returns a list of all available LLM models.

Response Example:

json

{
  "object": "list",
  "data": [
    {"id": "mixtral-8x7b", "object": "model", "owned_by": "wayscloud"},
    {"id": "qwen3-80b-instruct", "object": "model", "owned_by": "wayscloud"}
  ]
}

Response:

Field	Type	Description
`object`	`string`	Values: `list`
`data`	`array`	Available models, each with `id`, `object`, and `owned_by`

Example:

bash

curl https://api.wayscloud.services/v1/models \
  -H "X-API-Key: YOUR_API_KEY"
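
The same call can be made from Python. The following is a minimal sketch using only the standard library; it assumes the response has the shape shown in the response example above. The helper names `model_ids` and `list_models` are hypothetical and not part of the API.

```python
import json
import urllib.request

BASE_URL = "https://api.wayscloud.services"


def model_ids(payload: dict) -> list[str]:
    """Return the model ids from a list-models response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(api_key: str, path: str = "/v1/models") -> list[str]:
    """Fetch the model list and return the ids (performs a network call)."""
    request = urllib.request.Request(
        BASE_URL + path,
        headers={"X-API-Key": api_key},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return model_ids(json.load(response))
```

Calling `list_models("YOUR_API_KEY")` queries `/v1/models`; pass `path="/v1/llm/models"` to use the WAYSCloud path instead.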

POST /v1/llm/chat

Chat completion

WAYSCloud LLM chat completion endpoint

Supports both streaming and non-streaming responses.

Request Body:

Field	Type	Description
`model`	`string`	Required. Model alias (e.g., 'mixtral-8x7b')
`messages`	`array`	Required. Conversation messages, each with a `role` and `content`.
`stream`	`boolean`	Enable SSE streaming
`temperature`	`number`	Optional. Sampling temperature (e.g., 0.7).
`max_tokens`	`integer`	Optional. Maximum number of tokens to generate.
`top_p`	`number`	Optional. Nucleus sampling threshold.
`tools`	`array`	Tool definitions for function calling. Accepted for OpenAI compatibility, not yet implemented.
`tool_choice`	`string`	Controls tool selection (e.g., 'auto'). Accepted for OpenAI compatibility, not yet implemented.
`agent_id`	`string`	Agent identifier for AI agents. Stored for logging only.
`region`	`string`	Preferred datacenter region for inference. Currently only 'oslo' is available.

Response:

Example:

bash

curl -X POST https://api.wayscloud.services/v1/llm/chat \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{...}'
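
As an illustration of how the request body fields fit together, here is a minimal Python sketch using only the standard library. It assumes messages are objects with `role` and `content`. The helper names `build_chat_request` and `chat` are hypothetical, and the response is returned as decoded JSON without assuming its fields.

```python
import json
import urllib.request

CHAT_URL = "https://api.wayscloud.services/v1/llm/chat"


def build_chat_request(api_key, model, messages, **options):
    """Build (but do not send) a POST request for the chat endpoint."""
    if not model or not messages:
        raise ValueError("'model' and 'messages' are required")
    body = {"model": model, "messages": messages}
    # Optional fields (temperature, max_tokens, top_p, ...) are sent only when set.
    body.update({key: value for key, value in options.items() if value is not None})
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def chat(api_key, model, messages, **options):
    """Send a non-streaming chat request and return the decoded JSON body."""
    request = build_chat_request(api_key, model, messages, **options)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

For example, `chat("YOUR_API_KEY", "mixtral-8x7b", [{"role": "user", "content": "Hello!"}], temperature=0.7)` sends a single-turn request. Keeping request construction separate from sending makes the payload easy to inspect before any network call.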

POST /v1/llm/chat/completions

Chat completion

WAYSCloud LLM chat completion endpoint (with /completions suffix)

Alias for `/v1/llm/chat`, provided for nginx compatibility. Supports both streaming and non-streaming responses.

Request Body:

Field	Type	Description
`model`	`string`	Required. Model alias (e.g., 'mixtral-8x7b')
`messages`	`array`	Required. Conversation messages, each with a `role` and `content`.
`stream`	`boolean`	Enable SSE streaming
`temperature`	`number`	Optional. Sampling temperature (e.g., 0.7).
`max_tokens`	`integer`	Optional. Maximum number of tokens to generate.
`top_p`	`number`	Optional. Nucleus sampling threshold.
`tools`	`array`	Tool definitions for function calling. Accepted for OpenAI compatibility, not yet implemented.
`tool_choice`	`string`	Controls tool selection (e.g., 'auto'). Accepted for OpenAI compatibility, not yet implemented.
`agent_id`	`string`	Agent identifier for AI agents. Stored for logging only.
`region`	`string`	Preferred datacenter region for inference. Currently only 'oslo' is available.

Response:

Example:

bash

curl -X POST https://api.wayscloud.services/v1/llm/chat/completions \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{...}'

POST /v1/chat/completions

Chat completion

OpenAI-compatible chat completion endpoint

Drop-in replacement for OpenAI's `/v1/chat/completions` endpoint. Compatible with the OpenAI SDK and other OpenAI-compatible clients.

Features:

Non-streaming and streaming (SSE) responses
Temperature and max_tokens control
Agent-related fields accepted for compatibility (tools and tool_choice are not yet implemented; agent_id is stored for logging only)
Automatic token counting and billing

Request Example:

json

{
  "model": "mixtral-8x7b",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 100
}

Streaming Example:

json

{
  "model": "qwen3-80b-instruct",
  "messages": [{"role": "user", "content": "Write a story"}],
  "stream": true,
  "max_tokens": 500
}
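
Consuming a streamed response could look like the sketch below. The chunk format is not specified here, so this assumes OpenAI-style server-sent events: `data: <json>` lines whose JSON carries `choices[0].delta.content`, terminated by `data: [DONE]`. Treat both the event shape and the helper name `iter_stream_text` as assumptions.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from an SSE byte-line stream.

    Assumes OpenAI-style events: 'data: <json>' lines whose JSON carries
    choices[0].delta.content, ending with 'data: [DONE]'.
    """
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

The response object returned by `urllib.request.urlopen` can be iterated line by line, so it can be passed directly as `lines` and the fragments printed as they arrive.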

AI Agent Example:

json

{
  "model": "deepseek-v3",
  "messages": [{"role": "user", "content": "Help me code"}],
  "agent_id": "ephemeral",
  "tools": [],
  "tool_choice": "auto"
}

Available Models:

mixtral-8x7b - Fast, multilingual (good for Norwegian)
qwen3-80b-instruct - Balanced performance
qwen3-80b-thinking - Reasoning capabilities
deepseek-v3 - High quality, coding
deepseek-r1 - Advanced reasoning
llama-3.1-405b - Largest model

Request Body:

Field	Type	Description
`model`	`string`	Required. Model alias (e.g., 'mixtral-8x7b')
`messages`	`array`	Required. Conversation messages, each with a `role` and `content`.
`stream`	`boolean`	Enable SSE streaming
`temperature`	`number`	Optional. Sampling temperature (e.g., 0.7).
`max_tokens`	`integer`	Optional. Maximum number of tokens to generate.
`top_p`	`number`	Optional. Nucleus sampling threshold.
`tools`	`array`	Tool definitions for function calling. Accepted for OpenAI compatibility, not yet implemented.
`tool_choice`	`string`	Controls tool selection (e.g., 'auto'). Accepted for OpenAI compatibility, not yet implemented.
`agent_id`	`string`	Agent identifier for AI agents. Stored for logging only.
`region`	`string`	Preferred datacenter region for inference. Currently only 'oslo' is available.

Response:

Example:

bash

curl -X POST https://api.wayscloud.services/v1/chat/completions \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{...}'
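
The response schema is not listed here, so the following Python sketch makes two loud assumptions: the body follows the OpenAI layout (`choices[0].message.content`), and `model_output_metadata`, mentioned in the introduction, is a top-level key of the response. The helper name `extract_reply` is hypothetical. It reads both fields defensively, so a missing field yields an empty string or `None` instead of an exception.

```python
from typing import Any, Optional


def extract_reply(response: dict[str, Any]) -> tuple[str, Optional[Any]]:
    """Return (assistant_text, reasoning) from a non-streaming response.

    Assumption: OpenAI-style body with choices[0].message.content, and
    model_output_metadata as a top-level key. reasoning is None when absent.
    """
    choices = response.get("choices") or []
    message = (choices[0].get("message") or {}) if choices else {}
    text = message.get("content") or ""
    metadata = response.get("model_output_metadata") or {}
    return text, metadata.get("reasoning")
```

If the metadata turns out to live elsewhere in the response (for example, inside each choice), only the `metadata` lookup needs to change.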
