
LLM API

OpenAI-compatible AI language models. Reasoning-capable models may return inference traces in the extended metadata field model_output_metadata.reasoning; the request format is unchanged.

Base URL: https://api.wayscloud.services

Endpoints

Method  Path                      Description
GET     /v1/llm/models            List models
GET     /v1/models                List models
POST    /v1/llm/chat              Chat completion
POST    /v1/llm/chat/completions  Chat completion
POST    /v1/chat/completions      Chat completion

GET /v1/llm/models

List models

List available models (WAYSCloud endpoint)

Response:

Field   Type    Description
object  string  Values: list
data    array   Array of model objects

Example:

bash
curl https://api.wayscloud.services/v1/llm/models \
  -H "X-API-Key: wayscloud_llm_abc12_YOUR_SECRET"

Response:

json
{
  "object": "list",
  "data": [
    {
      "id": "mixtral-8x7b",
      "object": "model",
      "owned_by": "wayscloud",
      "created": 1700000000
    },
    {
      "id": "qwen3-80b-instruct",
      "object": "model",
      "owned_by": "wayscloud",
      "created": 1700000000
    },
    {
      "id": "deepseek-v3",
      "object": "model",
      "owned_by": "wayscloud",
      "created": 1700000000
    }
  ]
}
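The model IDs can be extracted from a response of this shape with a few lines of standard-library code. The JSON literal below is copied from the example above; nothing here performs a live API call.

```python
import json

# Sample /v1/llm/models response body (copied from the example above).
response_body = """
{
  "object": "list",
  "data": [
    {"id": "mixtral-8x7b", "object": "model", "owned_by": "wayscloud", "created": 1700000000},
    {"id": "qwen3-80b-instruct", "object": "model", "owned_by": "wayscloud", "created": 1700000000},
    {"id": "deepseek-v3", "object": "model", "owned_by": "wayscloud", "created": 1700000000}
  ]
}
"""

models = json.loads(response_body)
model_ids = [m["id"] for m in models["data"]]
print(model_ids)  # ['mixtral-8x7b', 'qwen3-80b-instruct', 'deepseek-v3']
```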

GET /v1/models

List models

List available models (OpenAI-compatible endpoint)

Returns a list of all available LLM models.


Response:

Field   Type    Description
object  string  Values: list
data    array   Array of model objects

Example:

bash
curl https://api.wayscloud.services/v1/models \
  -H "X-API-Key: wayscloud_llm_abc12_YOUR_SECRET"

Response:

json
{
  "object": "list",
  "data": [
    {
      "id": "mixtral-8x7b",
      "object": "model",
      "owned_by": "wayscloud",
      "created": 1700000000
    },
    {
      "id": "qwen3-80b-instruct",
      "object": "model",
      "owned_by": "wayscloud",
      "created": 1700000000
    },
    {
      "id": "deepseek-v3",
      "object": "model",
      "owned_by": "wayscloud",
      "created": 1700000000
    }
  ]
}

POST /v1/llm/chat

Chat completion

WAYSCloud LLM chat completion endpoint

Supports both streaming and non-streaming responses.
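The request shape can be sketched with the standard library alone. build_chat_request is an illustrative helper, not part of any SDK, and nothing below performs a network call; it just assembles the URL, headers, and JSON body that any HTTP client would send.

```python
import json

BASE_URL = "https://api.wayscloud.services"

def build_chat_request(api_key, model, messages, **options):
    """Assemble URL, headers, and JSON body for POST /v1/llm/chat.

    Illustrative helper only; it returns the pieces an HTTP client
    (urllib.request, requests, etc.) would need, without sending anything.
    """
    payload = {"model": model, "messages": messages}
    payload.update(options)  # e.g. temperature, max_tokens, stream
    headers = {
        "X-API-Key": api_key,
        "Content-Type": "application/json",
    }
    return f"{BASE_URL}/v1/llm/chat", headers, json.dumps(payload)

url, headers, body = build_chat_request(
    "wayscloud_llm_abc12_YOUR_SECRET",
    "mixtral-8x7b",
    [{"role": "user", "content": "What is the capital of Norway?"}],
    temperature=0.7,
    max_tokens=256,
)
print(url)  # https://api.wayscloud.services/v1/llm/chat
```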

Request Body:

Field        Type     Description
model        string   Required. Model alias (e.g., 'mixtral-8x7b')
messages     array    Required. Conversation messages (role/content objects)
stream       boolean  Enable SSE streaming
temperature  object   Sampling temperature
max_tokens   object   Maximum number of tokens to generate
top_p        object   Nucleus sampling threshold
tools        object   Tool definitions for function calling. Accepted for OpenAI compatibility, not yet implemented.
tool_choice  object   Controls tool selection. Accepted for OpenAI compatibility, not yet implemented.
agent_id     object   Agent identifier for AI agents. Stored for logging only.
region       object   Preferred datacenter region for inference. Currently only 'oslo' is available.
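A request that exercises the optional fields might look like the following. The agent_id value is a hypothetical placeholder; region is limited to 'oslo', and tools/tool_choice are accepted but not yet implemented, so they are omitted here.

```json
{
  "model": "mixtral-8x7b",
  "messages": [{"role": "user", "content": "Hei!"}],
  "temperature": 0.7,
  "max_tokens": 128,
  "region": "oslo",
  "agent_id": "my-agent"
}
```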


Example:

bash
curl -X POST https://api.wayscloud.services/v1/llm/chat \
  -H "X-API-Key: wayscloud_llm_abc12_YOUR_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "qwen3-235b-thinking",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "What is the capital of Norway?"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 256
}'

Response:

json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "qwen3-235b-thinking",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of Norway is Oslo."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 8,
    "total_tokens": 32
  }
}

POST /v1/llm/chat/completions

Chat completion

WAYSCloud LLM chat completion endpoint (with /completions suffix)

Alias for /v1/llm/chat, kept for nginx routing compatibility. Supports both streaming and non-streaming responses.

Request Body:

Field        Type     Description
model        string   Required. Model alias (e.g., 'mixtral-8x7b')
messages     array    Required. Conversation messages (role/content objects)
stream       boolean  Enable SSE streaming
temperature  object   Sampling temperature
max_tokens   object   Maximum number of tokens to generate
top_p        object   Nucleus sampling threshold
tools        object   Tool definitions for function calling. Accepted for OpenAI compatibility, not yet implemented.
tool_choice  object   Controls tool selection. Accepted for OpenAI compatibility, not yet implemented.
agent_id     object   Agent identifier for AI agents. Stored for logging only.
region       object   Preferred datacenter region for inference. Currently only 'oslo' is available.


Example:

bash
curl -X POST https://api.wayscloud.services/v1/llm/chat/completions \
  -H "X-API-Key: wayscloud_llm_abc12_YOUR_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "qwen3-235b-thinking",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "What is the capital of Norway?"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 256
}'

Response:

json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "qwen3-235b-thinking",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of Norway is Oslo."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 8,
    "total_tokens": 32
  }
}

POST /v1/chat/completions

Chat completion

OpenAI-compatible chat completion endpoint

Drop-in replacement for OpenAI's /v1/chat/completions endpoint. Compatible with OpenAI SDK and all OpenAI-compatible clients.

Features:

  • Non-streaming and streaming (SSE) responses
  • Temperature and max_tokens control
  • Agents framework support (tools, agent_id, tool_choice)
  • Automatic token counting and billing

Request Example:

json
{
  "model": "mixtral-8x7b",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 100
}

Streaming Example:

json
{
  "model": "qwen3-80b-instruct",
  "messages": [{"role": "user", "content": "Write a story"}],
  "stream": true,
  "max_tokens": 500
}
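With stream set to true, the response arrives as server-sent events. Assuming the standard OpenAI chunk format (one JSON chunk per data: line, with the text in choices[0].delta.content, terminated by data: [DONE]), the assistant text can be reassembled as sketched below. The sample lines are synthetic, not captured from the API.

```python
import json

def collect_stream_text(lines):
    """Reassemble assistant text from OpenAI-style SSE 'data:' lines."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alives and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        parts.append(delta.get("content") or "")
    return "".join(parts)

# Synthetic sample of the assumed chunk format:
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Once upon"}}]}',
    'data: {"choices": [{"delta": {"content": " a time."}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))  # Once upon a time.
```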

AI Agent Example:

json
{
  "model": "deepseek-v3",
  "messages": [{"role": "user", "content": "Help me code"}],
  "agent_id": "ephemeral",
  "tools": [],
  "tool_choice": "auto"
}

Available Models:

  • mixtral-8x7b - Fast, multilingual (good for Norwegian)
  • qwen3-80b-instruct - Balanced performance
  • qwen3-80b-thinking - Reasoning capabilities
  • deepseek-v3 - High quality, coding
  • deepseek-r1 - Advanced reasoning
  • llama-3.1-405b - Largest model

Request Body:

Field        Type     Description
model        string   Required. Model alias (e.g., 'mixtral-8x7b')
messages     array    Required. Conversation messages (role/content objects)
stream       boolean  Enable SSE streaming
temperature  object   Sampling temperature
max_tokens   object   Maximum number of tokens to generate
top_p        object   Nucleus sampling threshold
tools        object   Tool definitions for function calling. Accepted for OpenAI compatibility, not yet implemented.
tool_choice  object   Controls tool selection. Accepted for OpenAI compatibility, not yet implemented.
agent_id     object   Agent identifier for AI agents. Stored for logging only.
region       object   Preferred datacenter region for inference. Currently only 'oslo' is available.


Example:

bash
curl -X POST https://api.wayscloud.services/v1/chat/completions \
  -H "X-API-Key: wayscloud_llm_abc12_YOUR_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "qwen3-235b-thinking",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "What is the capital of Norway?"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 256
}'

Response:

json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "qwen3-235b-thinking",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of Norway is Oslo."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 8,
    "total_tokens": 32
  }
}


WAYSCloud AS