Run an LLM Request

Send chat completion requests to the WAYSCloud LLM API. The API is OpenAI-compatible, so existing OpenAI SDK code works with a one-line change.

What you are building

A working integration with a language model API that supports chat completions, model selection, and per-token billing. All inference runs in EU datacenters.

When to use this approach

  • You need text generation, summarization, or analysis
  • You want OpenAI SDK compatibility with EU-hosted inference
  • You need per-token usage tracking and billing

If you need image or video generation, use GPU Studio. For pre-built chatbots with a knowledge base, use Chatbot.

What you need

  • A WAYSCloud account
  • An LLM API key (generated during activation)

Step 1 — Activate the LLM service

Dashboard

  1. Open Services → AI & Machine Learning → LLM API in the dashboard
  2. Click Activate
  3. Copy your API key — it is only shown once

API

bash
curl -X POST https://api.wayscloud.services/v1/dashboard/llm/activate \
  -H "Authorization: Bearer $JWT_TOKEN"

Step 2 — Send a chat completion request

The LLM API is OpenAI-compatible. Use it with cURL, the OpenAI SDK, or any HTTP client.

cURL:

bash
curl -X POST https://api.wayscloud.services/v1/llm/chat/completions \
  -H "Authorization: Bearer $LLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-sonnet",
    "messages": [
      {"role": "user", "content": "Explain DNS in one paragraph."}
    ],
    "max_tokens": 500
  }'

Python (OpenAI SDK):

python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["LLM_API_KEY"],  # the key from Step 1
    base_url="https://api.wayscloud.services/v1/llm"
)

response = client.chat.completions.create(
    model="claude-3-sonnet",
    messages=[
        {"role": "user", "content": "Explain DNS in one paragraph."}
    ],
    max_tokens=500
)

print(response.choices[0].message.content)
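The messages array carries the whole conversation, so for multi-turn chat you append each assistant reply to the history before the next request. A network-free sketch of that bookkeeping (the actual `create` call from the example above is elided; `add_turn` is a hypothetical helper, not part of any SDK):

```python
# Conversation bookkeeping for the OpenAI-compatible messages format.
# The API call itself is elided; this only shows how history accumulates.
def add_turn(messages, role, content):
    """Return a new history with one more turn appended."""
    return messages + [{"role": role, "content": content}]

history = []
history = add_turn(history, "user", "Explain DNS in one paragraph.")
# ... send `history` as `messages`, read the reply, then:
history = add_turn(history, "assistant", "DNS maps hostnames to IP addresses.")
history = add_turn(history, "user", "Now compare it to /etc/hosts.")

print(len(history))          # turns accumulated so far
print(history[-1]["role"])   # the pending user turn
```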

Python (requests):

python
import os

import requests

resp = requests.post(
    "https://api.wayscloud.services/v1/llm/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['LLM_API_KEY']}"},
    json={
        "model": "claude-3-sonnet",
        "messages": [{"role": "user", "content": "Explain DNS in one paragraph."}],
        "max_tokens": 500
    }
)

resp.raise_for_status()  # surface auth or quota errors instead of a KeyError
print(resp.json()["choices"][0]["message"]["content"])
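Whichever client you use, the response follows the OpenAI chat-completion shape: the reply sits in `choices[0].message.content` and the billed token counts in `usage`. A parsing sketch against an illustrative payload (the field names match the schema; the values below are made up, not real API output):

```python
# Illustrative response in the OpenAI-compatible chat-completion shape.
sample_response = {
    "choices": [
        {"message": {"role": "assistant",
                     "content": "DNS maps hostnames to IP addresses..."}}
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 58, "total_tokens": 70},
}

reply = sample_response["choices"][0]["message"]["content"]
usage = sample_response["usage"]

print(reply)
print(f"tokens billed: {usage['total_tokens']}")
```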

Step 3 — Check usage and quota

Monitor your token consumption:

bash
curl https://api.wayscloud.services/v1/dashboard/llm/quota \
  -H "Authorization: Bearer $JWT_TOKEN"

Response:

json
{
  "plan_name": "LLM Pro - 1M tokens/month",
  "usage_current_month": {
    "total_requests": 1523,
    "total_input_tokens": 450230,
    "total_output_tokens": 285670,
    "total_cost": 892.45,
    "currency": "NOK"
  }
}
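The usage figures let you compute an effective rate for cost tracking. A quick sketch using the sample numbers above (the per-token price derived here is illustrative; actual plan rates are shown in the dashboard):

```python
# Derive an effective per-token rate from the sample quota response above.
quota = {
    "usage_current_month": {
        "total_requests": 1523,
        "total_input_tokens": 450230,
        "total_output_tokens": 285670,
        "total_cost": 892.45,
        "currency": "NOK",
    }
}

u = quota["usage_current_month"]
total_tokens = u["total_input_tokens"] + u["total_output_tokens"]
cost_per_1k = u["total_cost"] / (total_tokens / 1000)

print(f"{total_tokens} tokens, {cost_per_1k:.2f} {u['currency']} per 1k tokens")
```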

Step 4 — Next steps

  • List available models — GET /v1/dashboard/llm/console/models
  • Adjust parameters — Set temperature, top_p, and max_tokens per request
  • Track usage by model — GET /v1/dashboard/llm/usage/by-model
  • Build a chatbot — Combine LLM with a knowledge base. See Create a Chatbot.
  • Transcribe audio — Feed transcriptions to the LLM. See Transcribe Audio.
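Sampling parameters go in the request body alongside model and messages. A sketch of a payload builder with conservative defaults (`build_payload` is a hypothetical helper; the parameter names follow the OpenAI chat-completion schema, so confirm supported values in the API reference):

```python
# Build an OpenAI-compatible chat payload with tunable sampling parameters.
def build_payload(prompt, model="claude-3-sonnet",
                  temperature=0.2, top_p=1.0, max_tokens=500):
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,  # lower = more deterministic output
        "top_p": top_p,              # nucleus-sampling cutoff
        "max_tokens": max_tokens,    # hard cap on output length
    }

payload = build_payload("Explain DNS in one paragraph.", temperature=0.7)
print(payload["temperature"])
```

Pass the resulting dict as the JSON body of the chat-completions request shown in Step 2.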

API reference

WAYSCloud AS