Run an LLM Request

Send chat completion requests to the WAYSCloud LLM API. The API is OpenAI-compatible, so existing OpenAI SDK code works with a one-line change.

What you are building

A working integration with a language model API that supports chat completions, model selection, and per-token billing. All inference runs in EU datacenters.

When to use this approach

  • You need text generation, summarization, or analysis
  • You want OpenAI SDK compatibility with EU-hosted inference
  • You need per-token usage tracking and billing

If you need image or video generation, use GPU Studio. For pre-built chatbots with a knowledge base, use Chatbot.

What you need

  • A WAYSCloud account
  • An LLM API key (generated during activation)

Step 1 — Activate the LLM service

Dashboard

  1. Open Services → AI & Machine Learning → LLM API in the dashboard
  2. Click Activate
  3. Copy your API key — it is only shown once

API

bash
curl -X POST https://api.wayscloud.services/v1/dashboard/llm/activate \
  -H "Authorization: Bearer $JWT_TOKEN"

Step 2 — Send a chat completion request

The LLM API is OpenAI-compatible. Use it with cURL, the OpenAI SDK, or any HTTP client.

cURL:

bash
curl -X POST https://api.wayscloud.services/v1/llm/chat/completions \
  -H "Authorization: Bearer $LLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-sonnet",
    "messages": [
      {"role": "user", "content": "Explain DNS in one paragraph."}
    ],
    "max_tokens": 500
  }'

Python (OpenAI SDK):

python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["LLM_API_KEY"],  # the key from Step 1
    base_url="https://api.wayscloud.services/v1/llm"
)

response = client.chat.completions.create(
    model="claude-3-sonnet",
    messages=[
        {"role": "user", "content": "Explain DNS in one paragraph."}
    ],
    max_tokens=500
)

print(response.choices[0].message.content)
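The messages array carries the whole conversation, so for multi-turn chat you append each assistant reply to the history before the next request. A network-free sketch of that bookkeeping (the actual `create` call from the example above is elided; `add_turn` is a hypothetical helper, not part of any SDK):

```python
# Conversation bookkeeping for the OpenAI-compatible messages format.
# The API call itself is elided; this only shows how history accumulates.
def add_turn(messages, role, content):
    """Return a new history with one more turn appended."""
    return messages + [{"role": role, "content": content}]

history = []
history = add_turn(history, "user", "Explain DNS in one paragraph.")
# ... send `history` as `messages`, read the reply, then:
history = add_turn(history, "assistant", "DNS maps hostnames to IP addresses.")
history = add_turn(history, "user", "Now compare it to /etc/hosts.")

print(len(history))          # turns accumulated so far
print(history[-1]["role"])   # the pending user turn
```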

Python (requests):

python
import os

import requests

resp = requests.post(
    "https://api.wayscloud.services/v1/llm/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['LLM_API_KEY']}"},
    json={
        "model": "claude-3-sonnet",
        "messages": [{"role": "user", "content": "Explain DNS in one paragraph."}],
        "max_tokens": 500
    }
)

resp.raise_for_status()  # surface auth or quota errors instead of a KeyError
print(resp.json()["choices"][0]["message"]["content"])
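Whichever client you use, the response follows the OpenAI chat-completion shape: the reply sits in `choices[0].message.content` and the billed token counts in `usage`. A parsing sketch against an illustrative payload (the field names match the schema; the values below are made up, not real API output):

```python
# Illustrative response in the OpenAI-compatible chat-completion shape.
sample_response = {
    "choices": [
        {"message": {"role": "assistant",
                     "content": "DNS maps hostnames to IP addresses..."}}
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 58, "total_tokens": 70},
}

reply = sample_response["choices"][0]["message"]["content"]
usage = sample_response["usage"]

print(reply)
print(f"tokens billed: {usage['total_tokens']}")
```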

Step 3 — Check usage and quota

Monitor your token consumption:

bash
curl https://api.wayscloud.services/v1/dashboard/llm/quota \
  -H "Authorization: Bearer $JWT_TOKEN"

Response:

json
{
  "plan_name": "LLM Pro - 1M tokens/month",
  "usage_current_month": {
    "total_requests": 1523,
    "total_input_tokens": 450230,
    "total_output_tokens": 285670,
    "total_cost": 892.45,
    "currency": "NOK"
  }
}
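The usage figures let you compute an effective rate for cost tracking. A quick sketch using the sample numbers above (the per-token price derived here is illustrative; actual plan rates are shown in the dashboard):

```python
# Derive an effective per-token rate from the sample quota response above.
quota = {
    "usage_current_month": {
        "total_requests": 1523,
        "total_input_tokens": 450230,
        "total_output_tokens": 285670,
        "total_cost": 892.45,
        "currency": "NOK",
    }
}

u = quota["usage_current_month"]
total_tokens = u["total_input_tokens"] + u["total_output_tokens"]
cost_per_1k = u["total_cost"] / (total_tokens / 1000)

print(f"{total_tokens} tokens, {cost_per_1k:.2f} {u['currency']} per 1k tokens")
```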

Step 4 — Next steps

  • List available models — GET /v1/dashboard/llm/console/models
  • Adjust parameters — Set temperature, top_p, and max_tokens per request
  • Track usage by model — GET /v1/dashboard/llm/usage/by-model
  • Build a chatbot — Combine LLM with a knowledge base. See Create a Chatbot.
  • Transcribe audio — Feed transcriptions to the LLM. See Transcribe Audio.
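Sampling parameters go in the request body alongside model and messages. A sketch of a payload builder with conservative defaults (`build_payload` is a hypothetical helper; the parameter names follow the OpenAI chat-completion schema, so confirm supported values in the API reference):

```python
# Build an OpenAI-compatible chat payload with tunable sampling parameters.
def build_payload(prompt, model="claude-3-sonnet",
                  temperature=0.2, top_p=1.0, max_tokens=500):
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,  # lower = more deterministic output
        "top_p": top_p,              # nucleus-sampling cutoff
        "max_tokens": max_tokens,    # hard cap on output length
    }

payload = build_payload("Explain DNS in one paragraph.", temperature=0.7)
print(payload["temperature"])
```

Pass the resulting dict as the JSON body of the chat-completions request shown in Step 2.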

API reference

WAYSCloud AS