# Chat Completions
The chat completions endpoint is the core of the LLM API, allowing you to have conversations with AI models.
## Endpoint

```
POST /v1/chat/completions
POST /v1/llm/chat (WAYSCloud native)
```
## Request Format

```json
{
  "model": "mixtral-8x7b",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "top_p": 0.9,
  "stream": false
}
```
## Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | - | Model ID (see Models) |
| messages | array | Yes | - | Conversation history |
| temperature | float | No | 0.7 | Randomness (0.0-2.0) |
| max_tokens | integer | No | 1000 | Maximum tokens to generate |
| top_p | float | No | 1.0 | Nucleus sampling (0.0-1.0) |
| stream | boolean | No | false | Enable streaming |
| stop | string/array | No | null | Stop sequences |
| presence_penalty | float | No | 0.0 | Penalize new topics (-2.0 to 2.0) |
| frequency_penalty | float | No | 0.0 | Penalize repetition (-2.0 to 2.0) |
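The documented ranges are easy to get wrong by hand. As a sketch, a small client-side helper (our own `build_chat_request`, not part of the API) can validate them before the request is sent:

```python
def build_chat_request(model, messages, temperature=0.7, max_tokens=1000,
                       top_p=1.0, stream=False, stop=None,
                       presence_penalty=0.0, frequency_penalty=0.0):
    """Build a /v1/chat/completions payload, validating the documented ranges."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be in [0.0, 2.0]")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be in [0.0, 1.0]")
    for penalty in (presence_penalty, frequency_penalty):
        if not -2.0 <= penalty <= 2.0:
            raise ValueError("penalties must be in [-2.0, 2.0]")
    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "top_p": top_p,
        "stream": stream,
        "presence_penalty": presence_penalty,
        "frequency_penalty": frequency_penalty,
    }
    if stop is not None:  # stop defaults to null, so only send it when set
        payload["stop"] = stop
    return payload
```

Catching an out-of-range value locally avoids a round trip that would end in a 400 error.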
## Message Roles

- `system` - Instructions for the model's behavior
- `user` - User messages
- `assistant` - Assistant responses (for multi-turn conversations)
## Example Requests

### Simple Request

```bash
curl -X POST "https://api.wayscloud.services/v1/chat/completions" \
  -H "Authorization: Bearer wayscloud_llm_abc123_YourSecretKey" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mixtral-8x7b",
    "messages": [
      {"role": "user", "content": "What is the capital of Norway?"}
    ]
  }'
```
### With System Prompt

```bash
curl -X POST "https://api.wayscloud.services/v1/chat/completions" \
  -H "Authorization: Bearer $WAYSCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-80b-instruct",
    "messages": [
      {"role": "system", "content": "You are a Norwegian history expert. Always respond in Norwegian."},
      {"role": "user", "content": "Tell me about the Viking Age"}
    ],
    "temperature": 0.7,
    "max_tokens": 500
  }'
```
### Multi-Turn Conversation

```json
{
  "model": "mixtral-8x7b",
  "messages": [
    {"role": "system", "content": "You are a helpful coding assistant"},
    {"role": "user", "content": "How do I read a file in Python?"},
    {"role": "assistant", "content": "You can use the `open()` function: `with open('file.txt') as f: content = f.read()`"},
    {"role": "user", "content": "How do I write to a file?"}
  ]
}
```
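A multi-turn history grows with every exchange and can eventually exceed the model's context window. One rough approach, sketched here with a character budget as a stand-in for real token counting, is to keep the system prompt and drop the oldest turns first (`trim_history` is our own helper, not part of the API):

```python
def trim_history(messages, max_chars=8000):
    """Keep the system prompt (if any) plus the most recent turns that fit
    within a rough character budget (a crude proxy for token counting)."""
    # Preserve a leading system message, if present
    system = [m for m in messages[:1] if m["role"] == "system"]
    rest = messages[len(system):]
    kept, total = [], sum(len(m["content"]) for m in system)
    # Walk backwards so the newest turns are kept first
    for msg in reversed(rest):
        total += len(msg["content"])
        if total > max_chars:
            break
        kept.append(msg)
    return system + list(reversed(kept))
```

A production client would count tokens with the model's actual tokenizer, but the shape of the logic is the same.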
## Python Examples

### Basic Request

```python
import requests
import os

API_KEY = os.getenv('WAYSCLOUD_API_KEY')

response = requests.post(
    'https://api.wayscloud.services/v1/chat/completions',
    headers={
        'Authorization': f'Bearer {API_KEY}',
        'Content-Type': 'application/json'
    },
    json={
        'model': 'mixtral-8x7b',
        'messages': [
            {'role': 'user', 'content': 'Hello!'}
        ]
    }
)
result = response.json()
print(result['choices'][0]['message']['content'])
```
### Using OpenAI SDK

```python
from openai import OpenAI
import os

client = OpenAI(
    api_key=os.getenv('WAYSCLOUD_API_KEY'),
    base_url='https://api.wayscloud.services/v1'
)

response = client.chat.completions.create(
    model='mixtral-8x7b',
    messages=[
        {'role': 'user', 'content': 'Hello!'}
    ]
)
print(response.choices[0].message.content)
```
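When `stream` is set to true, the response arrives as a series of chunks rather than a single object. Assuming the endpoint follows the OpenAI streaming format, where each chunk carries a `choices[0].delta` fragment, the final text can be reassembled like this (a sketch; `join_stream_chunks` is our own name):

```python
def join_stream_chunks(chunks):
    """Assemble the final message text from streamed chunk objects.
    Assumes OpenAI-style chunks: each has choices[0].delta, which may
    carry a "content" fragment, a "role", or be empty on the final chunk."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta", {})
        content = delta.get("content")
        if content:  # skip role-only and empty terminal deltas
            parts.append(content)
    return "".join(parts)
```

With the OpenAI SDK, the same loop would iterate over the object returned when `stream=True` is passed to `client.chat.completions.create`.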
Conversation Manager
class ChatSession:
def __init__(self, api_key, model='mixtral-8x7b', system_prompt=None):
self.api_key = api_key
self.model = model
self.messages = []
if system_prompt:
self.messages.append({'role': 'system', 'content': system_prompt})
def send(self, user_message):
self.messages.append({'role': 'user', 'content': user_message})
response = requests.post(
'https://api.wayscloud.services/v1/chat/completions',
headers={
'Authorization': f'Bearer {self.api_key}',
'Content-Type': 'application/json'
},
json={
'model': self.model,
'messages': self.messages
}
)
result = response.json()
assistant_message = result['choices'][0]['message']['content']
self.messages.append({'role': 'assistant', 'content': assistant_message})
return assistant_message
# Usage
chat = ChatSession(
api_key=os.getenv('WAYSCLOUD_API_KEY'),
system_prompt='You are a helpful Python programming assistant'
)
print(chat.send('How do I make HTTP requests?'))
print(chat.send('Show me an example with error handling'))
JavaScript Example
const axios = require('axios');
async function chat(message) {
const response = await axios.post(
'https://api.wayscloud.services/v1/chat/completions',
{
model: 'mixtral-8x7b',
messages: [{ role: 'user', content: message }]
},
{
headers: {
'Authorization': `Bearer ${process.env.WAYSCLOUD_API_KEY}`,
'Content-Type': 'application/json'
}
}
);
return response.data.choices[0].message.content;
}
// Usage
const answer = await chat('What is the capital of Norway?');
console.log(answer);
## Response Format

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699012345,
  "model": "mixtral-8x7b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of Norway is Oslo."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 8,
    "total_tokens": 23
  }
}
```
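Once the response is parsed as JSON, the fields above can be pulled out in one place. A minimal helper (our own `summarize_response`, not part of any SDK) that extracts the reply and token usage from a `chat.completion` object:

```python
def summarize_response(resp):
    """Extract the reply text, finish reason, and token usage
    from a parsed chat.completion response object."""
    choice = resp["choices"][0]
    return {
        "content": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": resp["usage"]["total_tokens"],
    }
```

Logging `total_tokens` per request is the simplest way to track spend over time.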
## Finish Reasons

- `stop` - Model finished naturally
- `length` - Hit the `max_tokens` limit
- `content_filter` - Content filtered
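A `length` finish reason usually means the reply was cut off mid-sentence, so it is worth checking before using the output. A minimal sketch (`check_finish` is our own helper name):

```python
def check_finish(choice):
    """Flag truncated or filtered completions so the caller can react."""
    reason = choice.get("finish_reason")
    if reason == "length":
        return "truncated: raise max_tokens or continue the conversation"
    if reason == "content_filter":
        return "filtered: the response was cut by the content filter"
    return "ok"
```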
## Error Responses

### 400 Bad Request

```json
{
  "error": {
    "message": "Invalid model specified",
    "type": "invalid_request_error",
    "code": "invalid_model"
  }
}
```
### 429 Rate Limit

```json
{
  "error": {
    "message": "Rate limit exceeded",
    "type": "rate_limit_error"
  }
}
```
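Both error shapes share the same `error` envelope, so one parser can handle them. The sketch below (our own `parse_error`; the retryable classification is our assumption, not something the API documents) separates retryable failures such as 429s from bad requests:

```python
def parse_error(status_code, body):
    """Turn an error response into (retryable, message).
    429 and 5xx are treated as retryable; other 4xx request errors are not."""
    err = body.get("error", {})
    message = err.get("message", "unknown error")
    retryable = status_code == 429 or status_code >= 500
    return retryable, f"{err.get('type', 'error')}: {message}"
```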
## Best Practices

- Set `max_tokens` to prevent excessive costs
- Use system prompts for consistent behavior
- Implement retry logic for rate limits (HTTP 429)
- Cache responses when appropriate
- Monitor token usage for cost control
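The retry advice above is usually implemented as exponential backoff with jitter. A sketch, where `RateLimitError` and `with_retries` are our own placeholder names and a real client would raise the exception on a 429 status:

```python
import random
import time

class RateLimitError(Exception):
    """Placeholder exception for a 429 response."""

def with_retries(call, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `call()` and retry on rate limits with exponential backoff.
    Delay doubles each attempt, with a little jitter to avoid thundering herds."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

The injectable `sleep` parameter keeps the helper testable without real waiting.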