Transcribe Audio

Upload audio or video files and receive accurate transcriptions with speaker detection, word-level timestamps, and export in multiple formats.

What you are building

A transcription pipeline where you upload a file, the system processes it with speech recognition, and you get back timestamped text with speaker labels.

When to use this approach

  • Meeting recordings, interviews, or podcasts
  • Video subtitling (export as SRT or VTT)
  • Audio content indexing and search
  • Compliance recordings that need written records

What you need

  • A WAYSCloud account
  • An audio or video file (max 2 GB)

Step 1 — Create a transcription job

Dashboard

  1. Open Services → AI & Machine Learning → Speech Intelligence in the dashboard
  2. Click New Transcription
  3. Upload your file
  4. Select language (or leave on auto-detect)

API

Request a presigned upload URL:

```bash
curl -X POST https://api.wayscloud.services/v1/dashboard/transcript/jobs \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "filename": "meeting-2025-12-15.mp3",
    "content_type": "audio/mpeg",
    "file_size_bytes": 52428800,
    "language": "auto"
  }'
```

Response:

```json
{
  "job_id": "job-abc123",
  "upload_url": "https://storage.wayscloud.services/uploads/job-abc123?X-Amz-Signature=...",
  "upload_expires_in": 3600
}
```

Upload the file to the presigned URL:

```bash
curl -X PUT "https://storage.wayscloud.services/uploads/job-abc123?X-Amz-Signature=..." \
  -H "Content-Type: audio/mpeg" \
  --data-binary @meeting-2025-12-15.mp3
```

Confirm the upload:

```bash
curl -X POST https://api.wayscloud.services/v1/dashboard/transcript/jobs/job-abc123/upload-complete \
  -H "Authorization: Bearer $JWT_TOKEN"
```
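
The three API calls in this step can be sketched end to end in Python. This is a sketch using only the standard library; the endpoint paths and request fields come from the examples above, while the helper names, the `language: auto` default, and the absence of error handling and retries are assumptions of this sketch:

```python
import json
import mimetypes
import os
import urllib.request

API_BASE = "https://api.wayscloud.services/v1/dashboard/transcript"


def upload_metadata(path):
    """Derive the job-creation request fields from a local file."""
    content_type, _ = mimetypes.guess_type(path)
    return {
        "filename": os.path.basename(path),
        "content_type": content_type or "application/octet-stream",
        "file_size_bytes": os.path.getsize(path),
        "language": "auto",
    }


def _post_json(url, token, body=None):
    """POST a JSON body (or an empty request) and decode the JSON response."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(url, data=data, method="POST", headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    })
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read() or b"{}")


def transcribe(path, token):
    meta = upload_metadata(path)
    # 1. Create the job; the response carries a presigned upload URL.
    job = _post_json(f"{API_BASE}/jobs", token, body=meta)
    # 2. PUT the raw file bytes to the presigned URL.
    with open(path, "rb") as f:
        urllib.request.urlopen(urllib.request.Request(
            job["upload_url"], data=f.read(), method="PUT",
            headers={"Content-Type": meta["content_type"]}))
    # 3. Confirm the upload so processing can start.
    _post_json(f"{API_BASE}/jobs/{job['job_id']}/upload-complete", token)
    return job["job_id"]
```

Note that the PUT to the presigned URL carries no `Authorization` header; the signature in the URL itself authorizes the upload, matching the curl example above.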

Step 2 — Wait for processing

The job moves through: `queued` → `processing` → `completed`.

Poll the job status:

```bash
curl https://api.wayscloud.services/v1/dashboard/transcript/jobs/job-abc123 \
  -H "Authorization: Bearer $JWT_TOKEN"
```

Response (completed):

```json
{
  "job_id": "job-abc123",
  "status": "completed",
  "language_detected": "en",
  "duration_seconds": 3600,
  "word_count": 4523,
  "segments": [
    {
      "start_time": 0.0,
      "end_time": 5.5,
      "text": "Hello, welcome to the meeting.",
      "confidence": 0.98,
      "speaker_id": 1
    }
  ]
}
```
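
A polling loop for this step can be sketched as follows. The status fetcher is injected as a callable so the loop is independent of any particular HTTP client, and the interval and timeout defaults are illustrative assumptions, not documented limits:

```python
import time


def wait_for_completion(fetch_job, poll_interval=5.0, timeout=1800.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_job() until the job leaves the queued/processing states.

    fetch_job returns the job JSON (a dict with a "status" key), e.g. a
    wrapper around GET /v1/dashboard/transcript/jobs/{job_id}.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_job()
        if job["status"] not in ("queued", "processing"):
            # "completed", or a failure status to be handled by the caller.
            return job
        if clock() >= deadline:
            raise TimeoutError("transcription did not finish in time")
        sleep(poll_interval)
```

For long recordings, a webhook or a longer poll interval is kinder to the API than tight polling.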

Step 3 — Export the result

Generate an export in your preferred format:

```bash
curl -X POST https://api.wayscloud.services/v1/dashboard/transcript/jobs/job-abc123/export \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"format": "srt"}'
```

Response:

```json
{
  "artifact_id": "artifact-xyz",
  "format": "srt",
  "download_url": "https://storage.wayscloud.services/artifacts/artifact-xyz.srt"
}
```

Supported formats: `txt` (plain text), `json` (full transcript data), `srt` (subtitles), `vtt` (web subtitles).
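
If you prefer to build subtitles yourself from the `segments` array in Step 2 rather than request an export, the mapping to SRT is straightforward: each segment becomes a numbered cue with `HH:MM:SS,mmm` timestamps. A sketch (field names follow the segment object shown above):

```python
def srt_timestamp(seconds):
    """Format a time in seconds as the SRT timestamp HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def segments_to_srt(segments):
    """Render transcript segments as an SRT document with 1-based cue numbers."""
    cues = []
    for i, seg in enumerate(segments, start=1):
        cues.append(f"{i}\n"
                    f"{srt_timestamp(seg['start_time'])} --> "
                    f"{srt_timestamp(seg['end_time'])}\n"
                    f"{seg['text']}\n")
    return "\n".join(cues)
```

Splitting long segments into shorter cues (subtitles are usually kept under two lines) is left out of this sketch.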


Step 4 — Next steps

  • Analyze with AI — Feed transcriptions to the LLM API for summarization. See Run an LLM Request.
  • Store results — Save transcription files in Object Storage. See Store Files.
  • Automate — Use the API to build automated transcription pipelines
