Voice API

This document describes the APIs for voice processing, covering Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) services.

Authentication

Every request to the developer API must include a bearer token in the Authorization header, in the form Authorization: Bearer {subcompany_takeover_secret}. You can find the takeover secret on the teamspace settings page.

Authorization: Bearer <token>
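As a minimal sketch of the header format above, the Python helper below builds the Authorization header from a takeover secret. The name build_auth_headers is hypothetical and not part of the API; it only illustrates the Bearer scheme described in this section.

```python
def build_auth_headers(takeover_secret: str) -> dict[str, str]:
    """Return the Authorization header for a developer API request.

    Illustrative helper only; it is not part of the Voice API itself.
    """
    token = takeover_secret.strip()
    if not token:
        raise ValueError("takeover secret must not be empty")
    # The scheme name and the secret are separated by a single space.
    return {"Authorization": f"Bearer {token}"}
```

Stripping surrounding whitespace guards against a secret pasted from the settings page with a trailing newline.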

Voice API

Text to Speech (TTS)

Convert text into spoken audio.

POST

${BASE_URL}/platform/v1/voice/{subcompany_id}/tts

Headers

Authorization: Bearer <token>
Content-Type: application/json

Request body

{
  "text": "string" (required),
  "lang": "rw | kj" (optional),
  "response_format": "mp3 | wav" (optional),
  "speed": 1.0 (optional),
  "gender": "female" (optional)
}

Request constraints

Maximum text length: 5,000 characters
Supported languages:
- rw (Kinyarwanda)
- kj (Oshiwambo)
Speed range: 0.25 – 4.0
Supported formats: mp3, wav
Timeout: 60 seconds
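The constraints above can be checked on the client before a request is sent. The following is a sketch, not the server's validation logic: the function name validate_tts_request is hypothetical, the speed bounds are assumed to be inclusive, and text length is measured with Python's len(), which may differ from how the server counts characters.

```python
SUPPORTED_LANGS = {"rw", "kj"}
SUPPORTED_FORMATS = {"mp3", "wav"}
MAX_TEXT_LENGTH = 5000
MIN_SPEED, MAX_SPEED = 0.25, 4.0  # assumption: both bounds are inclusive


def validate_tts_request(text, lang=None, response_format=None, speed=None):
    """Return a list of constraint violations; an empty list means none were found."""
    problems = []
    if not text:
        problems.append("text is required")
    elif len(text) > MAX_TEXT_LENGTH:
        problems.append(f"text exceeds {MAX_TEXT_LENGTH} characters")
    # Optional fields are only checked when the caller supplies them.
    if lang is not None and lang not in SUPPORTED_LANGS:
        problems.append(f"unsupported lang: {lang}")
    if response_format is not None and response_format not in SUPPORTED_FORMATS:
        problems.append(f"unsupported response_format: {response_format}")
    if speed is not None and not (MIN_SPEED <= speed <= MAX_SPEED):
        problems.append(f"speed out of range: {speed}")
    return problems
```

Returning every violation at once, rather than raising on the first, lets a caller report all problems in a single pass.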

cURL example

curl -X POST "${BASE_URL}/platform/v1/voice/{subcompany_id}/tts" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Muraho neza",
    "lang": "rw",
    "response_format": "mp3",
    "speed": 1.0,
    "gender": "female"
  }'
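For clients that do not use cURL, the same request can be sketched in Python with only the standard library. The helper build_tts_request is a hypothetical name, and the base URL is left as a parameter because its value is not given here. The function builds the request without sending it.

```python
import json
import urllib.request


def build_tts_request(base_url, subcompany_id, token, text, **options):
    """Build (but do not send) a POST request for the TTS endpoint.

    Extra keyword arguments such as lang, response_format, speed, or gender
    are copied into the JSON body unchanged.
    """
    url = f"{base_url.rstrip('/')}/platform/v1/voice/{subcompany_id}/tts"
    payload = {"text": text, **options}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen. Choosing a client-side timeout a little above the 60-second limit listed under the request constraints is a reasonable starting point, though that choice is an assumption rather than documented guidance.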

Response

The API returns a Base64-encoded audio payload representing the generated audio file.

The response is not a downloadable file and must be decoded on the client side.

Decoding the response to MP3 (Windows)

If you are using Windows, you can decode the Base64 response into an MP3 file using PowerShell.

1. Copy the Base64 response into a file (for example, input.txt).
2. Run the following commands in PowerShell:

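# Read the Base64 text and remove whitespace and line breaks
$b64 = (Get-Content input.txt -Raw) -replace '\s',''
# Resolve the output path against the current PowerShell location
[IO.File]::WriteAllBytes((Join-Path $PWD "output.mp3"), [Convert]::FromBase64String($b64))

This writes output.mp3 to the current PowerShell directory.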
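On other platforms, the same decoding step can be sketched in Python. This assumes the response body is the Base64 string on its own; if the service wraps it in another structure, extract the string first. The function names decode_tts_audio and save_tts_audio are hypothetical.

```python
import base64
from pathlib import Path


def decode_tts_audio(b64_text: str) -> bytes:
    """Decode a Base64 string into raw audio bytes, ignoring whitespace."""
    cleaned = "".join(b64_text.split())
    # validate=True rejects characters outside the Base64 alphabet
    # instead of silently discarding them.
    return base64.b64decode(cleaned, validate=True)


def save_tts_audio(b64_text: str, out_path: str) -> int:
    """Write the decoded audio to out_path and return the number of bytes written."""
    return Path(out_path).write_bytes(decode_tts_audio(b64_text))
```

For example, save_tts_audio(response_text, "output.mp3") writes the file relative to the current working directory. Use a .wav extension when the request asked for wav.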

Automatic Speech Recognition (ASR)

Convert spoken audio into text.

POST

${BASE_URL}/platform/v1/voice/{subcompany_id}/asr

Headers

Authorization: Bearer <token>
Content-Type: multipart/form-data

Request body

file (required): the audio file to transcribe, sent as a multipart form field

Request constraints

Maximum file size: 15 MB
Supported formats: mp3, wav
Timeout: 60 seconds
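A client can check these limits before uploading. The sketch below is illustrative: check_asr_upload is a hypothetical name, "15 MB" is read conservatively as 15,000,000 bytes (the exact definition is not stated here), and the file extension is only a rough proxy for the actual audio format.

```python
from pathlib import PurePath

MAX_ASR_BYTES = 15 * 1000 * 1000  # assumption: "15 MB" read as decimal megabytes
SUPPORTED_ASR_EXTENSIONS = {".mp3", ".wav"}


def check_asr_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of constraint violations for a candidate ASR upload."""
    problems = []
    # Compare extensions case-insensitively so "clip.MP3" is accepted.
    ext = PurePath(filename).suffix.lower()
    if ext not in SUPPORTED_ASR_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_ASR_BYTES:
        problems.append("file is larger than 15 MB")
    return problems
```

In practice the size would come from os.path.getsize(path) or an equivalent call.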

cURL example

curl -X POST "${BASE_URL}/platform/v1/voice/{subcompany_id}/asr" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@<path_to_audio_file>"
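The same upload can be sketched in Python with the standard library by assembling the multipart/form-data body by hand. The helper build_asr_request is a hypothetical name, the base URL is a parameter because its value is not given here, and the per-part content types are assumptions. The form field name "file" comes from the request body description above. The function builds the request without sending it.

```python
import urllib.request
import uuid

# Assumed media types for the two supported formats.
AUDIO_TYPES = {"mp3": "audio/mpeg", "wav": "audio/wav"}


def build_asr_request(base_url, subcompany_id, token, filename, audio_bytes):
    """Build (but do not send) a multipart/form-data POST for the ASR endpoint.

    Assumes filename contains no double quotes or line breaks.
    """
    boundary = uuid.uuid4().hex
    extension = filename.rsplit(".", 1)[-1].lower()
    part_type = AUDIO_TYPES.get(extension, "application/octet-stream")
    # One form part named "file", followed by the closing boundary.
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        f"Content-Type: {part_type}\r\n\r\n".encode(),
        audio_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    url = f"{base_url.rstrip('/')}/platform/v1/voice/{subcompany_id}/asr"
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            # The boundary parameter must match the one used in the body.
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

Unlike cURL's -F option, which generates the boundary itself, a hand-built request has to declare the boundary in the Content-Type header so the server can split the body into parts.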

Response

The API returns the transcribed text extracted from the uploaded audio file.

Notes

- Both endpoints require a valid {subcompany_id}.
- Authentication uses a subcompany takeover token.
- TTS responses must be decoded on the client side.
- The API currently supports female voices in Kinyarwanda and Oshiwambo.

Last updated 29 days ago
