This document describes the APIs for voice processing, covering Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) services.
Each request to the developer API must include a bearer token in the Authorization header, in the form Authorization: Bearer {subcompany_takeover_secret}. The takeover secret can be found on the teamspace settings page.
Authorization: Bearer <token>
Text to Speech (TTS)
Convert text into spoken audio.
POST
${BASE_URL}/platform/v1/voice/{subcompany_id}/tts
Headers
Authorization: Bearer <token>
Content-Type: application/json
Request body
{
  "text": "string" (required),
  "lang": "rw | kj" (optional),
  "response_format": "mp3 | wav" (optional),
  "speed": 1.0 (optional),
  "gender": "female" (optional)
}
Request constraints
Maximum text length: 5,000 characters
Supported formats: mp3, wav
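The endpoint, headers, body fields, and constraints above can be checked on the client before a request is sent. The following is a minimal sketch in Python, not an official client: the function name build_tts_request and its parameters are hypothetical, and it assumes that the 5,000-character limit can be approximated by Python's len() on the text.

```python
import json

MAX_TEXT_LENGTH = 5000               # limit stated in the request constraints
SUPPORTED_FORMATS = {"mp3", "wav"}   # formats stated in the request constraints


def build_tts_request(base_url, subcompany_id, token, text,
                      lang=None, response_format=None, speed=None, gender=None):
    """Validate the inputs and return (url, headers, body_bytes) for a TTS call.

    Hypothetical helper: it only assembles the request, it does not send it.
    """
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_LENGTH:
        raise ValueError(f"text exceeds {MAX_TEXT_LENGTH} characters")
    if response_format is not None and response_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")

    # Only include optional fields that the caller actually set.
    payload = {"text": text}
    optional = {
        "lang": lang,
        "response_format": response_format,
        "speed": speed,
        "gender": gender,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})

    url = f"{base_url.rstrip('/')}/platform/v1/voice/{subcompany_id}/tts"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return url, headers, json.dumps(payload).encode("utf-8")
```

The returned triple can be passed to any HTTP client, for example urllib.request.Request(url, data=body, headers=headers, method="POST") from the standard library.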
cURL example
Response
The API returns a Base64-encoded audio payload representing the generated audio file.
The response is not a downloadable file and must be decoded on the client side.
Decoding the response to MP3 (Windows)
If you are using Windows, you can decode the Base64 response into an MP3 file using PowerShell.
Copy the Base64 response into a file (for example, input.txt)
Run the following command in PowerShell:
This will generate an output.mp3 file in the same directory.
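On other platforms, the same client-side decoding can be done with a short script. The sketch below uses Python's standard base64 module; the function names are hypothetical, and it assumes the response body is the bare Base64 string. If the service wraps the payload in JSON, extract the Base64 field first.

```python
import base64
from pathlib import Path


def decode_tts_audio(b64_payload):
    """Return the raw audio bytes encoded in a Base64 string."""
    # Drop whitespace and line breaks that copy-and-paste may introduce.
    cleaned = "".join(b64_payload.split())
    # validate=True raises binascii.Error on characters outside the Base64 alphabet.
    return base64.b64decode(cleaned, validate=True)


def save_tts_audio(b64_payload, out_path="output.mp3"):
    """Decode the payload and write it to out_path; returns the Path written."""
    path = Path(out_path)
    path.write_bytes(decode_tts_audio(b64_payload))
    return path
```

For example, save_tts_audio(Path("input.txt").read_text()) writes output.mp3 to the current directory, mirroring the input.txt and output.mp3 file names used in the Windows steps.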
Automatic Speech Recognition (ASR)
Convert spoken audio into text.
POST
Headers
Request body
Request constraints
Supported formats: mp3, wav
cURL example
Response
The API returns the transcribed text extracted from the uploaded audio file.
Both endpoints require a valid {subcompany_id}
Authentication uses a subcompany takeover token
TTS responses require client-side decoding
This API currently supports female voices in Kinyarwanda and Oshiwambo