How to Play Media and use Text-to-Speech
In this guide, we show you how to play media and use text-to-speech on calls. Before you begin, make sure you have followed our earlier guide on how to make an outbound call with Bandwidth.
You may want to play media for on-hold music or use text-to-speech to play descriptive messages to your customers.
Play Media
The BXML <PlayAudio> verb plays an audio file in the call and can play multiple audio files in succession. The audio file should already be hosted, and its URL should be included in the body of the <PlayAudio> tag.
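As a rough illustration of the verb described above, the following Python sketch builds a BXML document with two <PlayAudio> verbs using only the standard library. The <Response> root element, the helper name build_play_audio_bxml, and both audio URLs are assumptions made for illustration; check them against the BXML reference before relying on them.

```python
import xml.etree.ElementTree as ET


def build_play_audio_bxml(audio_urls):
    """Return a BXML string that plays each URL in the order given."""
    # Assumption: BXML verbs are wrapped in a <Response> root element.
    root = ET.Element("Response")
    for url in audio_urls:
        # The URL of the hosted audio file goes in the body of <PlayAudio>.
        ET.SubElement(root, "PlayAudio").text = url
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


bxml = build_play_audio_bxml([
    "https://example.com/audio/hold-music.mp3",  # hypothetical absolute URL
    "/files/greeting.wav",  # hypothetical relative endpoint in your application
])
print(bxml)
```

The first entry stands in for a file hosted elsewhere, and the second for a file served by the application itself.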
The second instance of PlayAudio (a relative endpoint) assumes there is an endpoint in the application that serves an audio file. To see an example, look here.
Once the call is created using our API, we check the specified answerUrl for a BXML response, which tells us to play the media file.
In this example, two audio files are played for the caller: one from an absolute URL hosted somewhere other than the application, and one from a relative endpoint. The relative endpoint assumes there is an endpoint in the application that serves an audio file.
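To make the answerUrl step more concrete, here is a minimal sketch of an application endpoint that returns BXML, written with Python's built-in http.server. The path /callbacks/answer, the port, the audio URL, the <Response> root element, and the choice to accept both POST and GET are all assumptions for illustration, not details taken from this guide.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

ANSWER_PATH = "/callbacks/answer"  # hypothetical path registered as the answerUrl

# Assumption: verbs are wrapped in a <Response> root; the audio URL is a placeholder.
ANSWER_BXML = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<Response>"
    "<PlayAudio>https://example.com/audio/hold-music.mp3</PlayAudio>"
    "</Response>"
)


def answer_response(path):
    """Return (status, content_type, body) for a request to `path`."""
    if path == ANSWER_PATH:
        return 200, "application/xml", ANSWER_BXML.encode("utf-8")
    return 404, "text/plain", b"not found"


class AnswerHandler(BaseHTTPRequestHandler):
    def _reply(self):
        # Read and discard any request body sent with the callback.
        length = int(self.headers.get("Content-Length") or 0)
        if length:
            self.rfile.read(length)
        status, content_type, body = answer_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    # Assumption: the callback may arrive as either POST or GET.
    do_POST = _reply
    do_GET = _reply


def make_server(port=8080):
    """Create (but do not start) an HTTP server for the answer callback."""
    return HTTPServer(("0.0.0.0", port), AnswerHandler)
```

Calling make_server().serve_forever() would start listening. In practice the server also needs to be reachable from the public internet so that the answerUrl can be requested.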
Tip: Only .wav and .mp3 files are supported. To ensure playback quality, Bandwidth recommends limiting audio files to less than 1 hour in length or 250 MB in size.
Text-To-Speech
The <SpeakSentence> verb is used for text-to-speech playback on a call. Attributes of the speaker, including gender and locale, may be changed. The default speaker, susan, is a female speaker with locale en_US. All supported speakers can be viewed here.
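The speaker attributes described above might be set as in the following Python sketch, which builds <SpeakSentence> BXML with the standard library. The attribute names gender and locale are inferred from the description, and the <Response> root element, the helper name, and the example values are assumptions; consult the list of supported speakers for valid combinations.

```python
import xml.etree.ElementTree as ET


def build_speak_sentence_bxml(text, **speaker_attrs):
    """Return BXML that speaks `text`, optionally overriding speaker attributes."""
    root = ET.Element("Response")  # assumed BXML root element
    # Keyword arguments (for example gender or locale) become XML attributes.
    # With no attributes given, the default speaker is used.
    verb = ET.SubElement(root, "SpeakSentence", speaker_attrs)
    verb.text = text
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Default speaker: no attributes set.
default_bxml = build_speak_sentence_bxml("Thank you for calling.")

# Hypothetical override: a male speaker with the en_US locale.
custom_bxml = build_speak_sentence_bxml(
    "Thank you for calling.", gender="male", locale="en_US"
)
print(custom_bxml)
```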
In this example, once the call is created using our API, we check the specified answerUrl for a BXML response, which tells us to play back the specified text.
Speech Synthesis Markup Language (SSML) is an XML-based markup language for controlling how synthesized speech is generated, which gives you additional functionality. Here, the name Sherlock Holmes is said with a British inflection, and the date is pronounced as "November twelfth, twenty-twenty-two" rather than read out as numbers. All supported SSML tags can be viewed here.
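A sketch of how the SSML described above could be embedded in <SpeakSentence> is shown below, again in Python with the standard library. It uses the standard SSML <lang> and <say-as> elements; whether these exact tags and attribute values are supported is an assumption to verify against the list of supported SSML tags, and the sentence wording and <Response> root element are illustrative.

```python
import xml.etree.ElementTree as ET

# Namespace that ElementTree serializes with the reserved "xml:" prefix.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_ssml_bxml():
    """Return BXML whose <SpeakSentence> body mixes plain text and SSML tags."""
    root = ET.Element("Response")  # assumed BXML root element
    speak = ET.SubElement(root, "SpeakSentence")
    speak.text = "Hello, you have reached "

    # <lang xml:lang="en-GB"> requests British English for the enclosed name.
    name = ET.SubElement(speak, "lang", {"{%s}lang" % XML_NS: "en-GB"})
    name.text = "Sherlock Holmes"
    name.tail = ". Today is "

    # <say-as interpret-as="date"> asks for the digits to be read as a date;
    # format="mdy" means month, then day, then year.
    date = ET.SubElement(speak, "say-as", {"interpret-as": "date", "format": "mdy"})
    date.text = "11/12/2022"
    date.tail = "."

    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


ssml_bxml = build_ssml_bxml()
print(ssml_bxml)
```

Building the markup with an XML library rather than string concatenation keeps the mixed text and tags well-formed and escapes any special characters in the spoken text.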
Where to next?
Now that you have made your first outbound call that plays media or uses text-to-speech, the following guides cover some of the other available actions: