The <Speak> element converts its text content to speech and plays it to the caller. It is useful for dynamic text that cannot be pre-recorded.

Element Attributes

Attribute Name | Description | Allowed Values | Default Value
language | Language to be used for output | "ja" (Japanese) or "en" (English). Not required if the voice attribute is passed | ja
voice | Voice to be used for output | The voice implies the language: nozomi (ja), seiji (ja), araki (ja), x-aitalk (ja), kal (en), awb (en), awb_time (en), kal16 (en), rms (en), slt (en) | none
loop | Number of times to repeat the output | Integer between 1 and 5 | 1
  • "x-aitalk": pseudo-voice to use x-aitalk-kana syntax
  • "en" voices are of low quality (use them only for tests)
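
As a rough illustration of the attribute rules above, the following Python sketch builds a <Speak> element and rejects values outside the documented ranges. The helper names build_speak and build_response are hypothetical and not part of Plivo. Dropping language whenever voice is given is an assumption of this sketch, based on the note that the voice implies the language.

```python
import xml.etree.ElementTree as ET

# Voice-to-language mapping copied from the attribute table above.
VOICE_LANGUAGE = {
    "nozomi": "ja", "seiji": "ja", "araki": "ja", "x-aitalk": "ja",
    "kal": "en", "awb": "en", "awb_time": "en", "kal16": "en",
    "rms": "en", "slt": "en",
}


def build_speak(text, voice=None, language=None, loop=1):
    """Hypothetical helper: return a <Speak> element after checking its attributes."""
    if voice is not None and voice not in VOICE_LANGUAGE:
        raise ValueError(f"unknown voice: {voice!r}")
    if language is not None and language not in ("ja", "en"):
        raise ValueError(f"unknown language: {language!r}")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(loop, bool) or not isinstance(loop, int) or not 1 <= loop <= 5:
        raise ValueError("loop must be an integer between 1 and 5")

    speak = ET.Element("Speak")
    if voice is not None:
        # Assumption: the voice implies the language, so language is omitted.
        speak.set("voice", voice)
    elif language is not None:
        speak.set("language", language)
    if loop != 1:
        # 1 is the documented default, so it is left implicit.
        speak.set("loop", str(loop))
    speak.text = text
    return speak


def build_response(*speaks):
    """Hypothetical helper: wrap <Speak> elements in a <Response> document."""
    response = ET.Element("Response")
    response.extend(speaks)
    return ET.tostring(response, encoding="unicode")
```

Calling build_response(build_speak("Hey", voice="rms", loop=3)) yields a <Response> containing a single <Speak> element with voice="rms" and loop="3". A loop value such as 6 raises ValueError before any XML is produced.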

Examples

Example 1: Hi this is Plivo

When a call is directed to the following XML document, the caller will hear "Hi this is Basix Plivo" spoken once.

<Response>
 <Speak voice="rms">Hi this is Basix Plivo.</Speak>
</Response>

Example 2: Hey, Hey, Hey

This XML document tells Plivo to say "Hey" three times in a row.

<Response>
 <Speak voice="rms" loop="3">Hey</Speak>
</Response>

Example 3: Japanese

This XML document tells Plivo to say "おはようございます" using the voice "nozomi" and then say "白い花が咲いている。赤い花も咲いている。どちらの花がすきですか。" using X-AITalk intonation syntax.

<Response>
    <Speak voice="nozomi">おはようございます</Speak>
    <Speak voice="x-aitalk"><![CDATA[ <S>シ^ロ!イ|ハ^ナ!ガ|_サ^イテイル!_2$ア^カ!イ|ハ^!ナモ|_サ^イテイル<F><S>ド^チ!ラノ^ハ^ナ|ガ^_!ス|キデスカ<R>]]></Speak>
</Response>

Note: Data for voice="x-aitalk" must be enclosed in a CDATA section because it uses tags such as <S> that would otherwise conflict with the surrounding XML document.

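Specifying x-aitalk lets you use the intermediate language (AI Kana) of AI, Inc. (株式会社エーアイ). AI Kana is a proprietary notation in which "prosodic symbols" specify accent, pause position, pause length, and so on, and "control tags" specify the voice dictionary, volume, speaking rate, pauses, and so on, which allows very fine-grained expression.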
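
The CDATA requirement for voice="x-aitalk" can also be met when the XML is generated by a program. The Python sketch below uses the standard library's xml.dom.minidom, whose createCDATASection call emits the text unescaped inside <![CDATA[ ... ]]>. The helper name speak_aitalk is hypothetical and not part of Plivo.

```python
from xml.dom.minidom import Document


def speak_aitalk(kana):
    """Hypothetical helper: wrap AI Kana markup in a CDATA section inside <Speak>."""
    # A CDATA section ends at the first "]]>", so the payload must not contain it.
    if "]]>" in kana:
        raise ValueError("AI Kana text must not contain ']]>'")

    doc = Document()
    response = doc.createElement("Response")
    doc.appendChild(response)

    speak = doc.createElement("Speak")
    speak.setAttribute("voice", "x-aitalk")
    # The CDATA node keeps tags such as <S> and <R> literal instead of escaping them.
    speak.appendChild(doc.createCDATASection(kana))
    response.appendChild(speak)

    # Serialize only the root element, without an XML declaration.
    return response.toxml()
```

Passing the AI Kana string from Example 3 returns a <Response> document whose <Speak voice="x-aitalk"> element carries that string verbatim inside a CDATA section.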