Inbound XML

Last updated: Mar 4th, 2024

The API interprets the InboundXML document based on its markup. The markup, composed of basic XML elements, contains the instructions that determine how the RooR API should respond. All InboundXML elements are camelcase and case-insensitive and are organized into Verbs or Nouns:

  - Verbs: Elements that define the behavior of the call or text.
  - Nouns: Elements that define the specifics of the behavior. These are always nested inside of verbs and they can be XML elements or plain text.

For simplicity, the Verbs are referred to as Voice Elements, and the Nouns are the attribute parameters for the Voice Elements. All supported InboundXML elements are documented below. For each Voice Element definition, there is:

  1. A description/purpose of what the element does
  2. A table of voice element attributes (nouns)
  3. Information on how the voice element can be nested, if applicable
  4. A code snippet
  5. A list of usage tips, if applicable

Note: The following special characters are not supported in URL query string options and will be stripped from the URL during parsing: ' " # { } < > ?

Methods (calls)

<Say>

The <Say> element reads text to the caller using a text-to-speech engine. <Say> is good to use with dynamic data, while <Play> may be a better choice for static information or prompts. The text to be read is nested within the <Say> element.

Nesting
In addition to the default <Response> element, the <Say> element can also be nested within the <Gather> verb. The <Say> element cannot nest any other elements within itself. It must only nest the text which will be read to the caller.
General Example
The InboundXML below will first say "Hello" in a woman's voice three times, then "Hello, my name is Jane" in a woman's voice one time, then "Now I will not stop talking" repeated until the caller hangs up.


Parameters
Field Required Description
voice optional The language, type, and gender of the voice that will read the text to the caller.
Valid values: See the voice example below
Default value:
loop optional The number of times the spoken text should be repeated.
Valid values: integer greater than or equal to 1
Default value: 1

Usage Example
Say Example:
<Response>
    <Say loop='3' voice='woman'>Hello</Say>
    <Say voice='woman'>Hello, my name is Jane.</Say>
    <Say loop='1'>Now I will not stop talking.</Say>
</Response>

<Play>

The <Play> verb plays an audio file back to the caller. RooR retrieves the file from a URL that you provide. The media URL is provided between <Play>'s opening and closing tags, as shown in the example below.


Parameters
Field Required Description
loop optional The number of times the audio file should be repeated. 1 indicates an infinite loop.
Valid values: integer greater than or equal to 1
Default value: 1

Usage Example
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Play>https://docs.gynetix.com/sounds/cowbell.mp3</Play>
</Response>
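
As a rough sketch of the loop attribute from the parameter table, the following repeats a prompt three times. The media URL is a hypothetical placeholder and is not part of the RooR documentation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- Hypothetical media URL; replace it with the location of your own audio file -->
    <Play loop="3">https://example.com/sounds/welcome.mp3</Play>
</Response>
```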

<Gather>

The <Gather> element allows callers to input digits to the call using their keypads. The digits are then sent via POST or GET to a URL for processing.

There are many ways to get creative with <Gather>, but its most common use case is in creating IVR menus. This is done by nesting prompts for input from the caller using the <Say> or <Play> elements. Only a single <Say> or <Play> element can be nested in a <Gather> tag.

By default an unlimited number of digits can be gathered. The <Gather> will time out after 5 seconds pass without any new digits or once the '#' key is pressed, and the gathered digits will then be submitted to the current InboundXML document. This default behavior of <Gather> can be altered using its provided element attributes.
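
To illustrate how the attributes change these defaults, here is a minimal sketch that waits 10 seconds instead of 5, ends input on the star key instead of '#', and sends the digits to a separate URL with GET. The action URL is a hypothetical placeholder.

```xml
<Response>
    <!-- Hypothetical action URL; the gathered digits are sent there with a GET request -->
    <Gather action="https://example.com/handle-digits" method="GET" timeout="10" finishOnKey="*">
        <Say>Please enter your account number, then press star.</Say>
    </Gather>
</Response>
```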



Nesting

The <Gather> element cannot be nested within any other verbs besides the default <Response> element.

When used as prompts, the <Say> and <Play> elements MUST be nested within the <Gather> element.

<Response>
    <Gather action='https://....' method='GET' numDigits='4' finishOnKey='#'>
        <Say>Please enter your 4 digit pin</Say>
    </Gather>
</Response>


Parameters
Field Required Description
action optional The URL to which the flow of the call and the gathered digits will be forwarded.
Valid values: URL
Default value: URL of the current InboundXML document
method optional Method used to request the action URL.
Valid values: POST, GET
Default value: POST
timeout optional The number of seconds <Gather> should wait for digits to be entered before requesting the action URL. The timeout resets with each new digit input.
Valid values: integer greater than or equal to 0
Default value: 5
finishOnKey optional The key a caller can press to end the <Gather>.
Valid values: digits 0 to 9, #, or *
Default value: #
numDigits optional The maximum number of digits to gather.
Valid values: integer greater than or equal to 1
Default value: unlimited
input optional The inputs (DTMF or speech) that RooR should accept.
Note: The default input for <Gather> is dtmf. You can set input to dtmf, speech, or dtmf speech. Speech recognition is not yet optimized for alphanumeric inputs (e.g. ABC123); this could lead to inaccurate results, so we do not recommend it. If you set dtmf speech for your input, the first detected input (speech or dtmf) will take precedence. If speech is detected first, finishOnKey will be ignored. The following example shows a <Gather> that specifies speech input from the user. When this InboundXML executes, the caller will hear the prompt. RooR will then collect speech input. Once the caller stops speaking for five seconds, RooR posts their transcribed speech to your action URL.
Valid values: dtmf, speech, dtmf speech
Default value: dtmf

Usage Example
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Gather input="speech" action="https://...">
        <Say>Welcome to Twilio, please tell us why you're calling</Say>
    </Gather>
</Response>
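
A similar sketch can accept either input type by setting input to dtmf speech, in which case whichever input is detected first takes precedence. The action URL and the prompt wording are hypothetical placeholders.

```xml
<Response>
    <!-- input="dtmf speech" accepts whichever input type is detected first -->
    <Gather input="dtmf speech" action="https://example.com/handle-input" numDigits="1">
        <Say>Press 1 or say sales to reach our sales team.</Say>
    </Gather>
</Response>
```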

<Record>

The <Record> element is used to record audio during a call. It can occur anywhere within an InboundXML document but will only begin recording once it has been reached. This means it would have to be the first element after <Response> for the entire call to be recorded.

When the recording is complete, a URL of the recorded audio is created and submitted as a GET or POST to the action URL. Similar to the <Gather> element, timeout sets how much silence to allow before the recording ends, maxLength sets how long the recording may be, and finishOnKey sets which key will end the recording.

By default, the action and method specify that <Record> should make a POST to the URL of the current InboundXML document.


Nesting

The <Record> element cannot be nested within any other verbs besides the default <Response> element. The <Record> element cannot nest any other elements within itself.

<Response>
    <Say>Please state your name.</Say>
    <Record background="false" action="https://..." timeout="100" method="POST" finishOnKey="#"></Record>
</Response>


Parameters
Field Required Description
action optional URL where some parameters specific to <Record> will be sent for further processing.
method optional Method used to request the action URL.
Valid values: POST, GET
Default value: POST
timeout optional The number of seconds <Record> should wait during silence before ending.
Valid values: integer greater than or equal to 1
Default value: 5
finishOnKey optional The key a caller can press to end the recording.
Valid values: digits 0 to 9, #, or *
Default value: #
maxLength optional The maximum length in seconds a recording should be.
Valid values: integer greater than or equal to 1
Default value: 3600
playBeep optional Boolean value specifying if a beep should be played when the recording begins.
Valid values: true, false
Default value: false
background optional Whether to begin recording the call while continuing the execution of any other present InboundXML in the background (true), or to block the execution of subsequent InboundXML until the <Record> element finishes via finishOnKey or timeout (false). Note that the timeout, finishOnKey, and playBeep attributes have no effect when background is set to true.
Valid values: true, false
Default value: true
trimSilence optional Trims all silence from the beginning of the recording. Any other value will default to "false".
Valid values: true, false
Default value: true
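
The attributes above could be combined into a simple voicemail prompt, sketched below. background is set to false so that playBeep, timeout, and finishOnKey take effect. The action URL is a hypothetical placeholder.

```xml
<Response>
    <Say>Please leave a message after the beep, then press the pound key.</Say>
    <!-- background="false" so that timeout, finishOnKey, and playBeep take effect -->
    <!-- Hypothetical action URL that receives the recording details -->
    <Record background="false" playBeep="true" maxLength="120" timeout="5" finishOnKey="#" action="https://example.com/voicemail" method="POST"></Record>
</Response>
```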

<Dial>

The <Dial> element starts an outgoing dial from the current call. Once the dial is complete, the next element in the InboundXML document will be processed unless the action attribute is set. In that case, the result of the dial is submitted as a GET or POST (depending on the method attribute) to the action URL, and the call continues using the InboundXML of that URL.

By default the outgoing call will time out if it is not answered after 60 seconds. However, the timeout attribute can be used to set a custom time. The length of the call is limited by the timeLimit attribute, which is 4 hours (14400 seconds) by default.

The callerId attribute can be set to any number and will default to the caller ID of the original caller. The number to be dialed should be nested within the <Dial> element.

In its most basic form the <Dial> tag will look like:

<Dial>+15557774545</Dial>

Nesting

The <Dial> element can't be nested within any other verbs besides the default <Response> element.

The element can be nested within the element.

<Dial> Example

<Response>
    <Dial action="https://..." callerId="949XXXYYYY">714XXXYYYY</Dial>
</Response>



Parameters
Field Required Description
action optional URL where some parameters specific to <Dial> will be sent for further processing. The calling party can be redirected here upon the hangup of the B leg caller.
method optional Method used to request the action URL.
Valid values: POST, GET
Default value: POST
timeout optional Number of seconds the call stays on the line while waiting for an answer.
Default value: 60
timeLimit optional The maximum duration in seconds of a call made through <Dial> before it is ended.
Valid values: integer greater than or equal to 1
Default value: 14400
callerId optional Number to display as calling. Defaults to the caller ID of the original caller.
Valid values: Only valid US numbers
hideCallerId optional Boolean value specifying if the caller ID should be hidden or not.
Note: If value is set to TRUE, then caller ID will be hidden.
Valid values: TRUE, FALSE
Default value: FALSE
dialMusic optional Audio URL to be executed in place of the call ring-tone.
Valid values: mp3, wav
callbackURL optional URL requested once the dialed call connects. Note that this URL only receives parameters containing information about the call. The call does not execute XML given as a callback URL.
callbackMethod optional Method used to request the callback URL.
Valid values: POST, GET
Default value: POST
confirmSound optional Boolean value specifying if a sound should play when the dial is successful.
Valid values: TRUE, FALSE
Default value: FALSE
heartbeatUrl optional The URL that the RooR API requests every 60 seconds during the call to notify of elapsed time as well as to pass other general information.
heartbeatMethod optional Method used to request heartbeatUrl.
Valid values: POST, GET
Default value: POST
groupConfirmKey optional The single key that must be pressed to accept the call.
Note: Use this attribute when you are dialing multiple "to" numbers.
Valid values: digits 0 to 9, #, *
Default value: 1
groupConfirmFile optional URL of an audio file that can be played after the call is answered to prompt acceptance of the call.
Note: Use this attribute when you are dialing multiple "to" numbers.
Valid values: Any audio file URL with mp3 or wav format.
Default value: https://...
onAnswerPlay optional Before the dial action is performed, this plays an audio file to leg B of the call (the number being called in the <Dial> tag). The <Dial> tag can include either onAnswerPlay or onAnswerSay but not both.
Valid values: audio file URL. 8bit mono 8000Hz mu-law .mp3 or .wav format.
onAnswerSay optional Before the dial action is performed, this reads text to leg B of the call (the number being called in the <Dial> tag) using a text-to-speech engine. The <Dial> tag can include either onAnswerPlay or onAnswerSay but not both.
Valid values: string
CallerName optional A String passed in the SIP header that will display on Softphones.
Valid values: string (max 25 chars)
PlayDTMF optional A set of DTMF digits that will play when leg B of the call answers.
Valid values: digits 0 to 9, #, *
PlayDTMFDelay optional The amount of time in seconds the system will wait before executing PlayDTMF.
Valid values: decimal (in half-second intervals)

Usage Example
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Dial>+15555555555</Dial>
    <Say>Goodbye</Say>
</Response>
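
For a fuller picture, the sketch below combines several of the optional attributes: a shorter ring timeout, a 10-minute call limit, a custom caller ID, and a spoken announcement to leg B. The action URL is a hypothetical placeholder, and the phone numbers are the sample numbers used elsewhere on this page.

```xml
<Response>
    <!-- Hypothetical action URL and placeholder phone numbers -->
    <Dial action="https://example.com/dial-result" method="POST" timeout="30" timeLimit="600" callerId="+15555555555" onAnswerSay="You have an incoming call.">+15557774545</Dial>
</Response>
```

Because action is set here, the result of the dial is submitted to that URL and the call continues with the InboundXML it returns.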

<Connect> Bi-directional Streams

Bi-directional Media Streams:
If you want to send media back to the call, the Stream *must* be bi-directional. To do this, initialize the stream using the <Connect> verb. The <Stream> noun's url attribute must be set to a secure WebSocket server (wss).

Nestable:

You can nest the <Parameter> noun inside the stream to pass custom information to the web socket connection.


Parameters
Field Required Description
url required The URL of a secure WebSocket server (wss).
Valid values: wss
Default value:
streamType optional The audio encoding of the stream: mu-law or signed linear 16-bit, both at 8 kHz.
Valid values: ulaw or slin8
Default value: ulaw
noiseFilter optional Setting this to on tells RooR to filter out background noise.
Valid values: off or on
Default value: off
name optional A name for this stream.
Valid values:
Default value:
track optional The track that will be sent to your WebSocket. Bi-directional streams only support inbound_track.
Valid values: inbound_track
Default value: inbound_track

Usage Example
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Connect>
        <Stream url="wss://...">
            <Parameter name="FirstName" value="Jane" />
            <Parameter name="LastName" value="Doe" />
        </Stream>
    </Connect>
</Response>
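
The optional attributes from the parameter table might be combined as in the sketch below. It assumes they are all set on the <Stream> noun alongside url, which this page does not state explicitly. The WebSocket endpoint, the stream name, and the custom parameter are hypothetical placeholders.

```xml
<Response>
    <Connect>
        <!-- Hypothetical WebSocket endpoint and stream name -->
        <!-- Assumption: streamType, noiseFilter, name, and track are set on Stream, like url -->
        <Stream url="wss://example.com/media" streamType="slin8" noiseFilter="on" name="support-line" track="inbound_track">
            <Parameter name="CallerTier" value="gold" />
        </Stream>
    </Connect>
</Response>
```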
