A tutorial on Call Control XML and voice browser call control (2023)

In the last article, about SCXML, we talked about state machine notations and
noted that many concepts were taken from the Call Control XML (CCXML) language.
The first draft of the CCXML specification appeared in early 2002 and, at the
time of writing, remains in working draft status. However, the lack of a final
recommendation is not an obstacle for the telecom industry, and several CCXML
implementations already exist in products such as the OpenCall Media Platform
from Hewlett Packard or the open-source Asterisk platform.

Call Control XML is designed to provide telephony call control support for
dialog systems, such as VoiceXML. It can also be used as a third-party call
control manager in any telephony system. The CCXML 1.0 specification defines
both a state machine and event handling syntax and a standardized set of call
control elements.

In this document, we will look at some of VoiceXML’s
capabilities and limitations, as well as details on how VoiceXML and CCXML can
be integrated within an application. Telephone applications need to receive and
process large numbers of events in real-time. These events arrive from outside
the program itself—either the underlying telephony platform, or from other
sources of events. A CCXML program includes event handlers which are executed
when certain events arrive. CCXML also provides a powerful and flexible method
of creating multi-party calls.

Note: The downloadable version of
this article contains all of the code examples in easy to use text files.

Main concepts and terms

Originally there was an intention to add new tags to VoiceXML to support the
new features. However, specification designers repeatedly encountered conflicts
between the design goals for VoiceXML, and the requirements for CCXML. CCXML
and VoiceXML implementations are not mutually dependent. A CCXML implementation
may or may not support voice dialogs, or may support dialog languages other
than VoiceXML.

The following requirements were addressed by CCXML:

  • Ability to give each active “call
    leg” (a real-world phone connection) its own dedicated VoiceXML
    interpreter. Currently, the second leg of a transferred call lacks a VoiceXML
    interpreter of its own, limiting the scope of possible applications.
  • Support for multi-party conferencing, with
    advanced conference and audio control. A conferencing application involves
    multiple participants, and is dependent upon call control to establish
    relationships between those participants.
  • Sophisticated multiple-call handling and
    control, including the ability to place outgoing calls.
  • Handling for a richer class of
    asynchronous events. Advanced telephony operations involve substantial amounts
    of signals, status events, and message-passing. VoiceXML 2.0 does not integrate
    asynchronous “external” events into its event-processing model.
  • The ability to receive events and messages
    from external computational entities. Interacting with an outside call queue,
    or placing calls on behalf of a document server, means that VoiceXML must have
    additional external interfaces.

A CCXML application is a collection of CCXML documents that together form a
complete program. A single instance of a CCXML application is called a CCXML
Session. One session can span multiple documents and phone calls. A CCXML
Connection can be a “call leg” or a system resource that facilitates
interaction with a Voice Dialog.

Media streams between Connections, or between Connections and Conference objects,
need to be tracked by the CCXML interpreter and will take real system
resources. A Voice Dialog, when active, is associated with a specific
Connection by which the Voice Dialog may interact with one-way or two-way media
streams from other Connections or a Conference Object. A Conference Object
models a resource for mixing media streams.

CCXML programs manipulate these entities through elements defined in the CCXML
language. They can also send and/or receive asynchronous events associated with
these entities. CCXML programs directly manipulate Connection Objects and
Conference Objects with various elements in the language which will be
described below. CCXML may also receive events from Connection and Conference Objects,
in the case of line signaling, line-status informational messages, or error and
failure scenarios. CCXML programs can start and kill Voice Dialogs using
language elements as well. Any other interaction takes place through the event
mechanism. CCXML Sessions can both send and receive events between one another.

An event
in CCXML is an action or occurrence to which an application can respond.
Examples of events are incoming phone calls, dialog actions or user defined
events. Events in CCXML are modeled as ECMAScript
objects and can contain complex values.


Call Control structure in real

Now let’s have a look at some real examples and try to understand what is done there. The first example (Example1.txt) is a simple “hello world” CCXML application that is started due to an incoming call where the application simply assigns a value to a variable, prints a message to the platform log and exits.

Example 1

<?xml version="1.0" encoding="us-ascii"?>
<scxml version="1.0" xmlns="http://www.w3.org/2005/07/scxml" initialstate="S1">
  <state id="S1">
    <transition event="Event1" target="S2"/>
  </state>

  <state id="S2">
    <transition event="Event2" cond="X &gt; 0" target="S1"/>
    <transition event="Event2" cond="X &lt; 0" target="S3"/>
  </state>

  <state id="S3"/>
</scxml>


The <ccxml> element is the parent element of a CCXML document and encloses the
entire CCXML script. When a <ccxml> is executed, its child elements are
collected logically together at the beginning of the document and executed in
document order before the target <eventprocessor>. This is called document
initialization. The <eventprocessor> acts as a container for <transition>
elements. A valid CCXML document MUST have only a single <eventprocessor>
element, which in turn can contain only <transition> elements.

The content of a <transition> specifies the actions to be taken when it is
selected. Its “event” attribute is a pattern that indicates a matching event
type. Event types are dot-separated strings of arbitrary length.

Variables are declared using the <var> element and are initialized with the
result of evaluating the optional “expr” attribute as an ECMAScript
expression. The values of variables may subsequently be changed with the
<assign> element. <log> allows an application to generate a logging or debug
message which a developer can use to help in application development or
post-execution analysis of application performance. The manner in which the
message is displayed or logged is platform-dependent. <exit> ends execution of
the CCXML session. All pending events are discarded, and there is no way to
restart CCXML execution.
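To make the idea of an event pattern concrete, here is a minimal Python sketch of how a dot-separated event name might be tested against a transition's “event” attribute. It is an illustration only, not CCXML platform code: the function name event_matches is hypothetical, and it assumes that “*” is the only wildcard and stands for any run of characters.

```python
import re


def event_matches(pattern: str, event_name: str) -> bool:
    """Return True if event_name matches a transition's event pattern.

    Assumption: '*' stands for any run of characters (including none);
    every other character has to match literally.
    """
    # Escape the literal pieces so that '.' is not treated as a regex wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, event_name) is not None


print(event_matches("connection.alerting", "connection.alerting"))  # True
print(event_matches("error.*", "error.dialog.notstarted"))          # True
print(event_matches("error.*", "dialog.exit"))                      # False
print(event_matches("*", "ccxml.kill"))                             # True
```

A real interpreter may apply additional matching rules; the point is only that one pattern such as error.* can cover a whole family of event types.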

Event Handling

Event handling is one of the most powerful features of CCXML. CCXML events can
be delivered at any time and from a variety of sources. This flexible
event-handling mechanism is essential for many telephony applications.
Each running CCXML interpreter has a queue, into which
it places incoming events, and sorts them by arrival time. A CCXML programmer
can only gain access to these queued events by using the <eventprocessor> element with associated <transition> elements. The CCXML session event queue generally
operates in a First In, First Out (FIFO) manner with
events to be processed being removed from the head and new events being placed
at the tail. There are two exceptions to this behavior: events where a time
delay has been specified, and certain special events that are always placed at
the head of the queue.

An event can be delivered to a CCXML session using a <send> element, in which
case an optional delay may be specified. When a delay is specified, the event
is delivered to the target CCXML session but is not placed onto the event queue
until the delay time has elapsed. Once the delay has elapsed, the event is
placed at the tail of the queue.
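The queueing rules above (FIFO order, delayed events held back until their delay elapses, special events placed at the head) can be sketched as a small Python model. This is a toy simulation under stated assumptions, not a CCXML API: the class SessionEventQueue, its methods, the urgent flag, the explicit now clock, and the event names timer.expired and platform.urgent are all hypothetical.

```python
from collections import deque


class SessionEventQueue:
    """Toy model of a CCXML session event queue (not a real CCXML API)."""

    def __init__(self):
        self._queue = deque()  # events ready for processing, head on the left
        self._delayed = []     # (due_time, seq, event) tuples still waiting
        self._seq = 0          # tie-breaker that preserves send order

    def send(self, event, now, delay=0.0, urgent=False):
        if urgent:
            # Special events go straight to the head of the queue.
            self._queue.appendleft(event)
        elif delay > 0:
            # Delivered to the session, but held back until the delay elapses.
            self._delayed.append((now + delay, self._seq, event))
            self._seq += 1
        else:
            # Ordinary events join the tail (FIFO).
            self._queue.append(event)

    def next_event(self, now):
        # Move delayed events whose time has come to the tail, earliest first.
        due = sorted(item for item in self._delayed if item[0] <= now)
        self._delayed = [item for item in self._delayed if item[0] > now]
        for _, _, event in due:
            self._queue.append(event)
        return self._queue.popleft() if self._queue else None


q = SessionEventQueue()
q.send("timer.expired", now=0.0, delay=5.0)        # hypothetical user event
q.send("connection.alerting", now=0.0)
q.send("platform.urgent", now=0.0, urgent=True)    # hypothetical special event
order = [q.next_event(1.0), q.next_event(1.0), q.next_event(1.0), q.next_event(6.0)]
print(order)  # ['platform.urgent', 'connection.alerting', None, 'timer.expired']
```

The delayed event is invisible to the consumer until the simulated clock passes its due time, at which point it joins the tail like any newly arrived event.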

An <eventprocessor> is interpreted by an implicit Event
Handler Interpretation Algorithm (EHIA). The EHIA’s
main loop removes the first event from the CCXML session’s event queue, and
then selects from the set of <transition>s contained in the <eventprocessor>. A <transition> always indicates a set of accepted
event types, and MAY indicate a further ECMAScript
conditional expression to be evaluated. The <transition> that accepts the type for the just-removed event, has
a satisfied conditional expression, and appears first in the <eventprocessor> in document order, is the selected <transition>.
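The selection rule just described can be sketched in a few lines of Python: walk the transitions in document order and pick the first one whose event pattern accepts the event and whose condition holds. This is a simplified illustration, not the EHIA as specified. The Transition type, the label field, and the functions matches and select_transition are hypothetical names, conditions are modelled as Python callables rather than ECMAScript expressions, and “*” is assumed to be the only wildcard.

```python
import re
from typing import Callable, List, NamedTuple, Optional


class Transition(NamedTuple):
    label: str                                      # only used to identify the result
    event: str                                      # pattern; '*' = any characters
    cond: Callable[[dict], bool] = lambda ev: True  # optional extra condition


def matches(pattern: str, name: str) -> bool:
    # Assumption: '*' is the only wildcard; everything else is literal.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, name) is not None


def select_transition(transitions: List[Transition], event: dict) -> Optional[Transition]:
    # The list order stands in for document order: the first match wins.
    for t in transitions:
        if matches(t.event, event["name"]) and t.cond(event):
            return t
    return None  # no transition selected; the event is "missed"


transitions = [
    Transition("specific", "error.dialog.*"),
    Transition("guarded", "dialog.exit", lambda ev: ev.get("input") is not None),
    Transition("catch-all", "*"),
]

print(select_transition(transitions, {"name": "error.dialog.notstarted"}).label)  # specific
print(select_transition(transitions, {"name": "dialog.exit", "input": "42"}).label)  # guarded
print(select_transition(transitions, {"name": "dialog.exit"}).label)  # catch-all
```

Because the first match wins, a catch-all pattern only makes sense as the last transition; placed earlier, it would shadow everything after it.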

If an event is not selected by any <transition> within the <eventprocessor>,
the CCXML platform SHOULD log the event using the “missed” label. The CCXML
platform can configure the “missed” label for any desired disposition. This
behavior should be, although it need not be, equivalent to the transition
presented in Default-transition.txt.


Default Transition

<transition event="*" name="ev">
  <log label="'missed'" expr="ev.toString()"/>
</transition>

Session Life-Cycle

The life-cycle of a CCXML session deserves a closer look. A CCXML session can
be started for the following reasons:

  • A new incoming phone call coming into the platform.
  • A CCXML application executing a <createccxml>.
  • An external session launch request coming into the platform.

When a session is started due to an incoming call, it
has ownership of the event endpoint associated with the new Connection. The new
CCXML session will be responsible for processing the Connection state events
and performing the Connection actions.

Every CCXML session has a set of standard ECMAScript variables, called session
variables, that are available to the program during execution. The session
variables are defined by the CCXML implementation when the CCXML session is
created; they are read-only and cannot be modified by the CCXML program. They
hold session information such as the session identifier, the reason the session
was started, the list of all Connection objects, and so on. A CCXML application
can determine the reason its session was started by evaluating the contents of
the session.startupmode session variable.

A CCXML session begins with the execution of a CCXML
document. The flow of the execution can be changed with the help of <if>, <elseif>, <else>, <fetch>, and <goto>. Most of a CCXML session’s execution will take place within an <eventprocessor>, which processes a stream of incoming events. A CCXML
session may launch a new CCXML session using <createccxml>. The new CCXML session executes in an independent
context and variable space from the original CCXML session, completely
independent of the lifetime of the original session. Sessions can communicate by sending messages via <send>.

A CCXML session can end in one of the following ways:

  • The CCXML application executes an <exit>.
  • An unhandled “error.*” event.
  • An unhandled “ccxml.kill” event.
  • A “ccxml.kill.unconditional” event.
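The four ways a session can end can be condensed into one predicate. The Python sketch below restates the list above and nothing more; the function name session_should_end and its parameters are hypothetical, and “handled” simply means that some transition was selected for the event.

```python
def session_should_end(event_name: str, handled: bool, exit_executed: bool = False) -> bool:
    """Decide whether a session ends, following the four cases listed above."""
    if exit_executed:
        # The application executed <exit>.
        return True
    if event_name == "ccxml.kill.unconditional":
        # Ends the session whether or not a transition handles it.
        return True
    if not handled and (event_name == "ccxml.kill" or event_name.startswith("error.")):
        # Unhandled kill request or unhandled error event.
        return True
    return False


print(session_should_end("ccxml.kill", handled=True))                # False
print(session_should_end("ccxml.kill", handled=False))               # True
print(session_should_end("error.dialog.notstarted", handled=False))  # True
print(session_should_end("ccxml.kill.unconditional", handled=True))  # True
```

The asymmetry is the useful part: a handled ccxml.kill gives the application a chance to clean up, whereas ccxml.kill.unconditional leaves it no choice.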

When a CCXML session ends, all active connections, conferences and dialogs that
are owned by that session are automatically terminated by the platform.

A connection is typically shorter than a session. A
session does not end when a connection terminates. A CCXML session does not
necessarily need to have any connections associated with it. After starting, a
session may acquire connections as a result of <createcall> or <move> requests. Figure A and Figure B
illustrate the session life-cycle of several different scenarios.

Figure A

Session Life Cycle

Figure B

Session Life Cycle, a different

In our “hello world” example (Example1.txt), when the session ends, any
resources, including connections owned by that session, are terminated, as
illustrated in Figure C. These are not the only life-cycle scenarios; sessions
can also have multiple sequential connections, or even multiple concurrent
connections. A connection can be moved from one CCXML session to another.


Figure C


If at any time the platform wishes to terminate a CCXML session, it must raise
a ccxml.kill event to inform the CCXML application. The normal response to this
event is for the CCXML application to clean up and terminate any active
connections, conferences or dialogs, and then execute an <exit> element.

Let’s look at a more complex example of running a
simple VoiceXML dialog from CCXML. The application answers an incoming phone
call and then connects it to a VoiceXML dialog that returns a value that is
then logged to the platform (Example2.txt
and Example2.vxml.txt). This is the
first point of connection between CCXML and VoiceXML.

Example 2

<?xml version="1.0" encoding="UTF-8"?>
<ccxml version="1.0" xmlns="http://www.w3.org/2002/09/ccxml">
  <!-- Lets declare our state var -->
  <var name="state0" expr="'init'"/>

  <eventprocessor statevariable="state0">
    <!-- Process the incoming call -->
    <transition state="init" event="connection.alerting">
      <accept/>
    </transition>

    <!-- Call has been answered -->
    <transition state="init" event="connection.connected" name="evt">
      <log expr="'Houston, we have liftoff.'"/>
      <!-- The dialog URI below is an assumption -->
      <dialogstart src="'Example2.vxml'"/>
      <assign name="state0" expr="'dialogActive'"/>
    </transition>

    <!-- Process the incoming call -->
    <transition state="dialogActive" event="dialog.exit" name="evt">
      <log expr="'Houston, the dialog returned [' + evt.values.input + ']'"/>
      <exit/>
    </transition>

    <!-- Caller hung up. Lets just go on and end the session -->
    <transition event="connection.disconnected" name="evt">
      <exit/>
    </transition>

    <!-- Something went wrong. Lets go on and log some info and end the call -->
    <transition event="error.*" name="evt">
      <log expr="'Houston, we have a problem: (' + evt.reason + ')'"/>
      <exit/>
    </transition>
  </eventprocessor>
</ccxml>

Example 2 – VXML

<?xml version="1.0"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <form id="Form">
    <field name="input" type="digits">
      <prompt>Please say some numbers ...</prompt>
      <filled>
        <exit namelist="input"/>
      </filled>
    </field>
  </form>
</vxml>

When a CCXML document receives a connection.alerting event within
an <eventprocessor>, like in Example 2, the execution
of an <accept> within the <transition> block will cause the underlying platform to signal
the telephony system to connect the call. The CCXML document MAY then
initiate interactive dialog sessions with the incoming caller, or perform other
telephony operations (e.g., place outgoing calls, join calls, etc).
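The flow of Example 2 can also be traced outside a CCXML platform. The Python sketch below is a hand-written simulation of the state-dependent transitions described for Example 2, not generated from the CCXML and not a platform API. The function run_session and the log strings are hypothetical, and the telephony actions (accepting the call, starting the dialog) are reduced to log entries.

```python
def run_session(events):
    """Replay (name, data) events through the transitions of Example 2."""
    state = "init"  # plays the role of the state variable state0
    log = []
    for name, data in events:
        if state == "init" and name == "connection.alerting":
            log.append("accept call")
        elif state == "init" and name == "connection.connected":
            log.append("start dialog")
            state = "dialogActive"
        elif state == "dialogActive" and name == "dialog.exit":
            log.append(f"dialog returned [{data.get('input')}]")
            break  # the session exits; remaining events are discarded
        elif name == "connection.disconnected":
            break  # caller hung up: end the session
        elif name.startswith("error."):
            log.append(f"problem: ({data.get('reason')})")
            break
    return state, log


events = [
    ("connection.alerting", {}),
    ("connection.connected", {}),
    ("dialog.exit", {"input": "123"}),
    ("connection.disconnected", {}),  # never processed: the session has ended
]
print(run_session(events))
# ('dialogActive', ['accept call', 'start dialog', 'dialog returned [123]'])
```

The branch order mirrors document order: the state-specific transitions are tried first, and the stateless disconnect and error handlers act as fallbacks in any state.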

Dialog handling

CCXML does not provide any mechanism for interacting
with callers but relies on separate dialog environments such as VoiceXML.
Whenever interaction with a caller is required a CCXML session can initiate a
separate dialog provided by a VoiceXML capability or some other technology.
After the dialog interaction is complete, an asynchronous event is sent to the
CCXML session which can use any results returned by the dialog environment to decide
what should happen next. All CCXML elements that manipulate dialogs are
asynchronous with control returning immediately to the CCXML session after the
operation is initiated. The CCXML session is notified when the dialog operation
successfully completes, or fails, by an asynchronous event.

A CCXML program initiates a dialog using the <dialogstart> element (see
Example 2). Execution of this element connects a dialog environment to a
connection and instructs it to start interacting with the caller. Some dialog
environments take time to initialize, so using the <dialogstart> element alone
may cause the caller to hear silence, or “dead air”. To avoid this situation,
CCXML can ready a dialog environment before connecting and starting it, using
the <dialogprepare> element. Any dialog that has been either started with
<dialogstart> or prepared with <dialogprepare> can be terminated using the
<dialogterminate> element. CCXML implementations must support the
<dialogprepare>, <dialogstart>, and <dialogterminate> elements, though the
exact behavior may vary depending on the dialog environments supported.

If the dialog cannot be started for any reason, an error.dialog.notstarted
event is posted to the event queue of the CCXML session that processed the
<dialogstart> request. When the dialog completes, a dialog.exit event is posted
to the event queue of the CCXML session that started it. In our example we
process the dialog data in the transition element
<transition state="dialogActive" event="dialog.exit" name="evt">.

VoiceXML integration

CCXML and VoiceXML 2.0 need to be able to exchange
events between the browsers. The method of the message passing is up to the
platform but it is assumed that there is some basic capacity in place. Each
running CCXML session has an event queue used to process CCXML events,
independently of VoiceXML event processing by the dialogs created by that CCXML
session. The execution of certain CCXML elements, such as <dialogterminate> and <send> may cause events to be sent to the
VoiceXML browser; similarly, certain VoiceXML elements such as <transfer> will result in the generation of dialog events
delivered to the CCXML session that owns the dialog in question.


VoiceXML 2.0 provides limited capabilities for
handling asynchronous or unexpected events. Since CCXML is designed around a
robust event processing mechanism, and since the CCXML session manages
connections to the underlying network, processing of asynchronous events—which
may be delivered through externally accessible event I/O processors—typically
occurs primarily within the CCXML application, which can then control the
VoiceXML session as appropriate. The VoiceXML dialog can therefore focus
exclusively on interaction with the user.

When a VoiceXML dialog is bridged to a connection with
an associated call leg, the standard VoiceXML session variables obtain their
values from the call leg. Otherwise, these variables are undefined. VoiceXML
Session variables are updated whenever there is an update to the associated
connection or conference. When a CCXML application processes a <dialogstart>
element, it starts a VoiceXML application on the connection, using either the
URI passed in on the <dialogstart> element or the dialog that was prepared
using <dialogprepare> and specified using the prepareddialogid attribute.

Call control

The primary goal of CCXML is to provide call control
throughout the duration of a call. Call control includes handling incoming
calls, placing outgoing calls, bridging (or conferencing) multiple call legs,
and ultimately disconnecting calls. The goals of the CCXML call model are to
focus on relatively simple types of call control and to be sufficiently
abstract so that the call model can be implemented using all major telephony
definitions such as JAIN Call Control (JCC), CSTA, and S.100.

It seems that CCXML has a bright future in the telecom industry, partly because
there is strong demand for a unified application interface. Several similar
processing languages exist, among them CPL, CallXML, ECMA-CSTA, TXML and
others. You can find them in Appendix A of the CCXML specification.



Article information

Author: Sen. Ignacio Ratke

Last Updated: 01/09/2023
