Festival offers a BSD socket-based interface. This allows Festival to run as a server that client programs can access. The server offers a new command interpreter for each client that attaches to it. The server forks for each client, but this is much faster than waiting for a Festival process to start from scratch. The server can also run on a bigger machine, offering much faster synthesis.
Note: the Festival server is inherently insecure and may allow arbitrary users access to your machine.
Every effort has been made to minimise the risk of unauthorised access through Festival and a number of levels of security are provided. However, with any program offering socket access, like httpd, sendmail or ftpd, there is a risk that unauthorised access is possible. I trust Festival's security enough to often run it on my own machine and departmental servers, restricting access to within our department. Please read the information below before using the Festival server so you understand the risks.
The following access control is available for Festival when running as a server. When the server starts, it usually loads various commands specific to the task it is to be used for. The following variables are used to control access.
server_port
server_log_file
server_deny_list
server_access_list
server_passwd
(set_server_safe_functions FUNCNAMELIST)
--ttw
uses.
(set_server_safe_functions '(tts_return_to_client tts_text tts_textall Parameter.set))
nobody to limit the access the process will have; running it in a chroot environment is also more secure.
For example, suppose we wish to allow access to all machines in the CSTR domain except for holmes.cstr.ed.ac.uk and adam.cstr.ed.ac.uk. This may be done by adding the following two commands to a file, e.g. server.scm
(set! server_deny_list '("holmes\\.cstr\\.ed\\.ac\\.uk" "adam\\.cstr\\.ed\\.ac\\.uk"))
(set! server_access_list '("[^\\.]*\\.cstr\\.ed\\.ac\\.uk"))
and then running the command
festival PATH_TO/server.scm --server
This is not complete though, as when DNS is not working holmes and adam will still be able to access the server (but if our DNS isn't working we probably have more serious problems). However, the above is secure in that only machines in the domain cstr.ed.ac.uk can access the server, though there may be ways to make machines identify themselves as being in that domain even when they are not.
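To make the interaction of the two lists concrete, here is a minimal Python sketch of the check implied by the example above. It is an illustration, not Festival's implementation: the function name client_allowed is hypothetical, the deny list is assumed to take precedence over the access list, and each pattern is assumed to have to match the whole host name. Python's regular expression syntax may also differ in detail from the one Festival uses.

```python
import re

# Patterns from the server.scm example; Scheme's "\\." becomes r"\." here.
DENY = [r"holmes\.cstr\.ed\.ac\.uk", r"adam\.cstr\.ed\.ac\.uk"]
ACCESS = [r"[^\.]*\.cstr\.ed\.ac\.uk"]


def client_allowed(hostname, deny_list=DENY, access_list=ACCESS):
    """Return True if hostname passes both lists (hypothetical helper)."""
    # Assumption: a match in the deny list overrides everything else.
    if any(re.fullmatch(pattern, hostname) for pattern in deny_list):
        return False
    # Assumption: an empty access list means "no restriction".
    if access_list and not any(
        re.fullmatch(pattern, hostname) for pattern in access_list
    ):
        return False
    return True
```

Under these assumptions holmes.cstr.ed.ac.uk is refused, any other single-label host under cstr.ed.ac.uk is accepted, and hosts outside the domain are refused.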
By default Festival in server mode will only accept client connections from localhost.
An example client program called festival_client is included with the system that provides a wide range of access methods to the server. A number of options for the client are offered.
--server
--port
--output FILENAME
The --ttw option uses this, as does the Festival command utt.send.wave.client. If an output waveform file is received by festival_client and no output file has been given, the waveform is discarded with an error message.
--passwd PASSWD
If the --ttw option is used, a passwd is required and none is specified, access will be denied.
--prolog FILE
--ttw which otherwise does not offer any way to send commands as well as the text to the server.
--otype OUTPUTTYPE
The default is nist, but alaw, riff, ulaw and others supported by the Edinburgh Speech Tools Library are valid. You may use raw too, but note that Festival may return waveforms of various sampling rates depending on the sample rates of the databases it's using. You can of course make Festival return only one particular sample rate by using after_synth_hooks. Note that the byte order will be that of the client machine if the output format allows it.
--ttw
This option makes festival_client useful in many simple applications. Although you can connect to the server and send arbitrary Festival Scheme commands, this option automatically does what you probably want most often. When specified, this option takes text from the specified file (or stdin), synthesizes it (in one go) and saves it in the specified output file. It basically does the following
(Parameter.set 'Wavefiletype '<output type>)
(tts_textall " <file/stdin contents> ")
Note that this is best used for small, single-utterance texts, as you have to wait for the whole text to be synthesized before it is returned.
--aucommand COMMAND
FILE will be set when COMMAND is executed.
--async
Used with --ttw, this option causes the text to be synthesized utterance by utterance and sent back in separate waveforms. Using --aucommand, each waveform may be played locally, and when festival_client is interrupted the sound will stop. Getting the client to connect to an audio server elsewhere means the sound will not necessarily stop when the festival_client process is stopped.
--withlisp
send_client. If this option is specified, the Lisp expressions are printed to standard output; otherwise this information is discarded.
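As an illustration of the two commands that --ttw is described as sending, the following Python sketch builds the request text for a given input. The function name ttw_request is hypothetical. The single-argument form of tts_textall follows the description above, and escaping double quotes and backslashes inside the Scheme string is an assumption about what the server's reader expects.

```python
def ttw_request(text, output_type="nist"):
    """Build the commands that --ttw is described as sending (a sketch)."""
    # Assumption: the Scheme reader treats backslash as the escape
    # character inside strings, so escape backslashes first, then quotes.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"(Parameter.set 'Wavefiletype '{output_type})\n"
        f'(tts_textall "{escaped}")\n'
    )


print(ttw_request("Hello world.", output_type="riff"), end="")
```

A client written this way would send the returned string to the server and then wait for the waveform reply.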
A typical example use of festival_client is
festival_client --async --ttw --aucommand 'na_play $FILE' fred.txt
This will use na_play to play each waveform generated for the utterances in fred.txt. Note the single quotes so that the $ in $FILE isn't expanded locally.
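When festival_client is launched from a script rather than typed at a shell prompt, the quoting issue can be avoided by passing the arguments as a list, so that no local shell ever sees $FILE. The Python sketch below shows this under the assumption that festival_client and na_play are on the PATH; the helper name ttw_command is hypothetical.

```python
import shlex


def ttw_command(text_file, aucommand="na_play $FILE"):
    """Return the argument list for the example invocation (a sketch)."""
    # Each list element reaches festival_client verbatim, so "$FILE"
    # is left for festival_client to set when it runs the command.
    return ["festival_client", "--async", "--ttw",
            "--aucommand", aucommand, text_file]


# shlex.join shows the equivalent, correctly quoted shell command line.
print(shlex.join(ttw_command("fred.txt")))
# festival_client --async --ttw --aucommand 'na_play $FILE' fred.txt
```

The list can be handed to subprocess.run(ttw_command("fred.txt")) once a server is running.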
Note the server must be running before you can talk to it. At present Festival is not set up for automatic invocations through inetd and /etc/services. If you do that yourself, note that it is a different type of interface as inetd assumes all communication goes through standard in/out.
Also note that each connection to the server starts a new session. Variables are not persistent over multiple calls to the server so if any initialization is required (e.g. loading of voices) it must be done each time the client starts or more reasonably in the server when it is started.
A Perl Festival client is also available in festival/examples/festival_client.pl.
The client talks to the server using s-expressions (Lisp). The server will reply with a number of different chunks until either OK is returned or ER (on error). The communication is synchronous: each client request can generate a number of waveform (WV) replies and/or Lisp replies (LP) and will be terminated with an OK (or ER). Lisp is used as it has its own inherent syntax that Festival can already parse.
The following pseudo-code will help define the protocol as well as show typical use
fprintf(serverfd,"%s\n",s-expression);
do
    ack = read three character acknowledgement
    if (ack == "WV\n")
        read a waveform
    else if (ack == "LP\n")
        read an s-expression
    else if (ack == "ER\n")
        an error occurred, break;
while ack != "OK\n"
The server can send a waveform in an utterance to the client through the function utt.send.wave.client. The server can send a Lisp expression to the client through the function TO BE DONE.
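The reply loop in the pseudo-code above can be sketched in Python as follows. This is an illustration under stated assumptions, not the bundled client: the names read_until, read_replies and request are hypothetical, and the text above does not say how a waveform or s-expression payload is delimited, so the terminator is a parameter. Its default value is an assumption that should be checked against the festival_client sources for your version; any byte stuffing the server may apply when the terminator occurs inside a payload is not handled.

```python
import socket


def read_until(stream, terminator):
    """Read bytes from stream up to, but not including, terminator."""
    data = bytearray()
    while not data.endswith(terminator):
        byte = stream.read(1)
        if not byte:
            raise EOFError("connection closed inside a payload")
        data += byte
    return bytes(data[:-len(terminator)])


def read_replies(stream, terminator=b"ft_StUfF_key"):
    """Collect (tag, payload) replies until the server sends OK or ER."""
    replies = []
    while True:
        ack = stream.read(3)  # "WV\n", "LP\n", "OK\n" or "ER\n"
        if ack in (b"WV\n", b"LP\n"):
            payload = read_until(stream, terminator)
            replies.append((ack[:2].decode(), payload))
        elif ack == b"OK\n":
            return True, replies
        elif ack == b"ER\n":
            return False, replies
        else:
            raise ValueError(f"unexpected acknowledgement: {ack!r}")


def request(host, port, expression):
    """Send one s-expression to a running server; return (ok, replies)."""
    with socket.create_connection((host, port)) as sock:
        with sock.makefile("rwb") as stream:
            stream.write(expression.encode() + b"\n")
            stream.flush()
            return read_replies(stream)
```

Because read_replies only needs an object with a read method, it can be exercised against canned bytes without a server; request opens a fresh connection per call, which matches the note above that each connection starts a new session.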