Installation
************

Release notes
=============

   This documentation covers version 1.0.0 of the Edinburgh Speech
Tools.  At present, the main reason for making the speech tools
available is for use in other pieces of software such as Festival.  The
speech tools have, however, greatly improved since their previous
version and contain code that is of use in its own right.

   Although you are free to use this code in any manner compatible with
the licence, please note that the speech tools are likely to undergo
comprehensive revision in the future and importantly *we do not make
any guarantees about backwards compatibility*.  Thus if you write a
program using the track class, for instance, it may not be compatible
with a future release.  However we will, wherever possible, continue to
provide similar functionality, so that translating programs to new
versions of the speech tools should always be possible.

   In addition, we warn that while several programs and routines are
quite mature, others, in particular the signal processing and
statistics routines, are young and have not been rigorously tested.
Please do not assume these programs work.

Requirements
============

   In order to compile and install the Edinburgh Speech Tools you need
the following:

   * GNU make: Any recent version.  The various `make' programs that
     come with different UNIX systems vary wildly, making it too
     difficult to write portable `Makefile's, so we depend on a version
     of `make' which is available for all of the platforms we are
     aiming at.

   * A C++ compiler: The system was developed primarily with GNU C++
     version 2.7.2, but we have also compiled it successfully under Sun
     CC.  Note that although previous versions of the speech tools
     required `libg++' this is no longer the case.

     Hopefully we have now sanitized the code sufficiently to make
     ports to other C++ compilers possible without too much difficulty.
     But please note that C++ is not a fully standardized language and
     each compiler follows the incomplete standard to varying degrees,
     so there are often many, though simple, problems when porting to
     new C++ compilers.  We are trying to deal with this by increasing
     our compiler support.  However, it is likely that small changes
     will be required for C++ compilers we have not yet tested the
     system under.

     However we feel this is stable enough to make it worthwhile
     attempting ports to other C++ compilers that we haven't tried yet.
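   Before going further it is worth checking that suitable versions of
these tools can actually be found on your path.  The following shell
sketch is illustrative only; the names tried are simply the names GNU
make is commonly installed under on different systems:

```shell
# Look for GNU make under the names it is commonly installed as.
found=no
for m in gnumake gmake make; do
    if command -v "$m" >/dev/null 2>&1; then
        echo "make found as: $m"
        found=yes
        break
    fi
done
[ "$found" = yes ] || echo "no make found on PATH"
```

   Note that finding a program called `make' does not guarantee it is
GNU make; run it with `--version' to be sure.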

   We have successfully compiled and tested the speech tools on the
following systems:

   * Sun Sparc Solaris 2.5.1: GCC 2.7.2, GCC 2.6.3, SunCC 3.01, SunCC
     4.1

   * Sun Sparc SunOS 4.1.3: GCC 2.7.2

   * Intel Solaris 2.5.1:

   * FreeBSD for Intel 2.1.7 and 2.2.1: GCC 2.7.2, GCC 2.6.3

   * Linux (2.0.30) for Intel (RedHat 4.1): GCC 2.7.2

   * DEC Alpha OSF1 3.2: GCC 2.7.2

   * Windows NT 4.0: GCC 2.7.2-970404 (from Cygnus GNU win32 b18)

   As stated before, C++ compilers are not standard and it is
non-trivial to find the correct dialect which compiles under all of
them.  We recommend the use of GCC 2.7.2 if you can use it; it is the
most likely one to work.

   Previous versions of the system have successfully compiled under SGI
IRIX 5.3 and HPUX, but at the time of writing we have not yet rechecked
these platforms for this version.

   For our Windows NT and Windows 95 ports we use the Cygnus GNU win32
environment (b18) available from `ftp://ftp.cygnus.com/pub/gnu-win32/'.

   Before installing the speech tools it is worth ensuring you have a
fully installed and working version of your C++ compiler.  Most of the
problems people have had in installing the speech tools have been due to
incomplete or bad compiler installation.  It might be worth checking
that the following program works, if you don't know whether anyone has
used your C++ installation before.
     #include <iostream.h>
     int main (int argc, char **argv)
     {
        cout << "Hello world\n";
        return 0;
     }
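   (On modern compilers the header is `<iostream>' and `cout' lives in
namespace `std'.  The following self-contained sketch performs the same
check with a current GNU C++; with GCC 2.7.2-era compilers use the
`<iostream.h>' form shown above instead.)

```shell
# Smoke-test the C++ installation: compile and run a trivial program.
cat > hello.cc <<'EOF'
#include <iostream>
int main()
{
    std::cout << "Hello world\n";
    return 0;
}
EOF
g++ -o hello hello.cc && ./hello
```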

Building it
===========

Configuration
-------------

   All compile-time configuration for the system is done through the
user definable file `config/config_make_rules'.  You must create this
file before you can compile the library.  An example is given in
`config/config_make_rules-dist', copy it and change its permissions to
give write access
     cd config
     cp config_make_rules-dist config_make_rules
     chmod +w config_make_rules
   You must now edit `config_make_rules' to make it reflect your local
environment.

   This involves selecting your `C++' compiler and identifying various
library directories.  Please read the comments in the file for specific
instructions.

   Simple choices for common set ups are given near the top of this
file. But for some sub-systems you will also need to change pathnames
for external library support.

   At present one external library may be used.  If you wish, NCD's
network audio system (formerly called netaudio) is supported.

   NCD's NAS library offers network transparent access to audio
hardware on a number of different platforms and is used extensively in
CSTR.  NAS is available from `ftp://ftp.x.org/contrib/audio/nas/' as
well as being distributed in the contrib directory of X11R6.

   If you wish to use it, uncomment the variable `INCLUDE_NAS' and
also check the value of `NADIR' further down the file.
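   As a rough sketch, the relevant lines in `config_make_rules' might
look something like this once edited (the exact variable syntax and the
NAS path here are examples only; follow the comments in your own copy
of the file):

```makefile
# Uncomment to enable NCD's network audio (NAS) support.
INCLUDE_NAS = 1

# Further down the file: where NAS is installed (example path).
NADIR = /usr/local/nas
```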

   Other options allow you to specify support of other audio devices.
Note these are operating system and hardware dependent and you may only
select them if you are compiling on that OS type and have the hardware.

   The previously released version of the speech tools supported an
alternative method for reading and writing Entropic's ESPS headered
files.  After testing, our own versions of the functions to access
these files seem adequate for all the file types we need, so we have
removed the proprietary access methods (which required the ESPS
libraries and a licence).

Compilation
-----------

   Once you have configured `config/config_make_rules' you can compile
the system.  First build the include dependencies
     gnumake depend
   This outputs what appear to be errors about missing files; this is
not a problem.  Now build the library
     gnumake
   Note this must be *GNU make*, which may be called `make' on your
system, or `gmake' or `gnumake'.  This will compile all library
functions and all the executables.  If you wish to only compile the
library itself then use
     gnumake lib_build

   Note that if you compile with `-g' the library and the corresponding
binaries will be large.  Particularly because of the executables, you
will need on the order of 80 megabytes to compile the system if your
C++ libraries are not compiled as shared libraries.  If you compile
without `-g' the whole library directory is about 12 megabytes on Linux
(which has a shared library for `libstdc++') or about 26 megabytes on
Sparc Solaris (which does not have a shared `libstdc++' by default).
This is almost entirely due to the size of the executables.  C++ does
not make small binaries.

   It should be possible to build a shared library for both the system
C++ libraries and the speech tools library itself.  This would
substantially reduce the footprint of the executables.  Shared
libraries are different on every machine and we have not spent time
investigating support of the different architectures that we support.
We'll have to address this in later versions.

Installing the system
---------------------

   All executables are linked to from `speech_tools/bin' and you should
add that directory to your PATH in order to use them.
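   For example, with a Bourne-compatible shell (the installation prefix
here is only an example; use wherever you unpacked the distribution):

```shell
# Put the speech tools executables on the PATH (example prefix).
SPEECH_TOOLS=/usr/local/speech_tools
PATH="$SPEECH_TOOLS/bin:$PATH"
export PATH
```

   You may wish to add these lines to your shell startup file so the
tools are found in every session.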

Compatibility
=============

   Note that the previously widely released version of the speech tools
(0.96.1) required GNU's `libg++'.  As we have written our own largely
compatible string class, called `EST_String', we no longer require
GNU's `libg++', but we do still require GNU's `libstdc++', which
provides i/o streams.  (In older versions of GCC you still need to link
with `libg++' as there is no split between the GNU G++ library and the
standard C++ library.)  Our string class uses reference counting rather
than copying, which makes it faster for the sort of task we use it for.

   Specifically, all major classes are now prefixed with `EST_' to
reduce the possibility of clashing with symbols in other libraries.
Thus to update code, typically the main change is to add this prefix
onto the symbols `String, Regex, Wave, Utterance, Stream, KVL, TList,
Ngrammar, Option, Stream_Item, TMatrix, TVector, Token, and
TokenStream'.
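   Renaming of this kind can usually be scripted.  The following is an
illustrative sketch only, not the script we used; it relies on GNU
sed's `\b' word boundary, and the file names are placeholders:

```shell
# Prefix the old class names with EST_ in a copy of a source file.
# \b word boundaries stop e.g. `Token' also matching inside `TokenStream'.
cp old_file.cc new_file.cc
for cls in String Regex Wave Utterance Stream KVL TList Ngrammar \
           Option Stream_Item TMatrix TVector Token TokenStream; do
    sed "s/\b$cls\b/EST_$cls/g" new_file.cc > tmp.cc && mv tmp.cc new_file.cc
done
```

   Any such script should of course be checked by hand afterwards, as
other code may legitimately use the same identifiers.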

   `TFR' has been removed and its functionality integrated into
`EST_Track'.

   Although we apologise for making such changes, they were necessary
for our library to be useful in the long run.  Changes can typically be
made automatically (we did so for the speech tools and Festival).  We
hope not to have to change names so drastically again.

   There are also a number of other member function name changes to make
the system more logical.  We know we have not yet been thorough enough
to make the member functions fully logical but they are now much better.
Thus on converting code there may be other minor changes required.  If
these are not immediately obvious to you please contact us for advice.

Future releases
===============

   In the future we hope to provide the following:

   * Better Classes: New classes (e.g. a Frame class) will be added and
     the interface to the existing ones will be tidied up.

   * Signal Processing: Only minimal signal processing is provided at
     present. We hope to at least provide a set of all the basic speech
     signal processing functions.

   * Speech recognition: We have some classes already, and an HMM class
     and various decoders are in development.

   * Statistics: Likewise, only rough implementations of some routines
     are available at present.

   * User interface: Some functions in the library are intended for
     public use, others are for internal use only. This distinction
     isn't particularly clear. In the future this distinction will be
     made explicit, giving a clear indication which functions should be
     called and which should be avoided.

   * Documentation: Documentation can always be better and the speech
     tools are no exception.

