eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
eSpeak NG is an open-source speech synthesizer designed to provide text-to-speech capabilities for over 100 languages and accents. It supports a wide range of platforms, including Linux, Windows, Android, and other operating systems.
Key Features:
Multi-Language Support: Offers voice synthesis in more than 100 languages, making it accessible to global audiences.
Formant Synthesis: Uses formant synthesis, with optional Klatt formant synthesis, to generate clear speech efficiently, even at high speaking rates.
MBROLA Backend Compatibility: Enables the use of diphone voices from MBROLA for enhanced naturalness in certain applications.
SSML and HTML Support: Parses Speech Synthesis Markup Language (SSML) and HTML to provide detailed control over voice characteristics.
Compact Size: Delivers high-quality speech synthesis with minimal resource usage, making it ideal for constrained environments.
Voice Customization: Allows users to adjust voice characteristics such as pitch, speed, and intonation to suit specific needs.
SAPI5 Integration: Provides a Windows SAPI5 interface, enabling integration with screen readers and other accessibility tools.
Cross-Platform Compatibility: Available as a command-line tool, shared library, or DLL, supporting diverse deployment scenarios.
Audience & Benefit:
Ideal for developers building multilingual applications, educators creating accessible learning materials, and organizations requiring voice synthesis in multiple languages. eSpeak NG is particularly beneficial for those seeking an efficient, lightweight text-to-speech solution without the resource demands of larger synthesizers based on human speech recordings. It can be installed via winget.
eSpeak NG is a compact open source text-to-speech synthesizer for
Linux, Windows, Android and other operating systems. It supports
more than 100 languages and accents. It is based on the eSpeak engine
created by Jonathan Duddington.
eSpeak NG uses a "formant synthesis" method. This allows many languages to be
provided in a small size. The speech is clear, and can be used at high speeds,
but is not as natural or smooth as larger synthesizers which are based on human
speech recordings. It also supports Klatt formant synthesis, and can use
MBROLA as a backend speech synthesizer.
eSpeak NG is available as:
A command line program (Linux and Windows) to speak text from a file or
from stdin.
A shared library version for use by other programs. (On Windows this is
a DLL).
A SAPI5 version for Windows, so it can be used with screen-readers and
other programs that support the Windows SAPI5 interface.
eSpeak NG has been ported to other platforms, including Solaris and Mac
OS X.
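As a rough illustration of the command-line program, the Python sketch below builds an espeak-ng invocation that reads text either from a file (`-f`) or from stdin (`--stdin`). The helper names `build_speak_command` and `speak` are hypothetical and not part of eSpeak NG. The sketch assumes an `espeak-ng` binary on the PATH and does nothing audible when it is missing.

```python
import shutil
import subprocess


def build_speak_command(text_file=None, voice="en", binary="espeak-ng"):
    """Return the argument list for speaking a file, or stdin when no file is given."""
    cmd = [binary, "-v", voice]       # -v: voice name
    if text_file is not None:
        cmd += ["-f", text_file]      # -f: read the text from a file
    else:
        cmd.append("--stdin")         # --stdin: read the text from standard input
    return cmd


def speak(text=None, text_file=None, voice="en"):
    """Run espeak-ng if it is installed; always return the command that was built."""
    cmd = build_speak_command(text_file=text_file, voice=voice)
    if shutil.which(cmd[0]) is None:
        return cmd                    # binary not installed: nothing is spoken
    stdin_data = None if text_file else (text or "").encode("utf-8")
    subprocess.run(cmd, input=stdin_data, check=True)
    return cmd
```

Calling `speak(text="Hello world")` would pipe the text to the program over stdin, while `speak(text_file="notes.txt")` would have it read the file directly.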
Features
Includes different Voices, whose characteristics can be altered.
Can produce speech output as a WAV file.
SSML (Speech Synthesis Markup Language) is supported (not complete),
and also HTML.
Compact size. The program and its data, including many languages,
total a few megabytes.
Can be used as a front-end to MBROLA diphone voices.
eSpeak NG converts text to phonemes with pitch and length information.
Can translate text into phoneme codes, so it could be adapted as a
front end for another speech synthesis engine.
eSpeak NG’s flexibility and broad language support make it a valuable tool for advancing accessibility and innovation in speech technology across various domains.
Potential for other languages. Several are included in varying stages
of progress. Help from native speakers for these or other languages is
welcome.
Written in C.
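Several of the features above map onto command-line options. The following Python sketch, offered only as an illustration, assembles an espeak-ng command for adjusting voice characteristics (`-s` speed, `-p` pitch), writing a WAV file (`-w`), interpreting SSML markup (`-m`), or printing phoneme mnemonics instead of speaking (`-q -x`). The function name `build_feature_command` and its parameters are hypothetical. The option letters are taken from the espeak-ng command-line help and should be checked against the installed version.

```python
def build_feature_command(text, voice="en", speed=None, pitch=None,
                          wav_path=None, phonemes=False, ssml=False):
    """Build an espeak-ng argument list for the requested features."""
    cmd = ["espeak-ng", "-v", voice]
    if speed is not None:
        cmd += ["-s", str(speed)]   # speaking rate in words per minute
    if pitch is not None:
        cmd += ["-p", str(pitch)]   # pitch adjustment, 0 to 99
    if ssml:
        cmd.append("-m")            # interpret SSML markup in the input text
    if phonemes:
        cmd += ["-q", "-x"]         # -q: produce no audio; -x: print phoneme mnemonics
    elif wav_path is not None:
        cmd += ["-w", wav_path]     # write the audio to a WAV file instead of playing it
    cmd.append(text)                # the text to process is the final argument
    return cmd
```

For example, `build_feature_command("Hello", speed=140, wav_path="hello.wav")` yields a command that writes slower-than-default speech to `hello.wav`, and the resulting list can be passed to `subprocess.run`.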
See the ChangeLog for a description of the changes in the
various releases of the eSpeak NG project.
The following platforms are supported:
Platform    Minimum Version
Linux       -
BSD         -
Android     4.0
Windows     Windows 8
Mac         -
Documentation
User guide explains how to set up and use eSpeak NG from the command line or as a library.
Building guide explains how to compile and build eSpeak NG from source.
Index provides a full list of more detailed information for contributors and developers.
See the eSpeak NG roadmap to participate in the development of eSpeak NG.
eSpeak Compatibility
The espeak-ng binaries use the same command-line options as espeak, with
several additions to provide new functionality from espeak-ng such as specifying
the output audio device name to use. The build creates symlinks of espeak to
espeak-ng, and speak to speak-ng.
The espeak speak_lib.h include file is located in espeak-ng/speak_lib.h with
an optional symlink in espeak/speak_lib.h. This file contains the espeak 1.48.15
API, with a change to the ESPEAK_API macro to fix building on Windows
and some minor changes to the documentation comments. This C API is API and ABI
compatible with espeak.
The espeak-data data has been moved to espeak-ng-data to avoid conflicts with
espeak. There have been various changes to the voice, dictionary and phoneme files
that make them incompatible with espeak.
The espeak-ng project does not include the espeakedit program. It has moved
the logic to build the dictionary, phoneme and intonation binary files into the
libespeak-ng.so file that is accessible from the espeak-ng command line and
C API.
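Because the two programs share their command-line options, a calling script can fall back from espeak-ng to espeak without changing its arguments. The Python sketch below illustrates that idea under stated assumptions: `find_speech_binary` and `build_compatible_command` are hypothetical helper names, and the `-d` option for the audio device is assumed to be one of the espeak-ng additions mentioned above, so it should only be passed when the binary is espeak-ng.

```python
import shutil


def find_speech_binary(candidates=("espeak-ng", "espeak"), which=shutil.which):
    """Return the path of the first available binary, preferring espeak-ng."""
    for name in candidates:
        path = which(name)
        if path is not None:
            return path
    return None                      # neither program is installed


def build_compatible_command(binary, text, voice="en", speed=150, device=None):
    """Build a command from options shared by espeak and espeak-ng."""
    cmd = [binary, "-v", voice, "-s", str(speed)]
    if device is not None:
        # Assumption: -d (audio device name) is an espeak-ng addition
        # that the original espeak does not accept.
        cmd += ["-d", device]
    cmd.append(text)
    return cmd
```

The `which` parameter exists only so the lookup can be exercised without either program installed; in normal use the default `shutil.which` is sufficient.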
History
The program was originally known as speak and was written
for Acorn/RISC_OS computers starting in 1995 by Jonathan Duddington. This was
enhanced and re-written in 2007 as eSpeak, including a relaxation of the
original memory and processing power constraints, and with support for additional
languages.
In 2010, Reece H. Dunn started maintaining a version of eSpeak on GitHub that
was designed to make it easier to build eSpeak on POSIX systems, porting the
build system to autotools in 2012. In late 2015, this project was officially
forked to a new eSpeak NG project. The new eSpeak NG project is a significant
departure from the eSpeak project, with the intention of cleaning up the
existing codebase, adding new features, and adding to and improving the
supported languages.
The historical branch contains the available older releases of the original
eSpeak that are not contained in the subversion repository.
These early releases have been checked into the historical branch,
with the 1.24.02 release as the last entry. This makes it possible
to use the replace functionality of git to see the earlier history:
git replace 8d59235f 63c1c019
NOTE: The source releases contain the big_endian, espeak-edit,
praat-mod, riskos, windows_dll and windows_sapi folders. These
do not appear in the source repository until later releases, so have
been excluded from the historical commits to align them better with
the 1.24.02 source commit.
License Information
eSpeak NG Text-to-Speech is released under the GPL version 3 or
later license.
The getopt.c compatibility implementation for getopt support on Windows is
taken from the NetBSD getopt_long implementation, which is licensed under a
2-clause BSD license.