Need text to speech and speech recognition tools for Linux

Question

I'm planning on writing a program for Linux that uses text to speech and speech recognition. What are the best tools/libraries for this? Should I use Windows instead to be able to use better tools? The tools need to be easily callable from a console or C program.

score 6 · Accepted Answer · answered May 18 '09 at 13:53

For speech recognition there are the various Sphinxes. The variants have different pros and cons; the project has a comparison page, "Comparison of Sphinx versions". Sphinx 4 is Java, but the others are C, I believe.

Matt G

joeforker · Answer 2 · 2010-06-09T13:06:50.107

It depends quite a bit on what speech you are trying to recognize.

This article from 2005 explains some of the difficulties in creating a dictation program: http://www.cs.cmu.edu/~archan/personal/whyNoOpenSourceDictationDraft4.html . If dictation is what you want, the Julius speech recognition engine seems promising, but you will need to supply your own acoustic and language models. You might be able to use the VoxForge acoustic model.

If you are not trying to write a dictation program then you have a much easier task. Command programs have limited vocabularies, for example 'If you would like to continue in English, say "English"'.

I was able to get pretty good results using pocketsphinx and gstreamer to make a program that automatically edits most occurrences of the word "twitter" out of the TWiT podcast. It didn't work at all until I used my own language model based on transcripts of the podcast; the machine transcriptions from the speech recognizer are useless/hilarious but they do an okay job of finding the keyword.
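To illustrate the keyword-cutting idea, here is a minimal sketch of the editing step only. It is not the original program: it assumes the recognizer has already produced a list of (word, start, end) timings in seconds, and the function name `spans_to_keep`, the `padding` parameter and the sample data are all made up for this example.

```python
from typing import List, Tuple

Word = Tuple[str, float, float]   # (word, start_seconds, end_seconds), assumed recognizer output
Span = Tuple[float, float]        # (start_seconds, end_seconds)


def spans_to_keep(words: List[Word], keyword: str, duration: float,
                  padding: float = 0.05) -> List[Span]:
    """Return the audio spans that remain after cutting every hit of `keyword`."""
    # Collect one cut interval per keyword hit, widened by `padding` and
    # clamped to the bounds of the recording.
    cuts = sorted(
        (max(0.0, start - padding), min(duration, end + padding))
        for word, start, end in words
        if word.lower() == keyword.lower()
    )
    keep: List[Span] = []
    cursor = 0.0
    for cut_start, cut_end in cuts:
        if cut_start > cursor:
            keep.append((cursor, cut_start))
        # max() handles overlapping cuts caused by the padding.
        cursor = max(cursor, cut_end)
    if cursor < duration:
        keep.append((cursor, duration))
    return keep


if __name__ == "__main__":
    # Hypothetical timings; a real hypothesis would come from the recognizer.
    hypothesis = [("follow", 0.0, 0.4), ("us", 0.4, 0.6), ("on", 0.6, 0.8),
                  ("twitter", 0.8, 1.3), ("today", 1.3, 1.8)]
    print(spans_to_keep(hypothesis, "twitter", duration=2.0, padding=0.0))
```

The kept spans can then be handed to whatever audio tool does the actual splicing. Only the keyword timings need to be right, which is why a poor overall transcription can still be good enough for this job.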

do you have any experience with using pocketsphinx and gstreamer with the tcpserversrc/client? — si28719e, Aug 26 '09 at 14:53
no, but gstpocketsphinx + tcpserversrc/sink shouldn't be any different than any other gstreamer element + tcpserversrc/sink. — joeforker, Aug 26 '09 at 15:06
the link to your "twitterkiller" program appears to be broken. — Steven Oxley, Jun 08 '10 at 22:45

score 4 · Answer 3 · answered May 18 '09 at 13:46

For speech recognition very little exists for Linux. I was only aware of one apparently decent option, the ViaVoice SDK that IBM released some years ago but later stopped making available (does anyone know whether it is still possible to get hold of it anywhere?). There is some more information about possible options on Wikipedia.

hlovdal

ViaVoice SDK. It was never in full release, and the docs demanded a fairly narrow range of 2.4 kernel release numbers. I toyed around with it to take some typing load off when I had intermittent tendonitis in my wrists, but no luck... – dmckee --- ex-moderator kitten May 18 '09 at 14:53

score 3 · mysomic · Answer 4 · 2009-05-18T18:52:39.303

I have used both Loquendo and Festival under Linux. I would consider the Festival voices I have used pretty poor, with very robotic synthesis. The Loquendo voices, on the other hand, are excellent: very high quality.

edited May 18 '09 at 18:52

answered May 18 '09 at 13:15

If you are going to use Festival, you should install the alternate voices. Instructions (for debian/ubuntu) are here: http://ubuntuforums.org/showthread.php?t=677277 – Matt G May 18 '09 at 13:57
How was your experience with Loquendo? If you're up for it, I'd love to ask you a couple questions about it by email? – philfreo Jul 17 '10 at 07:54

score 0 · Answer 5 · answered Aug 13 '14 at 13:24

For Debian/Ubuntu text-to-speech there is also SVOX Pico:

sudo apt-get install libttspico-utils
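That package provides the `pico2wave` command, which writes synthesized speech to a WAV file. As a rough sketch of calling it from a program (the helper names `pico_command` and `speak_to_file` are made up here, and the `en-US` default is an assumption about which language you want):

```python
import shutil
import subprocess
from typing import List


def pico_command(text: str, wav_path: str, lang: str = "en-US") -> List[str]:
    """Build the pico2wave argument list; passing a list avoids shell quoting."""
    # -l selects the language, -w names the output file (it should end in .wav).
    return ["pico2wave", "-l", lang, "-w", wav_path, text]


def speak_to_file(text: str, wav_path: str, lang: str = "en-US") -> bool:
    """Synthesize `text` into `wav_path`; return False if pico2wave is not installed."""
    if shutil.which("pico2wave") is None:
        return False
    subprocess.run(pico_command(text, wav_path, lang), check=True)
    return True


if __name__ == "__main__":
    if not speak_to_file("Hello from Pico.", "hello.wav"):
        print("pico2wave not found; install libttspico-utils first")
```

The resulting file can be played with any WAV player, for example `aplay hello.wav`.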

Ikem Krueger

score -1 · Answer 6 · answered Aug 26 '09 at 14:55

The AT&T FSM toolkit is also pretty awesome, though no commercial use is allowed:

http://www.research.att.com/~fsmtools/fsm/

si28719e

score -1 · Answer 7 · answered Mar 31 '14 at 12:22

Have you checked out HMM-based speech synthesis for text-to-speech? You can find a free demo on the website http://hts.sp.nitech.ac.jp/. Installation will be a little tedious.

se7en

score -1 · Answer 8 · answered Jul 11 '10 at 22:24

This is a bit old, but I saw a fairly comprehensive guide to speech recognition on Hackaday a few days ago: http://hackaday.com/2010/07/09/get-started-with-speech-recognition/

Cory Walker

score -1 · Answer 9 · answered Jan 04 '11 at 13:34

http://simon-listens.org/ - simon, an open-source speech recognition program

Grzegorz Wierzowiecki

score -1 · Answer 10 · answered Jan 04 '11 at 13:40

And then there is mbrola for text to speech.

user562374

score -1 · Answer 11 · answered May 18 '09 at 12:35

I know espeak is a very good text-to-speech program for Linux (it can even do different accents!), but I don't know of any speech recognition systems designed for UNIX.
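For the accents, espeak takes a voice name via `-v`. A small sketch of building the command line from a program follows; the helper name `espeak_command` is made up, and the voice names `en-us` and `en-sc` are assumptions that vary between espeak versions (`espeak --voices=en` lists what is actually installed).

```python
import shutil
import subprocess
from typing import List, Optional


def espeak_command(text: str, voice: str = "en",
                   wav_path: Optional[str] = None,
                   words_per_minute: Optional[int] = None) -> List[str]:
    """Build an espeak argument list for the given voice (accent)."""
    cmd = ["espeak", "-v", voice]
    if words_per_minute is not None:
        cmd += ["-s", str(words_per_minute)]   # -s sets the speaking rate
    if wav_path is not None:
        cmd += ["-w", wav_path]                # -w writes a WAV file instead of playing
    cmd.append(text)
    return cmd


if __name__ == "__main__":
    if shutil.which("espeak") is None:
        print("espeak not found")
    else:
        # Voice names below are assumed; check `espeak --voices=en` on your system.
        for voice in ("en-us", "en-sc"):
            subprocess.run(espeak_command("Good morning.", voice=voice), check=True)
```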

Rob Golding

score -2 · Answer 12 · answered Mar 24 '11 at 00:30

I know the original question was about finding suitable libraries, but as far as speech recognition good enough for real dictation goes, there seems to be nothing out there for Linux (I am sure that will change in time, but I suspect it will take a while, as I am not sure that many people are interested).

At the moment I am trying to promote Dragon NaturallySpeaking as a supported product by CodeWeavers ... so if you are interested as a user it would help if you would cast a vote ...

http://www.codeweavers.com/compatibility/browse/name/?app_id=8427
