I ran the sample code from the README.md of tryolabs/TLSphinx. The text property of the resulting Hypothesis contains only whitespace, and the score property is a negative number, -4420.
Why am I not getting good results in the text property of the Hypothesis?
Here is my code:
    let hmm = localDocumentsURL.path // Path to the acoustic model
    let lm = localDocumentsURL.appendingPathComponent("6844").appendingPathExtension("lm").path // Path to the language model
    let dict = localDocumentsURL.appendingPathComponent("cmudict-en-us").appendingPathExtension("dict").path // Path to the pronunciation dictionary
    if let config = Config(args: ("-hmm", hmm), ("-lm", lm), ("-dict", dict)) {
        if let decoder: TLSphinx.Decoder = TLSphinx.Decoder(config: config) {
            let audioFile = Bundle.main.path(forResource: "audio16000", ofType: "wav")! // Path to an audio file
            do {
                try decoder.decodeSpeech(atPath: audioFile) {
                    if let hyp: Hypothesis = $0 {
                        // Print the decoded text and score
                        print("Text: \(hyp.text) - Score: \(hyp.score)")
                    } else {
                        // Can't decode any speech because of an error
                    }
                }
            } catch {
                print(error)
            }
        } else {
            // Handle Decoder() fail
            print("Decoder fail")
        }
    } else {
        // Handle Config() fail
        print("Config fail")
    }
The debug console output is longer than Stack Overflow's character limit allows, so I have not included it here.
Before this, I tried an mp3 file and got an empty string. With the wav file I get essentially the same result, except that the text is now whitespace rather than an empty string. I used Audacity to convert the mp3 file to wav with a 16000 Hz sample rate, signed 16-bit PCM encoding, and a single (mono) audio channel. As far as I can tell, those are the required specifications.
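To rule out a bad conversion, the format could be double-checked by reading the WAV header directly. The following is only a minimal sketch using Foundation: `WavFormat`, `readLE`, `parseWavFormat`, and `printBundledWavFormat` are hypothetical helpers written for this check and are not part of TLSphinx. It assumes a standard RIFF/WAVE file whose "fmt " chunk uses the usual PCM layout.

```swift
import Foundation

// Hypothetical helper (not part of TLSphinx): the fields of a WAV "fmt " chunk
// that matter for the 16 kHz / 16-bit / mono check.
struct WavFormat: Equatable {
    let audioFormat: UInt16   // 1 means uncompressed PCM
    let channels: UInt16
    let sampleRate: UInt32
    let bitsPerSample: UInt16
}

// Reads a little-endian unsigned integer of type T from `bytes` at `offset`.
func readLE<T: FixedWidthInteger & UnsignedInteger>(_ bytes: [UInt8], _ offset: Int, as type: T.Type) -> T? {
    let size = MemoryLayout<T>.size
    guard offset >= 0, offset + size <= bytes.count else { return nil }
    var value: T = 0
    for i in 0..<size {
        value |= T(bytes[offset + i]) << (8 * i)
    }
    return value
}

// Walks the RIFF chunk list and returns the contents of the "fmt " chunk,
// or nil if the data is not a RIFF/WAVE file or has no readable "fmt " chunk.
func parseWavFormat(_ data: Data) -> WavFormat? {
    let bytes = [UInt8](data)
    guard bytes.count >= 12,
          Array(bytes[0..<4]) == Array("RIFF".utf8),
          Array(bytes[8..<12]) == Array("WAVE".utf8) else { return nil }

    var offset = 12
    while offset + 8 <= bytes.count {
        let id = Array(bytes[offset..<offset + 4])
        guard let size = readLE(bytes, offset + 4, as: UInt32.self) else { return nil }
        if id == Array("fmt ".utf8) {
            let body = offset + 8
            // Layout: format(2) channels(2) sampleRate(4) byteRate(4) blockAlign(2) bits(2)
            guard let format = readLE(bytes, body, as: UInt16.self),
                  let channels = readLE(bytes, body + 2, as: UInt16.self),
                  let rate = readLE(bytes, body + 4, as: UInt32.self),
                  let bits = readLE(bytes, body + 14, as: UInt16.self) else { return nil }
            return WavFormat(audioFormat: format, channels: channels,
                             sampleRate: rate, bitsPerSample: bits)
        }
        // Skip this chunk; chunks are padded to an even number of bytes.
        offset += 8 + Int(size) + Int(size % 2)
    }
    return nil
}

// Example use with the bundled file from the code above.
func printBundledWavFormat() {
    guard let path = Bundle.main.path(forResource: "audio16000", ofType: "wav"),
          let data = FileManager.default.contents(atPath: path),
          let format = parseWavFormat(data) else {
        print("Could not read a WAV header")
        return
    }
    print(format)
}
```

If the conversion worked as intended, I would expect `audioFormat` 1, `channels` 1, `sampleRate` 16000, and `bitsPerSample` 16. This only inspects the header and says nothing about how the decoder itself reads the file.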