Overview
I send Strings to a Text-to-Speech server that accepts a maximum length of 300 characters. Due to network latency, there may be a delay between each section of speech being returned, so I'd like to break the speech up at the most 'natural pauses' wherever possible.
Each server request costs me money, so ideally I'd send the longest string possible, up to the maximum allowed characters.
Here is my current implementation:
private static final boolean DEBUG = true;
private static final int MAX_UTTERANCE_LENGTH = 298;
private static final int MIN_UTTERANCE_LENGTH = 200;
private static final String FULL_STOP_SPACE = ". ";
private static final String QUESTION_MARK_SPACE = "? ";
private static final String EXCLAMATION_MARK_SPACE = "! ";
private static final String LINE_SEPARATOR = System.getProperty("line.separator");
private static final String COMMA_SPACE = ", ";
private static final String JUST_A_SPACE = " ";

public static ArrayList<String> splitUtteranceNaturalBreaks(String utterance) {
    final long then = System.nanoTime();
    final ArrayList<String> speakableUtterances = new ArrayList<String>();
    int splitLocation = 0;
    String success = null;

    while (utterance.length() > MAX_UTTERANCE_LENGTH) {
        splitLocation = utterance.lastIndexOf(FULL_STOP_SPACE, MAX_UTTERANCE_LENGTH);
        if (DEBUG) {
            System.out.println("(0 FULL STOP) - last index at: " + splitLocation);
        }
        if (splitLocation < MIN_UTTERANCE_LENGTH) {
            if (DEBUG) {
                System.out.println("(1 FULL STOP) - NOT_OK");
            }
            splitLocation = utterance.lastIndexOf(QUESTION_MARK_SPACE, MAX_UTTERANCE_LENGTH);
            if (DEBUG) {
                System.out.println("(1 QUESTION MARK) - last index at: " + splitLocation);
            }
            if (splitLocation < MIN_UTTERANCE_LENGTH) {
                if (DEBUG) {
                    System.out.println("(2 QUESTION MARK) - NOT_OK");
                }
                splitLocation = utterance.lastIndexOf(EXCLAMATION_MARK_SPACE, MAX_UTTERANCE_LENGTH);
                if (DEBUG) {
                    System.out.println("(2 EXCLAMATION MARK) - last index at: " + splitLocation);
                }
                if (splitLocation < MIN_UTTERANCE_LENGTH) {
                    if (DEBUG) {
                        System.out.println("(3 EXCLAMATION MARK) - NOT_OK");
                    }
                    splitLocation = utterance.lastIndexOf(LINE_SEPARATOR, MAX_UTTERANCE_LENGTH);
                    if (DEBUG) {
                        System.out.println("(3 SEPARATOR) - last index at: " + splitLocation);
                    }
                    if (splitLocation < MIN_UTTERANCE_LENGTH) {
                        if (DEBUG) {
                            System.out.println("(4 SEPARATOR) - NOT_OK");
                        }
                        splitLocation = utterance.lastIndexOf(COMMA_SPACE, MAX_UTTERANCE_LENGTH);
                        if (DEBUG) {
                            System.out.println("(4 COMMA) - last index at: " + splitLocation);
                        }
                        if (splitLocation < MIN_UTTERANCE_LENGTH) {
                            if (DEBUG) {
                                System.out.println("(5 COMMA) - NOT_OK");
                            }
                            splitLocation = utterance.lastIndexOf(JUST_A_SPACE, MAX_UTTERANCE_LENGTH);
                            if (DEBUG) {
                                System.out.println("(5 SPACE) - last index at: " + splitLocation);
                            }
                            if (splitLocation < MIN_UTTERANCE_LENGTH) {
                                if (DEBUG) {
                                    System.out.println("(6 SPACE) - NOT_OK");
                                }
                                splitLocation = MAX_UTTERANCE_LENGTH;
                                if (DEBUG) {
                                    System.out.println("(6 MAX_UTTERANCE_LENGTH) - last index at: " + splitLocation);
                                }
                            } else {
                                if (DEBUG) {
                                    System.out.println("Accepted");
                                }
                                splitLocation -= 1;
                            }
                        }
                    } else {
                        if (DEBUG) {
                            System.out.println("Accepted");
                        }
                        splitLocation -= 1;
                    }
                } else {
                    if (DEBUG) {
                        System.out.println("Accepted");
                    }
                }
            } else {
                if (DEBUG) {
                    System.out.println("Accepted");
                }
            }
        } else {
            if (DEBUG) {
                System.out.println("Accepted");
            }
        }

        success = utterance.substring(0, (splitLocation + 2));
        speakableUtterances.add(success.trim());
        if (DEBUG) {
            System.out.println("Split - Length: " + success.length() + " -:- " + success);
            System.out.println("------------------------------");
        }
        utterance = utterance.substring((splitLocation + 2)).trim();
    }

    speakableUtterances.add(utterance);
    if (DEBUG) {
        System.out.println("Split - Length: " + utterance.length() + " -:- " + utterance);
        final long now = System.nanoTime();
        final long elapsed = now - then;
        System.out.println("ELAPSED: " + TimeUnit.MILLISECONDS.convert(elapsed, TimeUnit.NANOSECONDS));
    }
    return speakableUtterances;
}
It's ugly due to being unable to use regex within lastIndexOf. Ugly aside, it's actually pretty fast.
Problems
Ideally I'd like to use a regex that matches any one of my first-choice delimiters:
private static final String firstChoice = "[.!?" + LINE_SEPARATOR + "]\\s+";
private static final Pattern pFirstChoice = Pattern.compile(firstChoice);
And then use a matcher to resolve the position:
Matcher matcher = pFirstChoice.matcher(input);
if (matcher.find()) {
    splitLocation = matcher.start();
}
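Note that find() scans forward and reports the first match, whereas the split needs the last delimiter inside the allowed window. One possible way around that, as a minimal sketch only: restrict the matcher with region(start, end) and keep the last hit. The class name LastBreakFinder and the method lastBreak are hypothetical, and the line separator is simplified here to \r or \n inside the character class.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LastBreakFinder {

    // Assumption: sentence punctuation or a line-break character, then whitespace.
    static final Pattern FIRST_CHOICE = Pattern.compile("[.!?\\r\\n]\\s+");

    /**
     * Returns the index of the last first-choice delimiter that lies wholly
     * inside [min, max), or -1 if no delimiter is found in that window.
     */
    static int lastBreak(String input, int min, int max) {
        int end = Math.min(max, input.length());
        if (min >= end) {
            return -1; // window is empty, e.g. the input is shorter than min
        }
        Matcher m = FIRST_CHOICE.matcher(input);
        m.region(min, end); // only search between min (inclusive) and end (exclusive)
        int last = -1;
        while (m.find()) {
            last = m.start(); // remember the most recent hit
        }
        return last;
    }
}
```

The returned index points at the punctuation character itself, so the chunk to send would be input.substring(0, index + 1). With the default region settings the delimiter and at least one following whitespace character must both sit inside the window.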
An alternative within my current implementation would be to store the location of each delimiter and then select the one nearest to MAX_UTTERANCE_LENGTH.
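That "nearest to the maximum" idea could be sketched without regex roughly as follows. This is an illustration under assumptions, not a drop-in replacement: the class NearestDelimiter and the method cutPosition are hypothetical names, "\n" stands in for the platform line separator, and all first-choice delimiters are treated as equally good rather than tried in priority order.

```java
final class NearestDelimiter {

    static final int MIN_UTTERANCE_LENGTH = 200;
    static final int MAX_UTTERANCE_LENGTH = 298;

    // Assumption: "\n" stands in for the platform line separator.
    static final String[] FIRST_CHOICE = {". ", "? ", "! ", "\n"};

    /**
     * Returns the position just past the first-choice delimiter closest to
     * MAX_UTTERANCE_LENGTH, or -1 if none starts at or after
     * MIN_UTTERANCE_LENGTH.
     */
    static int cutPosition(String utterance) {
        int bestCut = -1;
        for (String delimiter : FIRST_CHOICE) {
            // Last occurrence starting at or before MAX_UTTERANCE_LENGTH.
            int idx = utterance.lastIndexOf(delimiter, MAX_UTTERANCE_LENGTH);
            if (idx >= MIN_UTTERANCE_LENGTH) {
                bestCut = Math.max(bestCut, idx + delimiter.length());
            }
        }
        return bestCut;
    }
}
```

A caller would take utterance.substring(0, cut).trim() as the chunk and continue with utterance.substring(cut). Because the delimiter length is added inside the method, the separate "splitLocation -= 1" adjustments for one-character delimiters are not needed in this variant.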
I've tried various ways to apply MIN_UTTERANCE_LENGTH and MAX_UTTERANCE_LENGTH to the Pattern, so that it only matches between these values, and to use a lookbehind (?<=) to iterate in reverse, but this is where my knowledge starts to fail me:
private static final String poorEffort = "([.!?]{200, 298})\\s+";
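For context on why that attempt cannot work: a quantifier applies to the element directly before it, so [.!?]{200,298} asks for 200 to 298 punctuation characters in a row (and, as far as I can tell, java.util.regex also rejects the space inside the braces). One way the length window might be expressed instead is to put the quantifier on "any character" and let greedy backtracking find the last delimiter. The sketch below is an assumption-laden illustration: GreedyWindowSplit and leadingChunk are hypothetical names, and only the . ! ? delimiters are handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyWindowSplit {

    static final int MIN_UTTERANCE_LENGTH = 200;
    static final int MAX_UTTERANCE_LENGTH = 298;

    // (?s) lets '.' match line breaks too. The greedy .{199,297} grabs as much
    // as it can, then backtracks until a delimiter followed by whitespace fits,
    // so group 1 is the longest chunk of 200..298 characters ending in . ! or ?
    static final Pattern WINDOW = Pattern.compile(
            "(?s)(.{" + (MIN_UTTERANCE_LENGTH - 1) + ","
                    + (MAX_UTTERANCE_LENGTH - 1) + "}[.!?])\\s+");

    /** Returns the longest qualifying leading chunk, or null if there is none. */
    static String leadingChunk(String input) {
        Matcher m = WINDOW.matcher(input);
        // lookingAt() anchors the match at the start of the input.
        return m.lookingAt() ? m.group(1) : null;
    }
}
```

When leadingChunk returns null, a caller would still need to fall back to second-choice delimiters (comma, space) or a hard cut, as the original cascade does. Whether this is faster than chained lastIndexOf calls is not something the sketch settles; it would need to be measured.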
Finally
Can any of you regex masters achieve what I'm after, and confirm whether it would actually prove more efficient?
I thank you in advance.