I have a series of text reports with fields like "Contractile Front velocity" on them. Some of them have "Contractile Front velocitycms" on them instead. There are other terms similar to this where characters like cms have been added.
Each term has a numerical result associated with it, and I am trying to put the term and the result into a database. The database field will be (for this example) "Contractile Front velocitycms".
So I would like to convert any report (plain text) field that does not have cms on the end to Contractile Front velocitycms.
Because I have a lot of find-and-replace problems to solve, I created a method that uses StringUtils.replaceEach so that I can use a simple colon-separated text file as a lookup dictionary for the find and replace:
public static String FindNReplace(String n) throws IOException {
    // Search terms (right of the colon) and their replacements (left of the colon)
    ArrayList<String> orig = new ArrayList<String>();
    ArrayList<String> newDoc = new ArrayList<String>();
    String dictionary = "/Users/sebastianzeki/Documents/workspace/PhysiologyUpperGITotalExtractorv2/src/Overview/FindNReplaceDictionary.txt";
    BufferedReader br = new BufferedReader(new FileReader(dictionary));
    try {
        String line = br.readLine();
        while (line != null) {
            String[] split = line.split(":");
            // Debug output; printing the array directly would only show its reference
            System.out.println(java.util.Arrays.toString(split));
            orig.add(split[1]);
            newDoc.add(split[0]);
            line = br.readLine();
        }
    } finally {
        br.close();
    }
    String[] orig_arr = orig.toArray(new String[orig.size()]);
    String[] newDoc_arr = newDoc.toArray(new String[newDoc.size()]);
    // replaceEach matches the search terms literally, not as regex
    return StringUtils.replaceEach(n, orig_arr, newDoc_arr);
}
The dictionary looks like this (replacement to the left of the colon, text to find to the right):
PostPr :Post-Prandial
PostPr :Post-prandial
Nausea :nausea
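To make the direction of the mapping concrete, here is a minimal sketch of how one dictionary line is split. The class and method names are hypothetical, and the split limit of 2 is an assumption (the method above uses a plain split(":")); it only matters if a search term itself contains a colon.

```java
public class DictionaryLineDemo {
    // Splits one dictionary line into {replacement, searchTerm}.
    // The limit of 2 (an assumption) keeps any later colons inside the search term.
    static String[] parseLine(String line) {
        String[] parts = line.split(":", 2);
        return new String[] { parts[0], parts[1] };
    }

    public static void main(String[] args) {
        String[] pair = parseLine("PostPr :Post-Prandial");
        // Brackets make the trailing space in the replacement visible
        System.out.println("replacement=[" + pair[0] + "] search=[" + pair[1] + "]");
    }
}
```

Note that the replacement keeps the space before the colon, so "Post-Prandial" becomes "PostPr " with a trailing space.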
The problem is that if I just use my dictionary to replace Contractile Front velocity with Contractile Front velocitycms then occasionally, where Contractile Front velocitycms already exists, I will get Contractile Front velocitycmscms, and replaceEach does not use regex. Can anyone think of a way to avoid the duplicated suffix?
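The doubling can be reproduced in a few lines. The sketch below uses String.replace as a stand-in for StringUtils.replaceEach (an assumption made to keep it dependency-free; both match literally). For contrast it also shows a regex with a negative lookahead, which is one possible way to skip terms that already end in cms, not necessarily the best fit for a dictionary-driven approach. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DuplicateSuffixDemo {
    static final String TERM = "Contractile Front velocity";
    static final String SUFFIX = "cms";

    // Literal replacement: also matches the start of terms that already carry the suffix,
    // which is what produces "velocitycmscms".
    static String literalReplace(String text) {
        return text.replace(TERM, TERM + SUFFIX);
    }

    // Regex replacement: the negative lookahead (?!...) rejects a match
    // when the term is immediately followed by the suffix.
    static String guardedReplace(String text) {
        String regex = Pattern.quote(TERM) + "(?!" + Pattern.quote(SUFFIX) + ")";
        return text.replaceAll(regex, Matcher.quoteReplacement(TERM + SUFFIX));
    }

    public static void main(String[] args) {
        String report = "Contractile Front velocity 3.1\nContractile Front velocitycms 2.7";
        System.out.println(literalReplace(report));
        System.out.println(guardedReplace(report));
    }
}
```

Pattern.quote and Matcher.quoteReplacement keep the dictionary text literal, so characters such as "." or "$" in a term would not be treated as regex syntax.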