We're in the process of importing our documentation library to SharePoint, and I'm using a Java program I wrote to build metadata for these documents. One of the things that I need to do is determine if a document has a cross-referenced document. This condition is defined as having the phrase "see " in the document name. However, naming conventions are nonexistent, and all of the following variations exist:
document_see_other_document.doc
document_-_see_other_document.doc
document_(see_other_document).doc
document_[see_other_document].doc
document_{see_other_document}.doc
I have created a variable which defaults as such: String xref = "no cross reference"; I would like to set this String to "see_other_document" in cases where there is a see <other document> substring in the filename.
My plan is to look for an instance of see_, use that as the start point of a substring, and end at the "." (non-inclusive). But I want to eliminate any special characters that may exist. In the cases above, I would like to return five instances of other_document, not other_document), etc.
My thought was to pull the substring into a variable, then use a regex [^a-zA-Z0-9] and replace non-alphanumeric characters in that second string variable, but is there a better, more elegant way to skin this cat?
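PSEUDOCODE:
if (fileName.indexOf("see_") > -1) {
    // take everything after "see_" up to, but not including, the last "."
    String tempFilename = fileName.substring(fileName.indexOf("see_") + 4, fileName.lastIndexOf("."));
    // strip non-alphanumeric characters (note: this pattern also removes underscores)
    xref = tempFilename.replaceAll("[^a-zA-Z0-9]", "");
}
// otherwise xref keeps its default value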
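One possible alternative to the substring-then-replace plan is a single regex with a capture group, sketched below. It assumes the referenced name consists only of letters, digits, and underscores, so the match stops on its own at the first ")", "]", "}", or ".". The class name XrefExtractor and the method extractXref are hypothetical names chosen for illustration; they do not come from the original program.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class XrefExtractor {
    // "see_" followed by a run of word characters (letters, digits, underscore).
    // The capture group ends at the first character outside that set.
    private static final Pattern SEE = Pattern.compile("see_(\\w+)");

    // Hypothetical helper: returns the captured name, or the default text
    // when the file name contains no "see_" reference.
    static String extractXref(String fileName) {
        Matcher m = SEE.matcher(fileName);
        return m.find() ? m.group(1) : "no cross reference";
    }

    public static void main(String[] args) {
        String[] names = {
            "document_see_other_document.doc",
            "document_-_see_other_document.doc",
            "document_(see_other_document).doc",
            "document_[see_other_document].doc",
            "document_{see_other_document}.doc",
            "plain_document.doc"
        };
        for (String name : names) {
            System.out.println(name + " -> " + extractXref(name));
        }
    }
}
```

With the five sample names this yields other_document each time, and the default string for a name without "see_". The pattern is case-sensitive and would also match "see_" inside a longer word, so it may need tightening for real file names.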