I am using Java 1.4 (a client requirement) together with lucene-core-2.9.2.jar and lucene-demos-2.9.2.jar, and I build with Ant. Indexing works fine for all directories except those whose names contain Unicode or Scandinavian characters.
When I list a directory with listFiles(), every entry is returned, but names containing Unicode characters are displayed as blocks. When the code then checks each entry with isDirectory(), it does not recognize the folders whose names are in other languages (containing Unicode or Scandinavian characters), so those folders are not indexed.
How can I solve this problem so that names with Unicode and Scandinavian characters are handled?
With Java 6 or 7 it works well. However, the client requires Java 1.4, so please do not suggest moving to Java 5, 6 or 7; I am looking for other answers. To make the problem clearer, I have added my code below:
    public void addIntoIndex(File dir, IndexWriter indexWriter) {
        try {
            System.out.println("Now in addIntoIndex");
            File[] htmls = dir.listFiles();
            // listFiles() returns null if dir is not a directory or cannot be read
            if (htmls == null) {
                System.out.println("Cannot list: " + dir.getAbsolutePath());
                return;
            }
            /** "Release_Notes" folder will be excluded for indexing */
            if (dir.getName().equals("Release_Notes") && this.searchOption.equals("systemHelp")) {
                System.out.println("'Release_Notes' folder will be excluded for indexing.");
                return;
            }
            for (int i = 0; i < htmls.length; i++) {
                String htmlPath = htmls[i].getAbsolutePath();
                // Recurse into sub-directories
                if (htmls[i].isDirectory()) {
                    addIntoIndex(new File(htmls[i].getAbsolutePath()), indexWriter);
                }
                // Index HTML files
                if (htmlPath.endsWith(".html") || htmlPath.endsWith(".htm")) {
                    addDocument(htmlPath, indexWriter);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
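
To separate a console display problem from names that are already damaged when listFiles() returns them, a small diagnostic along the following lines may help. It is only a sketch and not part of my application: the class name NameDump and the helper escape() are hypothetical, and it uses only calls that exist in Java 1.4. It prints every entry name with non-ASCII characters written as backslash-u escapes, together with what isDirectory() and exists() report for that entry.

```java
import java.io.File;

public class NameDump {

    // Hypothetical helper: keeps printable ASCII as-is and writes every
    // other character as a backslash-u escape with four hex digits.
    static String escape(String s) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x20 && c < 0x7f) {
                sb.append(c);
            } else {
                String hex = Integer.toHexString(c);
                sb.append("\\u");
                // Left-pad the hex value to four digits
                for (int p = hex.length(); p < 4; p++) {
                    sb.append('0');
                }
                sb.append(hex);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Directory to inspect; defaults to the current directory
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println("file.encoding = " + System.getProperty("file.encoding"));

        File[] entries = dir.listFiles();
        if (entries == null) {
            System.out.println("Cannot list: " + dir.getAbsolutePath());
            return;
        }
        for (int i = 0; i < entries.length; i++) {
            System.out.println(escape(entries[i].getName())
                    + "  isDirectory=" + entries[i].isDirectory()
                    + "  exists=" + entries[i].exists());
        }
    }
}
```

If the escapes show the expected code points, the blocks are only a console rendering issue. If the names contain '?' or the replacement character U+FFFD instead, the names were already lost before my code saw them, which would also explain why isDirectory() reports false for those entries.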