I am trying to compare two files: one is plain text (non-English) and the other is a glossary of key-value pairs. They look similar to this:
Japanese Text file:
わたしのなまえはしんです。
ソフトウェアインギネアとしてはたらいています.
En-Jp properties file:
as:と
software:ソフトウェア
me:わたしを
name:なまえ
I:わたしは
working:はたらいています。
...
I am trying to compare the contents of these two files with the code below:
Scanner kb = new Scanner(System.in);
String localtext;
String glossarytext;
File dictionary = new File("./src/main/resources/ZN_EN_Test.txt");
Scanner dictScanner = new Scanner(dictionary);
File list = new File("./src/main/resources/ZN_JP_Test.txt");
try {
    while (dictScanner.hasNextLine()) {
        glossarytext = dictScanner.nextLine();
        try (Scanner listScanner = new Scanner(list)) {
            while (listScanner.hasNextLine()) {
                localtext = listScanner.nextLine();
                if (glossarytext.contains(localtext))
                    System.out.println(localtext);
            }
        }
    }
} catch (NoSuchElementException e) {
    e.printStackTrace();
}
The problem is that, since the Japanese text has no spaces between words, the Scanner-based comparison never seems to satisfy the contains condition.
The same program runs successfully if I arrange the words one per line, like below:
わたしの
なまえ
は
しん
です。
How can I make it find the matching contents without reformatting the Japanese text file?
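
For reference, here is a minimal sketch of one direction that might work without segmenting the Japanese text. It reverses the check: instead of asking whether a glossary line contains a whole line of text, it takes the value after the colon in each glossary entry and asks whether any line of text contains that value. The class name GlossaryMatcher and the method findMatches are hypothetical names used only for illustration. The sketch also makes two assumptions that are not confirmed above: that both files are encoded as UTF-8, and that trailing sentence punctuation on a glossary value (as in the "working" entry) should be ignored when matching.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
class GlossaryMatcher {

    // Returns every glossary entry ("key:value") whose value occurs
    // as a substring of at least one line of the text.
    public static List<String> findMatches(List<String> glossaryLines, List<String> textLines) {
        List<String> matches = new ArrayList<>();
        for (String entry : glossaryLines) {
            int sep = entry.indexOf(':');
            if (sep < 0) {
                continue; // not a key:value line (for example a blank line)
            }
            // Assumption: trailing full stops (ideographic U+3002 or ASCII '.')
            // on the glossary value are not significant for matching.
            String value = entry.substring(sep + 1).trim().replaceAll("[\u3002.]+$", "");
            if (value.isEmpty()) {
                continue;
            }
            for (String line : textLines) {
                // The text line is searched for the glossary value, not the other way round.
                if (line.contains(value)) {
                    matches.add(entry);
                    break; // report each entry once
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        // Paths are the ones from the question; UTF-8 is an assumption.
        List<String> glossary = Files.readAllLines(
                Paths.get("./src/main/resources/ZN_EN_Test.txt"), StandardCharsets.UTF_8);
        List<String> text = Files.readAllLines(
                Paths.get("./src/main/resources/ZN_JP_Test.txt"), StandardCharsets.UTF_8);
        for (String entry : findMatches(glossary, text)) {
            System.out.println(entry);
        }
    }
}
```

With the glossary entries and text shown above, this should report the entries for as, software, name and working. The entries for me and I would not match, because the text contains わたしの rather than わたしを or わたしは. Plain substring matching can also produce false positives for very short values such as と, so a proper word segmenter may still be needed if exact word boundaries matter.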