You could probably implement a Comparator (or extend Collator) that ranks Latin before CJK by matching each string against Unicode script classes with a regex, like this:
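    import java.text.Collator;
    import java.util.Comparator;

    public class LatinBeforeCJKCollator implements Comparator<String> {
        // Used whenever the script check below does not decide the order.
        private final Collator collator;

        public LatinBeforeCJKCollator(Collator collator) {
            this.collator = collator;
        }

        @Override
        public int compare(String source, String target) {
            // source is entirely Hiragana/Katakana/Han, target is entirely Latin
            if (source.matches("[\\p{IsHiragana}\\p{IsKatakana}\\p{IsHan}]+") && target.matches("\\p{IsLatin}+")) {
                return -1;
            }
            // source is entirely Latin, target is entirely Hiragana/Katakana/Han
            if (source.matches("\\p{IsLatin}+") && target.matches("[\\p{IsHiragana}\\p{IsKatakana}\\p{IsHan}]+")) {
                return 1;
            }
            // Same script, mixed scripts, or other characters: defer to the wrapped collator
            return collator.compare(source, target);
        }
    }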
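I took the Unicode script classes from an answer to this question:
How can I detect japanese text in a Java string?
You might need to customize the matching to suit your needs (e.g. all letters are Latin, only the first letter is Latin, etc.). As written, both patterns require every character of the string to belong to the script, so a string containing digits, spaces or mixed scripts matches neither pattern and is ordered by the wrapped collator alone.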
When used like this:
    final Comparator<String> comparator = new LatinBeforeCJKCollator(Collator.getInstance(Locale.JAPANESE));
    List<String> strings = List.of("Alpha", "Beta", "問屋", "家事問屋");
    System.out.println(strings.stream().sorted(comparator).collect(Collectors.joining(",")));
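Then the output is sorted like this:
家事問屋,問屋,Alpha,Beta
Note that compare returns -1 when the first string is CJK and the second is Latin, so the CJK strings come first here. Swap the two return values (-1 and 1) if you want the Latin strings first.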
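As an illustration of the "first letter" customization mentioned above, here is a minimal sketch that classifies each string by its first code point using Character.UnicodeScript instead of a regex. The class name FirstLetterScriptComparator and the group numbers are my own assumptions, not part of any library. Lower groups sort first, so Latin comes first here, and strings starting with anything else (digits, punctuation, the empty string) go last.

```java
import java.text.Collator;
import java.util.Comparator;
import java.util.Locale;

// Hypothetical variant: decides the script group from the first code point only.
final class FirstLetterScriptComparator implements Comparator<String> {
    private final Collator collator;

    FirstLetterScriptComparator(Collator collator) {
        this.collator = collator;
    }

    // 0 = starts with a Latin letter, 1 = starts with Hiragana/Katakana/Han,
    // 2 = anything else, including the empty string.
    static int group(String s) {
        if (s.isEmpty()) {
            return 2;
        }
        switch (Character.UnicodeScript.of(s.codePointAt(0))) {
            case LATIN:
                return 0;
            case HIRAGANA:
            case KATAKANA:
            case HAN:
                return 1;
            default:
                return 2;
        }
    }

    @Override
    public int compare(String a, String b) {
        // Order by script group first; within a group, defer to the wrapped collator.
        int byGroup = Integer.compare(group(a), group(b));
        return byGroup != 0 ? byGroup : collator.compare(a, b);
    }
}
```

It is used the same way, e.g. `strings.stream().sorted(new FirstLetterScriptComparator(Collator.getInstance(Locale.JAPANESE)))`. Because every string gets exactly one group number, the ordering stays consistent even for mixed-script input; to put CJK first, swap the values returned for groups 0 and 1.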