I cannot sort my list of address strings. I want the order: Ąraków Medyczna 1, Kraków Medyczna 2, Kraków Medyczna 13. But with the first attempt I get: Kraków Medyczna 2, Kraków Medyczna 13, Ąraków Medyczna 1, and with the second I get: Ąraków Medyczna 1, Kraków Medyczna 13, Kraków Medyczna 2.

ArrayList<String> names = new ArrayList<String>();
names.add("Kraków, Medyczna 13");
names.add("Ąraków, Medyczna 1");
names.add("Kraków, Medyczna 2");
Collections.sort(names);
Collections.sort(names, Collator.getInstance(new Locale("PL")));
for(String s : names){
    System.out.println(s);
}
Collections.sort(names, new Comparator<String>() {
    public int compare(String o1, String o2) {

        String o1StringPart = o1.replaceAll("\\d", "");
        String o2StringPart = o2.replaceAll("\\d", "");

        if(o1StringPart.equalsIgnoreCase(o2StringPart))
        {
            return extractInt(o1) - extractInt(o2);
        }
        return o1.compareTo(o2);
    }

    int extractInt(String s) {
        String num = s.replaceAll("\\D", "");
        // return 0 if no digits found
        return num.isEmpty() ? 0 : Integer.parseInt(num);
    }
});

for(String s : names){
    System.out.println(s);    
}
Lauren Rutledge
test stst
  • Please spend a bit of time improving your input. The formatting/indenting of your code is really messed up, which makes it much harder to read than necessary. You want others to spend their time helping you with your problem, so please spend the time required to come up with easy-to-read input. – GhostCat Sep 04 '18 at 14:47
  • @GhostCat I formatted my code – test stst Sep 04 '18 at 14:52
  • Possible duplicate of [Sort on a string that may contain a number](https://stackoverflow.com/questions/104599/sort-on-a-string-that-may-contain-a-number) – fishinear Sep 04 '18 at 14:52
  • Better. But you might also want to improve the formatting of your example data. – GhostCat Sep 04 '18 at 14:55
  • You're probably after a combination of both approaches. In the first try the collator doesn't seem to work properly (maybe there's no real Polish collator in Java and the standard implementation sees `Ą` as greater than `K`) and in your second try you need to extract and parse the numbers before comparing them. – Thomas Sep 04 '18 at 14:56
  • @Thomas so how should it look? – test stst Sep 04 '18 at 15:01

2 Answers

You want to split each string into parts consisting entirely of digits (a number) and parts consisting entirely of non-digits (text), and compare the two strings part by part.

The comparator below loops over pairs of a text part followed by an optional number part.

If only one string starts with a number, it has an empty text as its first part, and will be considered smaller.
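
To make the "part by part" idea concrete, here is a minimal sketch that only does the splitting, not the comparing. The class name `PartSplitter` and the method `split` are made up for illustration and are not part of any library; the second sample string is invented as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PartSplitter {
    // Either a run of digits (a number part) or a run of non-digits (a text part).
    private static final Pattern PART = Pattern.compile("\\d+|\\D+");

    // Splits a string into alternating text and number parts, in order.
    static List<String> split(String s) {
        List<String> parts = new ArrayList<>();
        Matcher m = PART.matcher(s);
        while (m.find()) {
            parts.add(m.group());
        }
        return parts;
    }

    public static void main(String[] args) {
        for (String part : split("Kraków, Medyczna 13")) {
            System.out.println("[" + part + "]");
        }
        for (String part : split("12 Main Street 3a")) {
            System.out.println("[" + part + "]");
        }
    }
}
```

A string that starts with a number simply has no leading text part here, which corresponds to the "empty text as first part" case described above.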

Collections.sort(names, new Comparator<String>() {
        @Override
        public int compare(String o1, String o2) {
            Pattern digits = Pattern.compile("\\d+");
            Matcher m1 = digits.matcher(o1);
            Matcher m2 = digits.matcher(o2);
            int i1 = 0;
            int i2 = 0;
            while (i1 < o1.length() && i2 < o2.length()) {
                // Text part: from the current position up to the next number (or the end).
                boolean b1 = m1.find();
                int j1 = b1 ? m1.start() : o1.length();
                boolean b2 = m2.find();
                int j2 = b2 ? m2.start() : o2.length();
                String part1 = o1.substring(i1, j1);
                String part2 = o2.substring(i2, j2);
                int cmp = part1.compareToIgnoreCase(part2);
                if (cmp != 0) {
                    return cmp;
                }
                if (b1 && b2) {
                    // Number part: compare by numeric value, not character by character.
                    int num1 = Integer.parseInt(m1.group());
                    int num2 = Integer.parseInt(m2.group());
                    cmp = Integer.compare(num1, num2);
                    if (cmp != 0) {
                        return cmp;
                    }
                    i1 = m1.end();
                    i2 = m2.end();
                } else if (b1) {
                    return -1;
                } else if (b2) {
                    return 1;
                } else {
                    return 0; // Both strings ended with equal text parts.
                }
            }
            // One string ran out first: the one with characters left sorts last.
            return Integer.compare(o1.length() - i1, o2.length() - i2);
        }
    });

In Java 8, with a so-called lambda:

Collections.sort(names, (o1, o2) -> {
            Pattern digits = Pattern.compile("\\d+");
            Matcher m1 = digits.matcher(o1);
            Matcher m2 = digits.matcher(o2);
            int i1 = 0;
            int i2 = 0;
            while (i1 < o1.length() && i2 < o2.length()) {
                // Text part: from the current position up to the next number (or the end).
                boolean b1 = m1.find();
                int j1 = b1 ? m1.start() : o1.length();
                boolean b2 = m2.find();
                int j2 = b2 ? m2.start() : o2.length();
                String part1 = o1.substring(i1, j1);
                String part2 = o2.substring(i2, j2);
                int cmp = part1.compareToIgnoreCase(part2);
                if (cmp != 0) {
                    return cmp;
                }
                if (b1 && b2) {
                    // Number part: compare by numeric value, not character by character.
                    int num1 = Integer.parseInt(m1.group());
                    int num2 = Integer.parseInt(m2.group());
                    cmp = Integer.compare(num1, num2);
                    if (cmp != 0) {
                        return cmp;
                    }
                    i1 = m1.end();
                    i2 = m2.end();
                } else if (b1) {
                    return -1;
                } else if (b2) {
                    return 1;
                } else {
                    return 0; // Both strings ended with equal text parts.
                }
            }
            // One string ran out first: the one with characters left sorts last.
            return Integer.compare(o1.length() - i1, o2.length() - i2);
        });

This is quite verbose, and there is a "simple" solution since Java 9: format all numbers to a fixed width, here left-padded with zeroes up to 10 positions.

Pattern digits = Pattern.compile("\\d+");
Collections.sort(names, (o1, o2) ->
    String.CASE_INSENSITIVE_ORDER.compare(
            digits.matcher(o1).replaceAll(mr -> String.format("%010d", Integer.parseInt(mr.group()))),
            digits.matcher(o2).replaceAll(mr -> String.format("%010d", Integer.parseInt(mr.group())))
    ));

Since Java 9, Matcher.replaceAll has an overload that can be passed a replacing function. (String.replaceAll itself only accepts a replacement string.)

Even a bit more elegant, without repeating oneself:

Pattern digits = Pattern.compile("\\d+");
Function<String, String> numFormatter = s -> digits.matcher(s).replaceAll(
        mr -> String.format("%010d", Integer.parseInt(mr.group())));
Collections.sort(names, (o1, o2) ->
        numFormatter.apply(o1).compareToIgnoreCase(numFormatter.apply(o2))
    );

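Finally, there is a utility function for sorting by any converted value, such as the result of a field's getter: Comparator.comparing(converter) sorts by the natural order of the converted values, and Comparator.comparing(converter, otherComparator) compares the converted values with otherComparator.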
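
As a small illustration of the two `Comparator.comparing` forms, here is a sketch that sorts objects by a getter. The `Address` class, its fields, and the sample data are hypothetical and exist only for this example; they are not part of the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ComparingDemo {
    // Hypothetical value class, only here to have getters to sort by.
    static final class Address {
        private final String city;
        private final int houseNumber;

        Address(String city, int houseNumber) {
            this.city = city;
            this.houseNumber = houseNumber;
        }

        String getCity() { return city; }
        Integer getHouseNumber() { return houseNumber; }
    }

    // Sorts by city ignoring case, then by house number in natural order.
    static List<Address> sortByCityThenNumber(List<Address> input) {
        // Two-argument form: extract the key, compare keys with the given comparator.
        Comparator<Address> byCity =
                Comparator.comparing(Address::getCity, String.CASE_INSENSITIVE_ORDER);
        // One-argument form: extract the key, compare keys by their natural order.
        Comparator<Address> byNumber = Comparator.comparing(Address::getHouseNumber);

        List<Address> copy = new ArrayList<>(input);
        copy.sort(byCity.thenComparing(byNumber));
        return copy;
    }

    public static void main(String[] args) {
        List<Address> sorted = sortByCityThenNumber(Arrays.asList(
                new Address("kraków", 13),
                new Address("Gdańsk", 5),
                new Address("Kraków", 2)));
        for (Address a : sorted) {
            System.out.println(a.getCity() + " " + a.getHouseNumber());
        }
    }
}
```

Keeping the number in its own field, as this sketch does, avoids the string parsing entirely; whether that is possible depends on where the addresses come from.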

To sort it by your locale/language:

Locale locale = new Locale("pl", "PL");
Collator collator = Collator.getInstance(locale); // How to sort on special letters
Pattern digits = Pattern.compile("\\d+");
Function<String, String> numFormatter = s -> digits.matcher(s /*.toUpperCase(locale)*/).replaceAll(
        mr -> String.format("%010d", Integer.parseInt(mr.group())));
Collections.sort(names, Comparator.comparing(numFormatter, collator));

The Collator is a Comparator but with built-in sorting for the given language. It behaves better on accented letters. I dropped the case insensitive comparison here, as it might not be needed; otherwise use String.toUpperCase(Locale).
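
For reference, here is a self-contained sketch of the collator approach applied to the three addresses from the question. It assumes Java 9 or later, because it relies on `Matcher.replaceAll` with a function argument. The class name `PolishAddressSort` and its method names are made up for this example, and `Locale.forLanguageTag("pl-PL")` is used in place of the `Locale` constructor only because the constructor is deprecated in recent JDKs.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;
import java.util.regex.Pattern;

final class PolishAddressSort {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Left-pads every number in the string with zeroes to a width of 10 digits.
    static String padNumbers(String s) {
        return DIGITS.matcher(s).replaceAll(
                mr -> String.format("%010d", Integer.parseInt(mr.group())));
    }

    // Returns a sorted copy: Polish collation applied to the zero-padded strings.
    static List<String> sorted(List<String> names) {
        Collator collator = Collator.getInstance(Locale.forLanguageTag("pl-PL"));
        Function<String, String> key = PolishAddressSort::padNumbers;
        List<String> copy = new ArrayList<>(names);
        copy.sort(Comparator.comparing(key, collator));
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "Kraków, Medyczna 13",
                "Ąraków, Medyczna 1",
                "Kraków, Medyczna 2");
        for (String s : sorted(names)) {
            System.out.println(s);
        }
    }
}
```

The padded strings are only used as sort keys; the list still contains the original, unpadded addresses. Numbers that do not fit in an `int` would make `Integer.parseInt` throw, so this sketch is only suitable for house-number-sized values.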

This is a bit much, I am not entirely sure about Android's java, or whether the code compiles (typos), but enjoy.

Joop Eggen

Your custom comparator is almost fine; you just forgot to use the correct comparison for the Polish alphabet. In plain String comparison, "Ą" comes after "K".

Change

return o1.compareTo(o2);

to

return Collator.getInstance(new Locale("PL")).compare(o1, o2);
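
Put together, the question's comparator with this one change might look like the sketch below. The class name `FixedComparatorDemo` and the method `addressOrder` are made up for this example. As a small deviation from the one-line change, the collator is created once rather than on every comparison, and `Locale.forLanguageTag("pl")` stands in for `new Locale("PL")`.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

final class FixedComparatorDemo {
    // Returns all digits of the string as one number, or 0 if there are none.
    static int extractInt(String s) {
        String num = s.replaceAll("\\D", "");
        return num.isEmpty() ? 0 : Integer.parseInt(num);
    }

    // The comparator from the question, with the fallback done by a Polish collator.
    static Comparator<String> addressOrder() {
        Collator polish = Collator.getInstance(Locale.forLanguageTag("pl"));
        return (o1, o2) -> {
            String o1StringPart = o1.replaceAll("\\d", "");
            String o2StringPart = o2.replaceAll("\\d", "");
            if (o1StringPart.equalsIgnoreCase(o2StringPart)) {
                // Same text: decide by the numeric value.
                return extractInt(o1) - extractInt(o2);
            }
            // Different text: let the collator decide, so "Ą" sorts near "A".
            return polish.compare(o1, o2);
        };
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList(
                "Kraków, Medyczna 13",
                "Ąraków, Medyczna 1",
                "Kraków, Medyczna 2"));
        names.sort(addressOrder());
        for (String s : names) {
            System.out.println(s);
        }
    }
}
```

Note that `extractInt` glues all digits of a string together, so this only behaves sensibly when each address contains a single number.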
Malte Hartwig