I have an array of strings: 15MB, 12MB, 1TB, 1GB. I want to compare them lexicographically, with the one rule that MB is smaller than GB and TB. So at the end I want to get: 12MB, 15MB, 1GB, 1TB. I found a way to compare the letters:

final static String ORDER = "MGT";

public int compare(String o1, String o2) {
    int pos1 = 0;
    int pos2 = 0;
    for (int i = 0; i < Math.min(o1.length(), o2.length()) && pos1 == pos2; i++) {
        pos1 = ORDER.indexOf(o1.charAt(i));
        pos2 = ORDER.indexOf(o2.charAt(i));
    }

    if (pos1 == pos2 && o1.length() != o2.length()) {
        return o1.length() - o2.length();
    }

    return pos1 - pos2;
}

I'm thinking of splitting each string into its number and its letters, but then how can I sort first by the letters ("MB", ...) and then by the numbers? Do I use two comparators or something else?

bfury
  • Take a look at: https://stackoverflow.com/questions/13973503/sorting-strings-that-contains-number-in-java there is an interesting solution there. – Laguh Apr 21 '19 at 21:20
  • Already did but they compare only the numbers not the letters. – bfury Apr 21 '19 at 21:25

3 Answers

It will be much easier to compare if you first convert the data to a common unit (e.g. MB). If two values are the same after this conversion, fall back to comparing the units themselves. It may look like this:

final static String ORDER = "MGT";

// Converts a size such as "15MB", "1GB" or "1TB" to megabytes.
// long is used because 1024 * 1024 * n overflows int for large TB values.
private long convertToMegaBytes(String s) {
    // The unit letter is the second-to-last character; the number precedes it.
    char c = s.charAt(s.length() - 2);
    long n = Long.parseLong(s.substring(0, s.length() - 2));

    if (c == 'G')
        return 1024L * n;
    if (c == 'T')
        return 1024L * 1024L * n;

    return n;
}

public int compare(String o1, String o2) {
    int v = Long.compare(convertToMegaBytes(o1), convertToMegaBytes(o2));
    // if values are equal then compare by unit order (M < G < T)
    return v == 0 ? ORDER.indexOf(o1.charAt(o1.length() - 2)) - ORDER.indexOf(o2.charAt(o2.length() - 2)) : v;
}
guleryuz
  • That's very helpful but not what I'm trying to do. I just need to compare the given values, not parse them. So for example if I'm given 1GB, 1024MB I'll still need them to be sorted like this: 1024MB, 1GB – bfury Apr 21 '19 at 21:24
This might do the trick. The compare method gets the number of bytes that each String represents as a long (10KB becomes 10240, since the units are treated as multiples of 1024) and then compares those. The getSizeOfString method turns a String into a long holding the number of bytes it represents.

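  // Needs: import java.util.regex.Matcher; import java.util.regex.Pattern;
  // Compiled once; T is included so that values such as 1TB are accepted.
  private static final Pattern VALID_SIZE_PATTERN = Pattern.compile("(\\d+)([KMGT])B");

  public int compare(String o1, String o2) {
    long size1 = getSizeOfString(o1);
    long size2 = getSizeOfString(o2);
    return Long.compare(size1, size2);
  }

  private long getSizeOfString(String sizeString) {
    Matcher matcher = VALID_SIZE_PATTERN.matcher(sizeString);
    // Fail with a clear message instead of an IllegalStateException from group()
    if (!matcher.find()) {
      throw new IllegalArgumentException("Not a valid size: " + sizeString);
    }
    long size = Long.parseLong(matcher.group(1));

    switch (matcher.group(2)) {
      case "K":
        size *= 1024L;
        break;
      case "M":
        size *= 1024L * 1024;
        break;
      case "G":
        size *= 1024L * 1024 * 1024;
        break;
      case "T":
        size *= 1024L * 1024 * 1024 * 1024;
        break;
    }
    return size;
  }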
cheemcheem
  • Helpful, but not what I'm trying to do. No need for parsing. If I get 1001MB and 1GB I will still need to sort them in the same order. – bfury Apr 21 '19 at 21:29
  • Try it with those two numbers, it will say 1001MB is more than 1GB. Is that not what you are looking for? – cheemcheem Apr 21 '19 at 21:33
  • The parsing just lets them be compared easier, is there a performance requirement that you are looking for as well? – cheemcheem Apr 21 '19 at 21:34
  • No I want 1001MB to be sorted as less than 1GB – bfury Apr 21 '19 at 22:07
  • But I presume you would want 1025MB to be greater than 1GB. So the order would be 1GB, 1025MB. – WJS Apr 21 '19 at 22:09
  • I changed them to use KiB, MiB, and GiB. So they will now sort like 1023MB < 1GB < 1025MB? – cheemcheem Apr 21 '19 at 22:19
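
The ordering discussed in these comments can be checked with a small self-contained sketch. It is not cheemcheem's code: the class name BinarySizeOrderDemo and its helpers are hypothetical, and it assumes every input has the form digits, an optional K/M/G/T, then B, with binary multiples (1 KB = 1024 B). Note that this is the size-based order, which is not the unit-first order the OP asks for.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class BinarySizeOrderDemo {
    // Assumed input shape: digits, optional K/M/G/T, then B.
    private static final Pattern SIZE = Pattern.compile("(\\d+)([KMGT]?)B");

    // Converts e.g. "1GB" to bytes using binary multiples (1 KB = 1024 B).
    // Very large values could overflow long; that case is not handled here.
    static long toBytes(String s) {
        Matcher m = SIZE.matcher(s);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a size: " + s);
        }
        long value = Long.parseLong(m.group(1));
        // Position in "KMGT" gives the power of 1024; no letter means plain bytes.
        int power = m.group(2).isEmpty() ? 0 : "KMGT".indexOf(m.group(2)) + 1;
        return value << (10 * power);
    }

    // Returns a new list sorted by the number of bytes each entry represents.
    static List<String> sortBySize(List<String> sizes) {
        List<String> copy = new ArrayList<>(sizes);
        copy.sort(Comparator.comparingLong(BinarySizeOrderDemo::toBytes));
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortBySize(Arrays.asList("1025MB", "1GB", "1023MB")));
        // prints [1023MB, 1GB, 1025MB]
    }
}
```

With 1 GB taken as 1024 MB, 1GB lands between 1023MB and 1025MB, which is why a size-based comparator cannot produce the order 1200MB before 1GB.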
This now sorts first on units and then on values within units. This was changed to reflect the last comment by the OP.

import java.util.*;

// Rank of each unit; only the relative order matters, not the magnitude.
enum Memory {
   B(1), KB(2), MB(3), GB(4), TB(5);
   public final long val;

   private Memory(long val) {
      this.val = val;
   }
}

public class MemorySort {
   public static void main(String[] args) {
      List<String> memory = Arrays.asList("122003B",
            "1TB",
            "2KB",
            "100000MB",
            "1027MB",
            "2024GB");

      // Strip the digits to get the unit name, then look up its rank.
      Comparator<String> units = Comparator.comparing(
            a -> Memory.valueOf(a.replaceAll("\\d+", "")).val);

      // Strip the letters to get the numeric part; long avoids int overflow.
      Comparator<String> values = Comparator.comparing(
            a -> Long.parseLong(a.replaceAll("[A-Z]+", "")));

      // Sort by unit first, then by value within the same unit.
      Collections.sort(memory, units.thenComparing(values));
      System.out.println(memory);
   }
}


WJS
  • Very helpful but not what I need at all. Simply put if I have: 1200MB,2MB,1GB I want them to be sorted as: 2MB, 1200MB, 1GB – bfury Apr 22 '19 at 06:22
  • I modified it to reflect the requirement. It still isn't very efficient with the string replacements but it works. – WJS Apr 22 '19 at 12:26
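
On the efficiency remark above: one possible way to avoid running the string replacements on every comparison is to parse each entry once before sorting. The following is only a sketch under assumptions, not WJS's code: the class UnitThenValueSort, the nested Parsed holder, and the UNIT_ORDER list are hypothetical names, and inputs are assumed to be digits followed by one of B, KB, MB, GB, TB.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class; not part of any answer above.
class UnitThenValueSort {
    // Assumed input shape: digits followed by B, KB, MB, GB or TB.
    private static final Pattern SIZE = Pattern.compile("(\\d+)([KMGT]?B)");
    private static final List<String> UNIT_ORDER = Arrays.asList("B", "KB", "MB", "GB", "TB");

    // Holds the pieces of one entry so each string is parsed only once.
    private static final class Parsed {
        final String text;
        final int unitRank;
        final long value;

        Parsed(String text) {
            Matcher m = SIZE.matcher(text);
            if (!m.matches()) {
                throw new IllegalArgumentException("Not a size: " + text);
            }
            this.text = text;
            this.value = Long.parseLong(m.group(1));
            this.unitRank = UNIT_ORDER.indexOf(m.group(2));
        }
    }

    // Returns a new list ordered by unit first, then by number within a unit.
    static List<String> sort(List<String> sizes) {
        List<Parsed> parsed = new ArrayList<>();
        for (String s : sizes) {
            parsed.add(new Parsed(s));
        }
        parsed.sort(Comparator.<Parsed>comparingInt(p -> p.unitRank)
                .thenComparingLong(p -> p.value));
        List<String> result = new ArrayList<>();
        for (Parsed p : parsed) {
            result.add(p.text);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sort(Arrays.asList("1200MB", "2MB", "1GB")));
        // prints [2MB, 1200MB, 1GB]
    }
}
```

The numbers are never converted to a common unit, so 1200MB still sorts before 1GB, as the OP requested.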