I was given a task to sort a text file, with these requirements:

  1. Sort rows according to their columns.
  2. Sort rows by the 1st column; if the values in that column are the same, then by the 2nd column, and so on. The data within each row must remain unchanged after the sort.
  3. Numbers must be sorted in ascending order and letters in alphabetical order; number goes higher than letter.
  4. Columns are separated with a tab ("\t").

What I have done: read the file and copy everything into a List<List<String>>, where every element of the outer list is one line from the file, stored as a List<String>. Here is the code:

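import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReadDataFile {
    // Reads fileName + ".txt" and returns one List<String> of tab-separated values per line.
    public static List<List<String>> readData(String fileName) throws IOException {
        List<List<String>> data = new ArrayList<List<String>>();
        // try-with-resources closes the reader even if readLine() throws
        try (BufferedReader br = new BufferedReader(new FileReader(fileName + ".txt"))) {
            String line;
            while ((line = br.readLine()) != null) {
                List<String> lines = Arrays.asList(line.split("\t"));
                data.add(lines);
                System.out.println(lines);
            }
        }
        return data;
    }
}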

and write the data to another file:

public void writeToFile(String fileName) throws IOException {
    FileWriter writer = new FileWriter(fileName);
    List<List<String>> data = ReadDataFile.readData("input");

    Collections.sort(data, new Comparator<List<String>>() {
        @Override
        public int compare(List<String> o1, List<String> o2) {
            // TODO Auto-generated method stub
            return o1.get(0).compareTo(o2.get(0));
        }
    });

    for (List<String> lines : data) {
        for (int i = 0; i < lines.size(); i++) {
            writer.write(lines.get(i));
            if (i < lines.size() - 1) {
                writer.write("\t");
            }
        }
        writer.write("\n");
    }
    writer.close();
}

the problem is that:

public int compare(List<String> o1, List<String> o2) {
  // TODO Auto-generated method stub
  return o1.get(0).compareTo(o2.get(0));
}

doesn't sort the way I need.

Here is an example of the input file:

-2.2 2 3 4 329 2
2.2 12345q 69 -afg
2.2 12345q 69 -asdf
-22 1234234 asdfasf asdgas
-22 11 abc
-22 -3 4
-1.1
qqqq 1.1

and the expected output is:

-22 -3 4
-22 11 abc
-22 1234234 asdfasf asdgas
-2.2 2 3 4 329 2
-1.1
 2.2 12345q 69 -afg
 2.2 12345q 69 -asdf
 qqqq 1.1

but what I get is:

-1.1
-2.2 2 3 4 329 2
-22 -3 4
-22 11 abc
-22 1234234 asdfasf asdgas
 2.2 12345q 69 -afg
 2.2 12345q 69 -asdf
 qqqq 1.1

The question is: how do I write a proper sort? Thanks for the answers.

Stonecold
  • Your problem isn't sorting lists but comparing strings that contain numbers and expecting the comparison to behave as if they were numbers, i.e. when comparing strings `"2"` is greater than `"10"` because the characters are compared and `'2'` is greater than `'1'` (likewise `"-22"` is greater than `"-2.2"` because `'2'` and `'.'` are compared). You'd have to parse the strings to get a number comparison instead (and check if they are numbers at all). – Thomas Jun 01 '17 at 16:02
  • You can use Collections.sort; please see this link: https://stackoverflow.com/questions/6957631/sort-java-collection – David Hackro Jun 01 '17 at 16:04
  • @DavidHackro he's already using that: `Collections.sort(data, new Comparator<List<String>>() { ... })`. – Thomas Jun 01 '17 at 16:06
  • An `S` is *higher* than a `C`, so it sorts *later*. Saying that *"number goes higher than letter"* means that number sorts *after* letter, but your example is the opposite. Please clarify language. Also, is letter sorting case-sensitive? How about accented letters? Do you want to sort according to some language, e.g. in German, `ü` sorts with `u`? – Andreas Jun 01 '17 at 16:20
  • You are not fully telling us how you want the sorting to work. Looking at your expected output, you expect string values that are numeric to be sorted as numbers and not as strings. If that's the case, then in your comparator you need to check whether the string is numeric, parse it to a Number, and compare the values as numbers. Secondly, you don't check whether compareTo returns 0, meaning that the first column values are the same, in which case you should return the result of comparing the second column. – tsolakp Jun 01 '17 at 16:35
  • By "number goes higher" I mean that numbers are printed before letters; the sort is case-insensitive and uses the common English alphabet. – Stonecold Jun 01 '17 at 16:39
  • Sorry @Thomas, exactly what is your condition in the comparison of the lists? Maybe size? – David Hackro Jun 01 '17 at 16:46
  • @Stonecold - you need more logic which compares (number1 and number2) or (number1 and String1) or (String1 and String2) – Minh Kieu Jun 01 '17 at 16:48
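
The point made in the comments about string comparison can be checked with a small sketch. The class and method names here (`StringVsNumberCompare`, `asStrings`, `asNumbers`) are made up for illustration; the values are the ones mentioned above.

```java
import java.math.BigDecimal;

public class StringVsNumberCompare {
    // Compares two values as plain strings (character by character).
    static int asStrings(String a, String b) {
        return Integer.signum(a.compareTo(b));
    }

    // Compares two values numerically; both must be valid decimal numbers.
    static int asNumbers(String a, String b) {
        return Integer.signum(new BigDecimal(a).compareTo(new BigDecimal(b)));
    }

    public static void main(String[] args) {
        System.out.println(asStrings("2", "10"));     // 1: '2' is greater than '1'
        System.out.println(asNumbers("2", "10"));     // -1: 2 is less than 10
        System.out.println(asStrings("-22", "-2.2")); // 1: '2' is greater than '.'
        System.out.println(asNumbers("-22", "-2.2")); // -1: -22 is less than -2.2
    }
}
```

This is also why `-1.1` ends up first in the actual output of the question: as a string, `"-1.1"` is less than `"-2.2"` because `'1'` is less than `'2'`.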

1 Answer

It seems you want string values that are valid numbers to be sorted using number comparison. Since your example contains non-integer values, you can do the number comparisons using double or BigDecimal. The code below uses BigDecimal so numbers of any size can be compared without loss of precision, but it doesn't support the special values "Infinity", "-Infinity", and "NaN", or the HexFloatingPointLiteral format, all of which Double.parseDouble() supports.
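
To make the trade-off between the two parsers concrete, here is a minimal sketch. The class and helper names (`NumberParsing`, `tryBigDecimal`, `isDouble`) are hypothetical, and the sample values are chosen only for illustration.

```java
import java.math.BigDecimal;

public class NumberParsing {
    // Returns the parsed value, or null when BigDecimal does not accept the text.
    static BigDecimal tryBigDecimal(String s) {
        try {
            return new BigDecimal(s);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Returns true when Double.parseDouble accepts the text.
    static boolean isDouble(String s) {
        try {
            Double.parseDouble(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "NaN", "Infinity" and the hex literal parse as double but not as BigDecimal.
        for (String s : new String[] {"-2.2", "12345q", "NaN", "Infinity", "0x1p3"}) {
            System.out.println(s + " -> BigDecimal: " + (tryBigDecimal(s) != null)
                    + ", double: " + isDouble(s));
        }
        // Precision: these two values differ by 1 but round to the same double.
        String a = "9007199254740993", b = "9007199254740992";
        System.out.println(Double.parseDouble(a) == Double.parseDouble(b)); // true
        System.out.println(new BigDecimal(a).compareTo(new BigDecimal(b))); // 1
    }
}
```

If the input never contains very large or very precise numbers, double comparison would probably be good enough; BigDecimal just avoids having to think about it.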

When comparing a number to a string, the number should sort before the string.

For comparing string vs. string, you can sort lexicographically, case-insensitively, or using a Collator for locale-sensitive comparisons. The code below uses a Collator for the default locale.
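
As a rough illustration of how those three options differ, the sketch below compares the same pairs with each. The class and method names are invented for this example, and `Locale.ENGLISH` is an assumption made so the result doesn't depend on the machine's default locale. The `PRIMARY` strength variant is one possible way to get a case-insensitive Collator; note that it also ignores accents.

```java
import java.text.Collator;
import java.util.Locale;

public class StringCompareOptions {
    // Plain lexicographic comparison by UTF-16 code unit: uppercase sorts before lowercase.
    static int lexicographic(String a, String b) {
        return Integer.signum(a.compareTo(b));
    }

    // Case-insensitive comparison, not locale-aware.
    static int caseInsensitive(String a, String b) {
        return Integer.signum(a.compareToIgnoreCase(b));
    }

    // Locale-sensitive comparison (English locale assumed here).
    static int collated(String a, String b) {
        return Integer.signum(Collator.getInstance(Locale.ENGLISH).compare(a, b));
    }

    // Locale-sensitive comparison that ignores case (and accents).
    static int collatedPrimary(String a, String b) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        collator.setStrength(Collator.PRIMARY);
        return Integer.signum(collator.compare(a, b));
    }

    public static void main(String[] args) {
        System.out.println(lexicographic("B", "a"));       // -1: 'B' (66) is less than 'a' (97)
        System.out.println(caseInsensitive("B", "a"));     // 1
        System.out.println(collated("B", "a"));            // 1
        System.out.println(collatedPrimary("abc", "ABC")); // 0
    }
}
```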

The comparison looks at the first value of each list and, if those are equal, compares the second value, and so forth. If one list is shorter, and the lists are equal up to that point, the shorter list sorts first.

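import java.math.BigDecimal;
import java.text.Collator;
import java.util.Comparator;
import java.util.List;

public final class NumberStringComparator implements Comparator<List<String>> {
    // Locale-sensitive string comparison for the default locale
    private final Collator collator = Collator.getInstance();

    @Override
    public int compare(List<String> r1, List<String> r2) {
        for (int i = 0; ; i++) {
            // Rows are equal so far: the shorter row sorts first
            if (i == r1.size())
                return (i == r2.size() ? 0 : -1);
            if (i == r2.size())
                return 1;
            String v1 = r1.get(i), v2 = r2.get(i);
            // n1/n2 stay null when the value is not a valid number
            BigDecimal n1 = null, n2 = null;
            try { n1 = new BigDecimal(v1); } catch (@SuppressWarnings("unused") NumberFormatException unused) {/**/}
            try { n2 = new BigDecimal(v2); } catch (@SuppressWarnings("unused") NumberFormatException unused) {/**/}
            // string vs. string: collator; number vs. string: number first; number vs. number: numeric
            int cmp = (n1 == null ? (n2 == null ? this.collator.compare(v1, v2) : 1) : (n2 == null ? -1 : n1.compareTo(n2)));
            if (cmp != 0)
                return cmp;
        }
    }
}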

Test

String input = "-2.2\t2\t3\t4\t329\t2\n" +
               "2.2\t12345q\t69\t-afg\n" +
               "2.2\t12345q\t69\t-asdf\n" +
               "-22\t1234234\tasdfasf\tasdgas\n" +
               "-22\t11\tabc\n" +
               "-22\t-3\t4\n" +
               "-1.1\n" +
               "qqqq\t1.1";
List<List<String>> data = new ArrayList<>();
try (BufferedReader in = new BufferedReader(new StringReader(input))) {
    for (String line; (line = in.readLine()) != null; )
        data.add(Arrays.asList(line.split("\t")));
}
data.sort(new NumberStringComparator());
data.forEach(System.out::println);

Output

[-22, -3, 4]
[-22, 11, abc]
[-22, 1234234, asdfasf, asdgas]
[-2.2, 2, 3, 4, 329, 2]
[-1.1]
[2.2, 12345q, 69, -afg]
[2.2, 12345q, 69, -asdf]
[qqqq, 1.1]
Andreas
  • Nice answer. Great mix of explanations and working code. – GhostCat Jun 01 '17 at 18:01
  • Thanks, this is the sort logic I actually needed, and your code is working great, but there is a small problem: if I try to use it with data read from a file, it still sorts like this: `-1.1 -2.2 2 3 4 329 2 -22 -3 4 -22 11 abc -22 1234234 asdfasf asdgas 2.2 12345q 69 -afg 2.2 12345q 69 -asdf qqqq 1.1 ` – Stonecold Jun 01 '17 at 23:49
  • @Stonecold Can't see what you meant by that comment, but perhaps your text is not tab-separated? If the text is space-separated, then the entire row is one value, which is not a number, and you basically just do a plain row sort, not a sort by (numeric) columns. – Andreas Jun 02 '17 at 00:07
  • Maybe it doesn't show correctly in comments, but it's tab-separated, and the sort result is the same as I described in my first post (if I use data from the file; if I'm using your example, everything is fine with sorting and printing, but the issue is that I need to sort data from a file). – Stonecold Jun 02 '17 at 00:28
  • @Stonecold I don't know how the content of your file differs from the content of the `input` string given in the example, so it's impossible to say what you're doing wrong. But even your question text doesn't have tabs between values, it has spaces, so check your text file again. – Andreas Jun 02 '17 at 00:37
  • @Andreas thank you for your help, issue was that my file was space separated, after editing it to tab separated everything works just great – Stonecold Jun 02 '17 at 08:47
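
For anyone hitting the same symptom, a quick way to check the separator is to look at how many columns `split("\t")` actually produces. This is only a sketch: the class and method names are hypothetical, and the whitespace-splitting variant is an assumption that only makes sense if values never contain spaces themselves.

```java
import java.util.Arrays;
import java.util.List;

public class DelimiterCheck {
    // Splits a row on tabs only, as the reader in the question does.
    static List<String> splitOnTab(String line) {
        return Arrays.asList(line.split("\t"));
    }

    // Hypothetical lenient variant: splits on any run of whitespace (spaces or tabs).
    static List<String> splitOnWhitespace(String line) {
        return Arrays.asList(line.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        String spaced = "-22 11 abc";
        String tabbed = "-22\t11\tabc";
        System.out.println(splitOnTab(spaced).size());        // 1: the whole row is one value
        System.out.println(splitOnTab(tabbed).size());        // 3
        System.out.println(splitOnWhitespace(spaced).size()); // 3
    }
}
```

If every row comes back with a single column, the comparator falls back to comparing whole rows as strings, which matches the output described in the question.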