How can I sort Apache source archive names (strings) by version number? I tried the code below, which uses Jsoup, but it does not return the expected result. How can I solve this problem?
    public static void getApacheArchives() throws IOException {
        String url = "https://archive.apache.org/dist/httpd/"; // or whatever goes here
        Document document = Jsoup.connect(url)
                .followRedirects(false)
                .timeout(60000 /* wait up to 60 sec for response */)
                .get();
        Elements anchors = document.body().getAllElements().select("a");
        Collections.sort(anchors, new Comparator<Element>() {
            @Override
            public int compare(Element e1, Element e2) {
                return e1.text().compareTo(e2.text());
            }
        });
        for (int i = 0; i < anchors.size(); i++) {
            Element a = anchors.get(i);
            if (a.text().matches("(apache_)[1].[0-9].[0-9]{1,2}.(tar.gz)")
                    || a.text().matches("(httpd-)[0-9]{1,2}.[0-9]{1,2}.[0-9]{1,2}.(tar.gz)")) {
                System.out.println(a.text());
            }
        }
    }
This code returns the following result:
...
httpd-2.3.6.tar.gz
httpd-2.3.8.tar.gz
httpd-2.4.1.tar.gz
httpd-2.4.10.tar.gz
httpd-2.4.12.tar.gz
httpd-2.4.16.tar.gz
httpd-2.4.17.tar.gz
httpd-2.4.18.tar.gz
httpd-2.4.2.tar.gz
httpd-2.4.20.tar.gz
httpd-2.4.3.tar.gz
httpd-2.4.4.tar.gz
httpd-2.4.6.tar.gz
httpd-2.4.7.tar.gz
httpd-2.4.9.tar.gz
...
But the expected result is:
...
httpd-2.3.6.tar.gz
httpd-2.3.8.tar.gz
httpd-2.4.1.tar.gz
httpd-2.4.2.tar.gz
httpd-2.4.3.tar.gz
httpd-2.4.4.tar.gz
httpd-2.4.6.tar.gz
httpd-2.4.7.tar.gz
httpd-2.4.9.tar.gz
httpd-2.4.10.tar.gz
httpd-2.4.12.tar.gz
httpd-2.4.16.tar.gz
httpd-2.4.17.tar.gz
httpd-2.4.18.tar.gz
httpd-2.4.20.tar.gz
...
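
For reference, the difference between the two listings is that `String.compareTo` compares character by character, so "10" sorts before "2". One possible direction is to compare the numeric version components instead. The following is only a minimal sketch under stated assumptions: the class name `ArchiveNameSort`, the helper `versionOf`, and the comparator `BY_VERSION` are hypothetical names introduced for illustration, and the pattern assumes names of the form `apache_X.Y.Z.tar.gz` or `httpd-X.Y.Z.tar.gz` (with the dots escaped, unlike in the code above).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of Jsoup or the JDK.
class ArchiveNameSort {

    // Assumed name format: "apache_1.3.9.tar.gz" or "httpd-2.4.10.tar.gz".
    private static final Pattern NAME =
            Pattern.compile("(?:apache_|httpd-)(\\d+)\\.(\\d+)\\.(\\d+)\\.tar\\.gz");

    // Extracts {major, minor, patch} as integers from an archive name.
    static int[] versionOf(String name) {
        Matcher m = NAME.matcher(name);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected archive name: " + name);
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    // Compares version components numerically, so 2.4.2 sorts before 2.4.10.
    // Falls back to plain string order when the versions are equal.
    static final Comparator<String> BY_VERSION = (a, b) -> {
        int[] va = versionOf(a);
        int[] vb = versionOf(b);
        for (int i = 0; i < va.length; i++) {
            int c = Integer.compare(va[i], vb[i]);
            if (c != 0) {
                return c;
            }
        }
        return a.compareTo(b);
    };

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList(
                "httpd-2.4.10.tar.gz",
                "httpd-2.4.2.tar.gz",
                "httpd-2.3.8.tar.gz",
                "httpd-2.4.1.tar.gz"));
        names.sort(BY_VERSION);
        for (String n : names) {
            System.out.println(n);
        }
    }
}
```

Because `versionOf` rejects names that do not match the pattern, the anchor texts would need to be filtered first (for example, collected into a `List<String>` inside the existing `if`) and sorted afterwards, rather than sorting all anchors before filtering.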