
I have a method that takes a list of strings as an argument and reads the words in it. It then sorts the words by frequency; words with the same frequency are printed alphabetically. (Note that there are also Russian words, and they always go below the English words.)

Here is an example of a good output:

лицами-18
Apex-15
azet-15
xder-15
анатолю-15
андреевич-15
батальона-15
hello-13
zello-13
полноте-13

Here is my code:

import java.util.*;
import java.util.stream.Collectors;

public class Words {

public String countWords(List<String> lines) {

    StringBuilder input = new StringBuilder();
    StringBuilder answer = new StringBuilder();

    for (String line : lines){
        if(line.length() > 3){
            if(line.substring(line.length() - 1).matches("[.?!,]+")){
                input.append(line.substring(0,line.length()-1)).append(" ");
            }else{
                input.append(line).append(" ");
            }
        }
    }

    String[] strings = input.toString().split("\\s");

    List<String> list = new ArrayList<>(Arrays.asList(strings));

    Map<String, Integer> unsortMap = new HashMap<>();
    while (list.size() != 0){
        String word = list.get(0);
        int freq = Collections.frequency(list, word);
        if (word.length() >= 4 && freq >= 10){
            unsortMap.put(word.toLowerCase(), freq);
        }

        list.removeAll(Collections.singleton(word));
    }
    //The Stream logic is here
    List<String> sortedEntries = unsortMap.entrySet().stream()
            .sorted(Comparator.comparingLong(Map.Entry<String, Integer>::getValue)
                    .reversed()
                    .thenComparing(Map.Entry::getKey)
            )
            .map(it -> it.getKey() + " - " + it.getValue())
            .collect(Collectors.toList());
    
    //Logic ends here

    for (int i = 0; i < sortedEntries.size(); i++) {
        if(i<sortedEntries.size()-1) {
            answer.append(sortedEntries.get(i)).append("\n");
        }
        else{
            answer.append(sortedEntries.get(i));
        }
    }

    return answer.toString();

 }
}

My issue: The code currently works and gives correct results, but as you can see, I am using streams to sort the strings. I would like to know whether there is another way to write it. More precisely, is there a way to sort strings by frequency, and then alphabetically when they have the same frequency, without using streams?

1 Answer

Anything you can do in streams you can do in conventional Java. But using streams usually makes for much shorter, simpler, and easier-to-read code!

By the way, the first half of your code could be replaced with simply this:

Map < String, AtomicInteger > map = new HashMap <>();
for ( String word : words ) {
    map.putIfAbsent( word , new AtomicInteger( 0 ) );
    map.get( word ).incrementAndGet();
}
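
As a variation on the counting loop above, here is a minimal sketch that uses plain `Integer` values with `Map.merge` instead of `AtomicInteger`. The class name `WordCounter` and the method name `count` are made up for this illustration; they are not part of the Question's code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, named only for this sketch.
class WordCounter {
    // Counts how many times each word occurs in the given list.
    static Map<String, Integer> count(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            // merge() stores 1 for a new key, otherwise adds 1 to the existing count.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("hello", "zello", "hello", "полноте");
        // HashMap iteration order is unspecified, so the printed order may vary.
        System.out.println(count(words));
    }
}
```

Either approach should give the same counts; `merge` just avoids the separate `putIfAbsent` and `get` calls.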

The second half of your code is reporting on a map by sorting first on value, then on key.

That challenge is discussed in the Questions "Sorting a HashMap based on Value then Key?" and "Sort a Map<Key, Value> by values". There are some clever solutions among those Answers, such as this one by Sean.
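
For illustration, one stream-free way to sort on value then key is to copy the map's entries into a list and sort that list with a `Comparator`. This is only a sketch, not the code from those Answers; it assumes a `Map<String, Integer>` of counts, and the names `EntrySorter` and `sortByCountThenWord` are invented here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, named only for this sketch.
class EntrySorter {
    // Returns the map's entries ordered by count descending, then by word ascending.
    static List<Map.Entry<String, Integer>> sortByCountThenWord(Map<String, Integer> counts) {
        // Copy the entries so the original map is left untouched.
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(
                Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.comparingByKey())
        );
        return entries;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("zello", 13);
        counts.put("Apex", 15);
        counts.put("hello", 13);
        counts.put("лицами", 18);
        System.out.println(sortByCountThenWord(counts));
    }
}
```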

But I would rather keep things simple. I would translate the map of words and word counts into objects of our own custom class, each object holding the word and its count as fields.

Java 16+ brings the records feature, making such a custom class definition much easier. A record is a briefer way to write a class whose main purpose is to communicate data transparently and immutably. The compiler implicitly creates the constructor, getters, equals & hashCode, and toString.

record WordAndCount (String word , int count ) {}

Before Java 16, use a conventional class in place of that record. Here is the 33-line source-code equivalent of that record one-liner.

final class WordAndCount {
    private final String word;
    private final int count;

    WordAndCount ( String word , int count ) {
        this.word = word;
        this.count = count;
    }

    public String word () { return word; }

    public int count () { return count; }

    @Override
    public boolean equals ( Object obj ) {
        if ( obj == this ) return true;
        if ( obj == null || obj.getClass() != this.getClass() ) return false;
        var that = ( WordAndCount ) obj;
        return Objects.equals( this.word , that.word ) && this.count == that.count;
    }

    @Override
    public int hashCode () {
        return Objects.hash( word , count );
    }

    @Override
    public String toString () {
        return "WordAndCount[" + "word=" + word + ", " + "count=" + count + ']';
    }
}

We make a list of objects of that record type, and populate it.

List<WordAndCount> wordAndCounts = new ArrayList <>(map.size()) ;
for ( String word : map.keySet() ) {
    wordAndCounts.add( new WordAndCount( word, map.get( word ).get() ) );
}

Now sort. The Comparator interface has some handy factory methods where we can pass a method reference.

wordAndCounts.sort(
        Comparator
                .comparingInt( WordAndCount ::count )
                .reversed()
                .thenComparing( WordAndCount ::word )
);
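
If you would rather avoid method references and lambdas too, the same ordering can be written as an anonymous `Comparator` class. The sketch below is self-contained, so it declares its own minimal stand-in for `WordAndCount` as a conventional class; the names `PlainComparatorDemo` and `BY_COUNT_DESC_THEN_WORD` are invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class, named only for this sketch.
class PlainComparatorDemo {
    // Minimal stand-in for the WordAndCount type shown above (a conventional class, so no Java 16 needed).
    static final class WordAndCount {
        private final String word;
        private final int count;

        WordAndCount(String word, int count) {
            this.word = word;
            this.count = count;
        }

        String word() { return word; }

        int count() { return count; }

        @Override
        public String toString() { return word + "=" + count; }
    }

    // Anonymous class: higher count first, ties broken by the word's natural String order.
    static final Comparator<WordAndCount> BY_COUNT_DESC_THEN_WORD = new Comparator<WordAndCount>() {
        @Override
        public int compare(WordAndCount a, WordAndCount b) {
            // Arguments are swapped so that larger counts sort first.
            int byCount = Integer.compare(b.count(), a.count());
            if (byCount != 0) {
                return byCount;
            }
            return a.word().compareTo(b.word());
        }
    };

    public static void main(String[] args) {
        List<WordAndCount> list = new ArrayList<>(Arrays.asList(
                new WordAndCount("zello", 13),
                new WordAndCount("анатолю", 15),
                new WordAndCount("Apex", 15),
                new WordAndCount("hello", 13)
        ));
        list.sort(BY_COUNT_DESC_THEN_WORD);
        System.out.println(list);
    }
}
```

Note that `String.compareTo` compares UTF-16 code units. Uppercase Latin letters therefore sort before lowercase ones ("Apex" before "azet"), and Cyrillic letters sort after both, which matches the ordering asked for in the Question.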

Let’s pull all that code together.

package work.basil.text;

import java.util.*;
import java.util.concurrent.atomic.AtomicInteger;

public class EngRus {
    public static void main ( String[] args ) {
        // Populate input data.
        List < String > words = EngRus.generateText(); // Recreate the original data seen in the Question.
        System.out.println( "words = " + words );

        // Count words in the input list.
        Map < String, AtomicInteger > map = new HashMap <>();
        for ( String word : words ) {
            map.putIfAbsent( word , new AtomicInteger( 0 ) );
            map.get( word ).incrementAndGet();
        }
        System.out.println( "map = " + map );

        // Report on word count, sorting first by word-count numerically and then by word alphabetically.
        record WordAndCount( String word , int count ) { }
        List < WordAndCount > wordAndCounts = new ArrayList <>( map.size() );
        for ( String word : map.keySet() ) {
            wordAndCounts.add( new WordAndCount( word , map.get( word ).get() ) );
        }
        wordAndCounts.sort( Comparator.comparingInt( WordAndCount :: count ).reversed().thenComparing( WordAndCount :: word ) );
        System.out.println( "wordAndCounts = " + wordAndCounts );
    }

    public static List < String > generateText () {
        String input = """
                лицами-18
                Apex-15
                azet-15
                xder-15
                анатолю-15
                андреевич-15
                батальона-15
                hello-13
                zello-13
                полноте-13
                """;

        List < String > words = new ArrayList <>();
        input.lines().forEach( line -> {
            String[] parts = line.split( "-" );
            for ( int i = 0 ; i < Integer.parseInt( parts[ 1 ] ) ; i++ ) {
                words.add( parts[ 0 ] );
            }
        } );
        Collections.shuffle( words );
        return words;
    }
}

When run:

words = [андреевич, hello, xder, батальона, лицами, полноте, анатолю, лицами, полноте, полноте, анатолю, анатолю, zello, hello, лицами, xder, батальона, Apex, xder, андреевич, анатолю, hello, xder, Apex, xder, андреевич, лицами, zello, полноте, лицами, Apex, батальона, zello, полноте, xder, hello, azet, батальона, zello, hello, полноте, Apex, полноте, полноте, azet, андреевич, полноте, Apex, анатолю, hello, azet, лицами, анатолю, zello, анатолю, Apex, zello, андреевич, лицами, xder, hello, полноте, zello, Apex, батальона, лицами, hello, azet, Apex, анатолю, анатолю, zello, полноте, анатолю, Apex, батальона, андреевич, лицами, андреевич, azet, azet, лицами, лицами, zello, azet, анатолю, xder, батальона, полноте, лицами, hello, лицами, xder, xder, лицами, zello, андреевич, батальона, лицами, андреевич, azet, полноте, hello, андреевич, лицами, hello, Apex, батальона, hello, azet, лицами, zello, батальона, анатолю, Apex, azet, xder, андреевич, андреевич, батальона, анатолю, батальона, Apex, xder, azet, azet, xder, azet, анатолю, Apex, батальона, Apex, Apex, лицами, батальона, xder, батальона, hello, андреевич, андреевич, azet, zello, андреевич, xder, azet, анатолю, zello]

map = {андреевич=15, xder=15, zello=13, батальона=15, azet=15, лицами=18, анатолю=15, hello=13, Apex=15, полноте=13}

wordAndCounts = [WordAndCount[word=лицами, count=18], WordAndCount[word=Apex, count=15], WordAndCount[word=azet, count=15], WordAndCount[word=xder, count=15], WordAndCount[word=анатолю, count=15], WordAndCount[word=андреевич, count=15], WordAndCount[word=батальона, count=15], WordAndCount[word=hello, count=13], WordAndCount[word=zello, count=13], WordAndCount[word=полноте, count=13]]
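
The Question's method returns a single string with one "word - count" line per entry. As a rough sketch of that last step without streams, the loop below joins already-sorted entries with newlines and no trailing newline. It assumes the entries arrive as a sorted `List<Map.Entry<String, Integer>>`; the names `ReportFormatter` and `format` are invented here.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, named only for this sketch.
class ReportFormatter {
    // Joins "word - count" lines with newlines, without a trailing newline.
    static String format(List<Map.Entry<String, Integer>> sortedEntries) {
        StringBuilder answer = new StringBuilder();
        for (Map.Entry<String, Integer> entry : sortedEntries) {
            // Add the separator before every line except the first.
            if (answer.length() > 0) {
                answer.append('\n');
            }
            answer.append(entry.getKey()).append(" - ").append(entry.getValue());
        }
        return answer.toString();
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>();
        entries.add(new AbstractMap.SimpleEntry<>("лицами", 18));
        entries.add(new AbstractMap.SimpleEntry<>("Apex", 15));
        entries.add(new AbstractMap.SimpleEntry<>("hello", 13));
        System.out.println(format(entries));
    }
}
```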

Basil Bourque
  • IntelliJ IDEA is unable to see records; it says it cannot resolve symbol record –  Aug 05 '21 at 08:05
  • @Baron As I said, and as the Java JEP 395 I linked says, Java 16 and later. – Basil Bourque Aug 05 '21 at 08:07
  • Yes, but I have Java 16 installed –  Aug 05 '21 at 08:08
  • @Baron Records are a finalized feature in Java 16. Presumably you have not configured your project to compile for Java 16. If using Maven, check your POM for the `` (or older `` & ``). And then check your IntelliJ settings. Unfortunately there are a mess of such settings in IntelliJ. You must track down each one in a labyrinth of dialog boxes. [Search Stack Overflow](https://duckduckgo.com/?q=%2Bintellij+settings+java+version+compiler+target+site%3Astackoverflow.com&t=osx&ia=web) to get help on those settings. – Basil Bourque Aug 05 '21 at 08:15
  • Is there another way of implementing the same logic without records? I am not quite used to records and they feel kind of strange –  Aug 05 '21 at 08:16
  • @Baron Sure, a record is just a shortcut for a regular class. See my edit showing such a class. That one line turns into 33! (which is why you may learn to love records) IDEs such as IntelliJ can convert a record to a conventional class. Tip: That's a great way to construct a mutable or more complicated class, start with a `record` holding all your fields, then have your IDE convert, thereby saving you a ton of typing and code-generation. – Basil Bourque Aug 05 '21 at 08:25
  • I would really appreciate if you could show me the same code without using record –  Aug 05 '21 at 08:42
  • @Baron I did. See my edit adding "Before Java 16, use a conventional class in place of that record one-liner. Here is the source-code equivalent of that record." followed by `final class WordAndCount {`… – Basil Bourque Aug 05 '21 at 08:46
  • Yes, but I don't really understand where I should place the WordAndCount class in the EngRus class –  Aug 05 '21 at 08:51
  • @Baron You can either exactly replace the one-liner `record` with those 33 lines, or you can declare that class separately. If you do not understand how to write and place class definitions, step back and study the Oracle tutorials (free of cost). – Basil Bourque Aug 05 '21 at 08:53
  • It works fine. I have another question: is it possible to avoid lambdas as well? I mean, is there a way to write the code without lambdas and streams? Can we somehow change the line where the lambda logic is written? –  Aug 05 '21 at 13:29
  • @Baron - I suggest you keep this post focused on one problem and post another question regarding your additional queries. – Arvind Kumar Avinash Aug 05 '21 at 17:36
  • @Baron I presume you are referring to the method references in the `Comparator`. Yes, you can replace lambdas and method references with longer conventional Java code. [Search Stack Overflow](https://duckduckgo.com/?q=site%3Astackoverflow.com+java+Comparator&t=iphone&ia=web) to see many such examples. And [Avinash is correct](https://stackoverflow.com/users/10819573/arvind-kumar-avinash), you should post additional narrowly-focused Questions with specific small code example rather than drag out a thread of Comments. – Basil Bourque Aug 05 '21 at 17:38