
I've been trying to write the code for this, but I still can't get it to work. This code segment is the closest I have come. What am I missing? I am trying to do this without using hashing.

    // Read all the words from the dictionary (text.txt) into an array
    BufferedReader br = new BufferedReader(new FileReader("text.txt"));
    int bufferLength = 1000000;
    char[] buffer = new char[bufferLength];
    int charsRead = br.read(buffer, 0, bufferLength);
    br.close();
    String text = new String(buffer);
    text = text.trim();
    text = text.toLowerCase();
    String[] words = text.split("\n");

    System.out.println("Total number of words in text: " + words.length);

    //Find unique words:
    String[] uniqueText = words;
    int[] uniqueTextCount = new int[uniqueText.length];

    for (int i = 0; i < words.length; i++) {
        for (int j = 0; j < uniqueText.length; j++) {
            if (words[i].equals(uniqueText[j])) {
                uniqueTextCount[j]++;
            } else {
                uniqueText[i] = words[i];
            }
        }
        System.out.println(uniqueText[i] + " for " + uniqueTextCount[i]);
    }
}
Emz
Han
  • Sorry about the format! I'm new to Stack Overflow and programming. – Han Nov 26 '15 at 22:30
  • Please copy and paste the code here instead of posting an image. That makes it easier for users here to try the code themselves. – Emz Nov 26 '15 at 22:34
  • String[] uniqueText = words; int[] uniqueTextCount = new int[uniqueText.length]; for (int i = 0; i < words.length; i++) { for (int j = 0; j < uniqueText.length; j++) { if (words[i].equals(uniqueText[j])) { uniqueTextCount[j]++; } else { uniqueText[i] = words[i]; } } System.out.println(uniqueText[i] + " for " + uniqueTextCount[i]); } – Han Nov 26 '15 at 22:38
  • Edit your original post with that, instead of the image. – Emz Nov 26 '15 at 22:39
  • I've tried, but I kept getting an error message. – Han Nov 26 '15 at 22:39
  • You also have to split on whitespace, no? – Emz Nov 26 '15 at 22:43
  • You can check http://stackoverflow.com/questions/26958118/finding-unique-numbers-from-sorted-array-in-less-than-on if you are not using a HashMap. – Ashish Patil Nov 27 '15 at 01:57

2 Answers

From your original code I am assuming that:

  • text.txt contains a single word in each line.
  • You want to count the number of times each word occurs (rather than "similar words" as you write in the title).

The first thing to note is that BufferedReader supports line-by-line reading through readLine(), which returns null at the end of the file:

for (String line; (line = br.readLine()) != null; ) {
  // Process each line, which in this case is a word.
}

Processing the file line by line is preferable to reading the whole file at once: reading everything into a buffer costs as much memory as the size of the file, whereas line-by-line reading only needs to hold one line at a time plus the results.

Now if we think about the requirement, the desired output is a map from distinct words to their counts. The map should be declared before the for-loop above.

// A HashMap would also work, but you have specified that you do not want
// to use hashing.
Map<String, Integer> distinctWordCounts = new TreeMap<>();

With the map initialized this way, each iteration of the loop (i.e., each line we read) can do the following:

if (distinctWordCounts.containsKey(line)) {
  // We have seen this word before. Increment its count.
  distinctWordCounts.put(line, distinctWordCounts.get(line) + 1);
} else {
  // We have never seen this word. Set its count to 1.
  distinctWordCounts.put(line, 1);
}

The above code incurs a little more overhead than necessary, because the if branch traverses the map three times (the existence check, the get, and the put) where a single traversal would do. But this is probably a story for another day, unless you have reasons to concern yourself with non-asymptotic speed improvements.
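
For reference, one way to collapse those traversals into a single call, assuming Java 8 or later, is Map.merge. This is only a sketch: the class and method names (MergeCount, countWords) and the sample words are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class; the name is not from the question.
class MergeCount {
    // Counts how often each word occurs, using one map operation per word.
    static Map<String, Integer> countWords(Iterable<String> words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : words) {
            // Stores 1 if the word is absent, otherwise adds 1 to the stored count.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("b", "a", "b", "c", "b");
        System.out.println(countWords(sample)); // prints {a=1, b=3, c=1}
    }
}
```

The behaviour is the same as the if/else version; only the number of map operations per word changes.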

Finally, we can iterate over distinctWordCounts to print the count for each word:

for (Map.Entry<String, Integer> entry : distinctWordCounts.entrySet()) {
  System.out.println(entry.getKey() + " occurs " + entry.getValue() + " times.");
}
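
Putting the pieces together, a minimal self-contained sketch might look like the following. It assumes, as above, one word per line, and it lower-cases and trims each line as the question's code does. The class name WordCounter and the method countLines are hypothetical, and taking a Reader as the parameter is just a choice that makes the method easy to try without a file.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class name, used only for this sketch.
class WordCounter {
    // Reads one word per line and returns word -> count, sorted by word.
    static Map<String, Integer> countLines(Reader source) throws IOException {
        Map<String, Integer> distinctWordCounts = new TreeMap<>();
        // try-with-resources closes the reader even if an exception is thrown.
        try (BufferedReader br = new BufferedReader(source)) {
            for (String line; (line = br.readLine()) != null; ) {
                String word = line.trim().toLowerCase();
                if (word.isEmpty()) {
                    continue; // skip blank lines
                }
                if (distinctWordCounts.containsKey(word)) {
                    distinctWordCounts.put(word, distinctWordCounts.get(word) + 1);
                } else {
                    distinctWordCounts.put(word, 1);
                }
            }
        }
        return distinctWordCounts;
    }

    public static void main(String[] args) throws IOException {
        // "text.txt" is the file name used in the question.
        Map<String, Integer> counts = countLines(new FileReader("text.txt"));
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            System.out.println(entry.getKey() + " occurs " + entry.getValue() + " times.");
        }
    }
}
```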
Jae Heon Lee

It sounds like you're just trying to count the number of occurrences of each distinct word? If that's the case, you could do something like this:

String[] array = {"a", "a", "b", "c", "c", "c", "d", "e", "f", "f"};
Map<String, Long> map = new HashMap<>();

Stream.of(array)
      .distinct()
      .forEach(s -> map.put(s, 
          Stream.of(array)
                .filter(s::equals)
                .count()));

If you just want the unique words:

String[] unique = Stream.of(array)
                        .distinct()
                        .toArray(String[]::new);
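
If the "without Hash" requirement rules out HashMap, one possible alternative is to sort a copy of the array and count runs of equal neighbours, using only arrays. This is a rough sketch, not necessarily what the asker had in mind; SortedCount and countRuns are made-up names, and the "word for count" output format is borrowed from the question's println.

```java
import java.util.Arrays;

// Hypothetical class name, used only for this sketch.
class SortedCount {
    // Returns one "word for count" string per distinct word, in sorted order.
    static String[] countRuns(String[] words) {
        String[] sorted = words.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);             // equal words become adjacent
        String[] result = new String[sorted.length];
        int distinct = 0;
        int i = 0;
        while (i < sorted.length) {
            int j = i;
            // Advance j past the run of words equal to sorted[i].
            while (j < sorted.length && sorted[j].equals(sorted[i])) {
                j++;
            }
            result[distinct++] = sorted[i] + " for " + (j - i);
            i = j;
        }
        // Trim the result to the number of distinct words found.
        return Arrays.copyOf(result, distinct);
    }

    public static void main(String[] args) {
        String[] array = {"a", "a", "b", "c", "c", "c", "d", "e", "f", "f"};
        for (String line : countRuns(array)) {
            System.out.println(line);
        }
    }
}
```

Sorting costs O(n log n) comparisons, after which a single pass over the sorted array produces every count.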