2

I have a method that receives its parameters as a vector taken from another vector. This vector can contain 2, 3 or 4 elements.

I want to count the frequency of every word in that vector. For example, if the vector contains the strings "hello", "my", "hello", I want to output the array [2, 1], where 2 is the frequency of "hello" and 1 is the frequency of "my".

Here is my attempt after reading a few questions on this website:

    int vector_length = query.size();
    int [] tf_q = new int [vector_length];
    int string_seen = 0;

    for (int p = 0; p< query.size(); p++)
    {
        String temp_var = query.get(p);

        for (int q = 0; q< query.size(); q++)
        {
            if (temp_var == query.get(q) )
            {
                if (string_seen == 0)
                {
                    tf_q[p]++;
                    string_seen++;
                }

                else if (string_seen == 1)
                {
                    tf_q[p]++;
                    string_seen = 0;
                    query.remove(p);
                }
            }
        }
    }

    System.out.print(Arrays.toString(tf_q));

What is the right direction to go?

Peter Mortensen
user3369038
  • It seems you do not know how to compare strings in Java. See http://stackoverflow.com/questions/513832/how-do-i-compare-strings-in-java – PM 77-1 Mar 08 '14 at 22:06
  • Thanks for the reply. By doing so, won't I be recreating another LinkedHashSet that looks just like the array tf_q? Sure, I'll work on my naming! – user3369038 Mar 08 '14 at 22:07
  • @user3369038 You should use a HashMap. I provided the implementation below. It associates keys (your unique words) with values (the counts of those unique words) – Brian Mar 08 '14 at 22:11
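The string comparison issue raised in the first comment can be illustrated with a minimal sketch. The class name `StringCompareDemo` and the helper `sameText` are hypothetical names used only for this illustration; the point is that `==` compares object references, while `equals` compares the characters.

```java
public class StringCompareDemo {
    // Hypothetical helper: true when both strings contain the same characters.
    static boolean sameText(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String first = "hello";
        // new String(...) creates a distinct object with the same characters
        String second = new String("hello");

        System.out.println(first == second);         // compares references: false
        System.out.println(sameText(first, second)); // compares contents: true
    }
}
```

In the question's inner loop, this would presumably mean writing `temp_var.equals(query.get(q))` rather than `temp_var == query.get(q)`.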

3 Answers

1

Use a HashMap of type <String, Integer> to track the unique string values you encounter and count each word:

    import java.util.HashMap;
    import java.util.Map;

    String[] vector = {"hello", "my", "hello"}; // your vector
    Map<String, Integer> stringMap = new HashMap<String, Integer>();

    for (int i = 0; i < vector.length; i++) {
        if (stringMap.containsKey(vector[i])) {
            // Word seen before: increment its count
            Integer wordCount = stringMap.get(vector[i]);
            stringMap.put(vector[i], wordCount + 1);
        }
        else {
            // First occurrence of this word
            stringMap.put(vector[i], 1);
        }
    }
Brian
  • With a set there is no need for the `contains` check. The set takes care of this for you. – Gene Mar 08 '14 at 22:11
  • @Gene Thanks for the information on the contains logic :) I had accidentally written a Set implementation before I read the question more carefully. I determined the OP is actually asking for something that a Map would better suit – Brian Mar 08 '14 at 22:13
  • No need to apologize. He changed the question! – Gene Mar 08 '14 at 22:13
  • Gene, I didn't change the question; somebody else edited it. @Brian, thanks a lot for the reply. Unfortunately it doesn't seem to work. I modified it as follows: `Map stringMap = new HashMap(); for (int i = 0; i < query.size(); i++) { if (stringMap.containsKey(query(i)) ) { Integer wordCount = stringMap.get(query(i)); wordCount = wordCount + 1; stringMap.put(query(i), wordCount); } else { stringMap.put(query(i), new Integer(1)); } } System.out.println(uniqueSet.size());` – user3369038 Mar 08 '14 at 22:32
  • The problem seems to be at this line : stringMap.put(query(i), wordCount); – user3369038 Mar 08 '14 at 22:34
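One possible cause of the problem described in the comments above, assuming `query` is a `Vector<String>` as in the question, is that elements are read with `query(i)` instead of `query.get(i)`. The following is a sketch under that assumption; the class name `QueryCount` and the method `countWords` are hypothetical, and the map is declared with explicit type parameters so that `get` returns an `Integer`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Vector;

public class QueryCount {
    // Hypothetical helper: counts how often each word occurs in the vector.
    static Map<String, Integer> countWords(Vector<String> query) {
        Map<String, Integer> stringMap = new HashMap<String, Integer>();
        for (int i = 0; i < query.size(); i++) {
            String word = query.get(i); // element access on a Vector uses get(i)
            if (stringMap.containsKey(word)) {
                stringMap.put(word, stringMap.get(word) + 1);
            } else {
                stringMap.put(word, 1);
            }
        }
        return stringMap;
    }

    public static void main(String[] args) {
        Vector<String> query = new Vector<String>();
        query.add("hello");
        query.add("my");
        query.add("hello");
        // HashMap does not guarantee any particular iteration order
        System.out.println(countWords(query));
    }
}
```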
0
    String[] input = {"Hello", "my", "Hello", "apple", "Hello"};
    // use hashmap to track the number of strings
    HashMap<String, Integer> map = new HashMap<String, Integer>();
    // use arraylist to track the sequence of the output
    ArrayList<String> list = new ArrayList<String>(); 
    for (String str : input){
        if(map.containsKey(str)){
            map.put(str, map.get(str)+1);
        } else{
            map.put(str, 1);
            list.add(str); // if the string never occurred before, add it to arraylist
        }
    }


    int[] output = new int[map.size()];
    int index = 0;
    for (String str : list){
        output[index] = map.get(str);
        index++;
    }

    for (int i : output){
        System.out.println(i);
    }

This should be your answer! The result is in the `int[] output` array.
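As a possible variation on the code above, a `LinkedHashMap` keeps keys in first-insertion order, so the separate `ArrayList` may not be needed. The sketch below assumes Java 8 or later (for `Map.merge`); the class name `OrderedFrequency` and the method `frequencies` are hypothetical names for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OrderedFrequency {
    // Hypothetical helper: returns the counts in order of first occurrence.
    static int[] frequencies(List<String> words) {
        // LinkedHashMap iterates in the order keys were first inserted
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : words) {
            // Insert 1 for a new word, otherwise add 1 to the existing count
            counts.merge(word, 1, Integer::sum);
        }

        int[] output = new int[counts.size()];
        int index = 0;
        for (int count : counts.values()) {
            output[index++] = count;
        }
        return output;
    }

    public static void main(String[] args) {
        int[] result = frequencies(Arrays.asList("hello", "my", "hello"));
        System.out.println(Arrays.toString(result)); // prints [2, 1]
    }
}
```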

Tommy.Z
0

If you want to maintain the relation between each word and the frequency of that word, then I suggest that you use a HashMap instead. For example:

    Map<String,Integer> histogram = new HashMap<String,Integer>();
    for (String word : query)
    {
        Integer count = histogram.get(word);
        if (count == null)
            histogram.put(word,1);
        else
            histogram.put(word,count+1);
    }

At this point, you can (for example) print each word with the corresponding frequency:

    for (String word : histogram.keySet())
        System.out.println(word+" "+histogram.get(word));

Or you can obtain an array which contains only the frequencies, if that's all you want:

    Integer[] array = histogram.values().toArray(new Integer[histogram.size()]);

Or even a collection, which is just as useful and convenient as any native array:

    Collection<Integer> collection = histogram.values();
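
The same histogram could also be built with the Java 8 streams API. This is only a sketch of an alternative technique, not the approach used in the answer: the class name `StreamHistogram` and the method `histogram` are hypothetical, the counts are `Long` rather than `Integer` because `Collectors.counting()` produces `Long` values, and a `LinkedHashMap` is requested so that words keep their first-seen order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class StreamHistogram {
    // Hypothetical helper: groups equal words together and counts each group.
    static Map<String, Long> histogram(List<String> query) {
        return query.stream().collect(
            Collectors.groupingBy(
                Function.identity(),     // the word itself is the key
                LinkedHashMap::new,      // keep first-seen order of the keys
                Collectors.counting())); // count the members of each group
    }

    public static void main(String[] args) {
        Map<String, Long> result = histogram(Arrays.asList("hello", "my", "hello"));
        System.out.println(result); // prints {hello=2, my=1}
    }
}
```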
barak manos