
I have a string, std::string str(s);, where s holds a list of words such as s = "one two three one one two..."

I want the number of occurrences of each word and, at the end, the word with the maximum occurrence count.

I have declared the occurrences type:

typedef std::unordered_map<std::string, int> occurrences;
occurrences s1;

and I want to load the contents of s into s1. How can I do it?

After that, here is the code to print the occurrences of each word, which has some mistake:

for (std::unordered_map<std::string, int>::iterator it = s1.begin();
                                                    it != s1.end();
                                                    ++it)
    {
        std::cout << "word :" << it->first << "occured   " << it->second <<  "  times \n";
    }

Can anyone tell me how I can get the occurrences of each word ("one", "two") here?

As requested, I am adding the original code here:

#include <string>
#include <iostream>
#include <unordered_map>

int main()
{
    typedef std::unordered_map<std::string,int> occurrences;
    occurrences s1;
    s1.insert(std::pair<std::string,int>("Hello",1));
    s1.insert(std::pair<std::string,int>("Hellos",2));

    for (std::unordered_map<std::string, int>::iterator it = s1.begin();it != s1.end();++it)
    {
        std::cout << "word :" << it->first << "occured   " << it->second <<  "  times \n";
    }

    return 0;
}

Improved code:

#include <iostream>
#include <string>
#include <sstream>
#include <vector>
#include <unordered_map>

int main()
{
    typedef std::unordered_map<std::string,int> occurrences;
    occurrences s1;

    // the string we're splitting.
    std::string s = "one two three one one two";

    int maxcount=0,temp=0;  
    std::vector<std::string> vestring;

    // create an input string stream
    std::istringstream iss(std::move(s));

    // now simply extract strings until you reach end-of-file.
    while (iss >> s)
    {
        temp=++s1[s];
        if(temp>=maxcount)
        {
            maxcount=temp;
            vestring.push_back(s);
        }
    }
    for (occurrences::const_iterator it = s1.cbegin();it != s1.cend(); ++it)
        std::cout << it->first << " : " << it->second << std::endl;

    return 0;
}
user123
  • Are you familiar with the [`std::istringstream`](http://en.cppreference.com/w/cpp/io/basic_istringstream) class? You may also find [`std::istream_iterator<>`](http://en.cppreference.com/w/cpp/iterator/istream_iterator) helpful. – WhozCraig Aug 23 '13 at 06:43
  • @WhozCraig: I did not know about it. I read about it, but did not understand how to use it in my case! – user123 Aug 23 '13 at 06:48

1 Answer


You have a good start, and you're using the right class for your counter, which is more than most people do. The mechanism you need is the ability to parse substrings out of a larger string, with whitespace as your separator. A std::istringstream will do this very nicely.

Sample

#include <iostream>
#include <string>
#include <vector>
#include <sstream>
#include <unordered_map>

int main()
{
    typedef std::unordered_map<std::string,int> occurrences;
    occurrences s1;

    // the string we're splitting.
    std::string s = "one two two three one one two";

    // create an input string stream. we use std::move() to give
    //  the implementation the best chance at simply reassigning
    //  the buffer from the string to the stream
    std::istringstream iss(std::move(s));

    // this will hold all occurrences of the strings with
    //  maximum count. each time a new maximum is found
    //  we clear the vector and push the new leader in.
    std::vector<std::string> most;
    int max_count = 0;

    // now simply extract strings until you reach end-of-file.
    while (iss >> s)
    {
        int tmp = ++s1[s];
        if (tmp == max_count)
        {
            most.push_back(s);
        }
        else if (tmp > max_count)
        {
            max_count = tmp;
            most.clear();
            most.push_back(s);
        }
    }

    // send our map to stdout.
    for (occurrences::const_iterator it = s1.cbegin();it != s1.cend(); ++it)
        std::cout << it->first << " : " << it->second << std::endl;

    // send the strings with max_count to stdout
    std::cout << std::endl << "Maximum Occurrences" << std::endl;
    for (std::vector<std::string>::const_iterator it = most.cbegin(); it != most.cend(); ++it)
        std::cout << *it << std::endl;

    return 0;
}

Output

three : 1
two : 3
one : 3

Maximum Occurrences
one
two

This should get you started. Finding the largest count should be no problem for you (hint: after each insertion your map has the current count of the word just processed).

There are even more efficient ways of doing this, but it's probably more than enough for what you are doing right now.

WhozCraig
  • Thanks, but why does it take so much time to compile? And can you please explain the three lines `std::istringstream iss(std::move(s)); while (iss >> s) ++s1[s];` and what exactly they do here? I looked at a tutorial, but I am still not clear! – user123 Aug 23 '13 at 10:04
  • Also, I could not get the logic for finding the word with the maximum occurrence! – user123 Aug 23 '13 at 10:39
  • @Karimkhan Well, you're accumulating a counter for each word. Expand that `while` loop to check if the counter value just incremented is greater than a `maxcount` value you keep outside the loop. If it is, just save the string in a string variable outside the loop and update `maxcount` to be the new maximum. When your loop is done, it will have the highest occurring string (unless there is a tie, in which case it will have the *first* one that reached that value, or the last one if you use `>=`). – WhozCraig Aug 23 '13 at 15:16
  • The first line is the constructor for the string stream. I don't need that string afterward, so `std::move()` is used to give it a fighting chance of move-construction. The `while()` loop evaluates the extraction from the stream to the string variable and, if it succeeds, takes the string extracted, uses it as a key into our map (inserting it if not already there), and increments the referenced `int` at that key. That's about it. – WhozCraig Aug 23 '13 at 16:12
  • OK, it was so easy. Using `>= maxcount` I can store all the strings that have the `maxcount` value in some other string array. In `std::string`, is there any built-in function or object that works as an array? – user123 Aug 23 '13 at 18:03
  • @Karimkhan Honestly, I didn't understand your question. If I were coding this I would keep a `std::vector` and a `maxcount` value. Each time I update the map I check `maxcount` against the just-updated counter. If they are *equal* push the string into the vector. If the just-updated count is *greater*, then `clear()` the vector, push the string just read into the vector, and update `maxcount` to the new maximum. The `clear()` is important. When this is done the vector will contain all strings that have the `maxcount` occurrences (which may be only one). – WhozCraig Aug 23 '13 at 18:09
  • You answered what I wanted! But which values would `maxcount` be compared with in the above code? – user123 Aug 23 '13 at 18:26
  • Think about it awhile. It will come. – WhozCraig Aug 23 '13 at 18:29
  • Can you please review the code I updated in your answer? Sorry, I made a mistake; I should have added it to my original question. – user123 Aug 23 '13 at 19:04
  • @Karimkhan I have a work-thing I need to do, but I'll come back and update the answer when I get a chance. – WhozCraig Aug 23 '13 at 19:07
  • OK, no issue. I added the code to my original question. Please review it. – user123 Aug 23 '13 at 19:10
  • Your update was very close. See the updated code in this answer, and think about how the vector is cleared and the new leader is pushed in whenever max_count is surpassed by the new leader. I added an extra "two" into the test data to demonstrate the results of having potentially more than one max-count string. – WhozCraig Aug 23 '13 at 20:10
  • Thanks a lot. Can you please suggest some references to clear up C++ concepts? Books only give an overview, so this sort of concept does not become clear! – user123 Aug 24 '13 at 05:36
  • @Karimkhan Trying to recommend a tutorial is difficult. Check out the [Definitive C++ Reading List](http://stackoverflow.com/questions/388242/the-definitive-c-book-guide-and-list) maintained on this site. And spend time reading other people's questions and answers you find interesting. Above all, never stop studying (I still learn something new almost every day). – WhozCraig Aug 24 '13 at 05:39
  • I tried to get your contact info from your profile, but no luck! I may contact you in the future! – user123 Aug 24 '13 at 05:46
  • @WhozCraig: It gives the most occurring word; can you please tell me how I can get the count value for it? Also, can you please tell me how to get the three most occurring words with their count values? – Catty Aug 31 '13 at 10:54
  • @WhozCraig: Is this the correct way to get the occurrences of the three highest? `for (std::map >::iterator hit = most.begin(); hit != most.end(); ++hit) { //std::cout << hit->first << ":"; for (std::vector::iterator vit = (*hit).second.begin(); vit != (*hit).second.end(); vit++){ std::cout << hit->first << ":"; std::cout << *vit << "\n"; } } ` – user123 Sep 02 '13 at 06:45