1

Let's say you are given a string "banana".

You want to find out how many times "ana" can be found in "banana". So, that number is 2.

string s = "banana";
int num = 0,pos = 0;
pos = s.find("ana");
while(pos!=string::npos) {
    num++;
    pos = s.find("ana",pos+1);
}
cout<<num<<endl;

The thing is, I want to write shorter code for this. What functions can be used ? I tried using search() but it's not what I wanted. Count() is only for characters.

Are there any other functions that can help me with doing this ? (Boost is not allowed on competitions, so not this one).

SomeName
  • 909
  • 2
  • 8
  • 15
  • This is about as good as you could ask for. I suppose a you could use a regex to remove a line of code or two but regexes can be pretty heavy. – NathanOliver May 14 '19 at 15:12
  • Competitions :) . what competitions? where? if you tell me I might not give you the answer but instead post it to the competitions for my self :) Also shorter code does not necessarily mean faster program! – AKL May 14 '19 at 15:14
  • Competitions are country-level for pupils :/ – SomeName May 14 '19 at 15:17
  • Shouldn't it be like```pos = s.find("ana", pos + 2);``` since "ana" is what you are looking for not "nan" ? – AKL May 14 '19 at 15:28
  • 1
    I would be surprised if a competition permitted contestants to obtain outside assistance. – Raymond Chen May 14 '19 at 16:13
  • Your question is already asked. Check this one out https://stackoverflow.com/questions/22489073/counting-the-number-of-occurrences-of-a-string-within-a-string – yara raffoul May 14 '19 at 17:00
  • I wanted to ask is there shorter way to do that ? – SomeName May 14 '19 at 17:29
  • I think the best way in terms of speed would be to use [Boyer–Moore string-search algorithm](https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string-search_algorithm). You search for the best way in terms of what? – KamilCuk May 15 '19 at 06:46
  • @SomeName I wrote a new answer. If it solved the problem please consider voting/accepting it! (https://stackoverflow.com/help/someone-answers) – AKL May 16 '19 at 09:33

4 Answers4

1

Assuming that finally you intend to also count any possible replication of a phrase, here are 3 generic function templates which work for almost any container or pointer/array , providing that they are given the right first/last iterator instead of container!

#include <iterator>
#include <algorithm>
#include <map>
#include <vector>

//finding the occurrence for an specific case within a range
template<typename rng_t__, typename dif_t__ = typename std::iterator_traits<rng_t__>::difference_type>
dif_t__ occurrence(const rng_t__ rng_fst_, const rng_t__ rng_lst_, const rng_t__ schd_fst_, const rng_t__ schd_lst){
    dif_t__ counter = 0;
    for (rng_t__ it = rng_fst_; (it = std::search(it, rng_lst_, schd_fst_, schd_lst))++ != rng_lst_; ++counter);
    return counter;
}
//finding the replications for all subsets with certain length within a range
template<typename rng_t__, typename dif_t__ = typename std::iterator_traits<rng_t__>::difference_type, typename val_t__ = typename std::iterator_traits<rng_t__>::value_type>
dif_t__ replications(const rng_t__ rng_fst_, const rng_t__ rng_lst_, dif_t__ lnt_){
    if(!lnt_ or lnt_ >= std::distance(rng_fst_, rng_lst_)) return 0;
    rng_t__ it_lst = rng_fst_;
    for (--lnt_; lnt_--; ++it_lst);
    std::map<std::vector<val_t__>, dif_t__> cases;
    for (rng_t__ it_fst = rng_fst_; it_lst++ != rng_lst_; ++it_fst){
        auto it_rslt_pair = cases.insert({{it_fst, it_lst}, 0});
        if(! it_rslt_pair.second) ++(it_rslt_pair.first->second);
    }
    dif_t__ counter = 0;
    for (const auto& a_case : cases) counter += a_case.second;
    return counter;
}
//finding the replications for all subsets with all possible lengths within a range
template<typename rng_t__, typename dif_t__ = typename std::iterator_traits<rng_t__>::difference_type, typename val_t__ = typename std::iterator_traits<rng_t__>::value_type>
dif_t__ replications(const rng_t__ rng_fst_, const rng_t__ rng_lst_){
    const dif_t__ rng_lnt = std::distance(rng_fst_, rng_lst_);
    dif_t__ counter = 0;
    for (dif_t__ a_lnt = 0; ++a_lnt < rng_lnt; counter += replications(rng_fst_, rng_lst_, a_lnt));
    return counter;
}

#include <string>

int main(int argc, char** argv) {

    std::string range = "banana", searched = "ana";

    std::cout<< "total occurrence for the ana" << std::endl;
    std::cout<< occurrence("banana", "banana" + 6, "ana", "ana" +3) << std::endl;
    std::cout<< occurrence(range.begin(), range.end(), searched.begin(), searched.end()) << std::endl;

    std::cout<< "total replications for every phrase from banana with length of 3" << std::endl;
    std::cout<< replications("banana", "banana" + 6, 3) << std::endl;
    std::cout<< replications(range.begin(), range.end(), 3) << std::endl;

    std::cout<< "total replications for every phrase from banana with every possible length" << std::endl;
    std::cout<< replications("banana", "banana" + 6) << std::endl;
    std::cout<< replications(range.begin(), range.end()) << std::endl;

    return 0;
}

possible output:

total occurrence for the ana
2
2
total replications for every phrase from banana with length of 3
1
1
total replications for every phrase from banana with every possible length
6
6

Good luck with the competition!

AKL
  • 1,367
  • 7
  • 20
0

I want to write shorter code for this

You want to use while (true) and break:

std::string s{"banana"};
std::string::size_type pos{0};
int num{0};
while (true) {
    pos = s.find("ana", pos);
    if (pos == std::string::npos) break;
    pos += 2; // next possible place for an "ana";
    ++num;
} 
std::cout << num << "\n";
Paul Evans
  • 27,315
  • 3
  • 37
  • 54
  • Is using ```while(true)``` and ```break``` really faster? – AKL May 14 '19 at 15:33
  • @AKL - only way to tell is inspect the assembly or measure the duration. – 2785528 May 14 '19 at 15:33
  • @2785528 Thank you for your wise, broad and typical answer! what I really meant was, why should it be faster? – AKL May 14 '19 at 16:03
  • @AKL OP didn't ask for faster code, OP asked for shorter code. This way uses only one line for `find`. – Paul Evans May 14 '19 at 16:26
  • why should it be faster? I don't know that it should. My experience is that your desktop (and mine) is a complex system. The size or nature of the "code change" has no relation to "performance change". The system is non-linear. – 2785528 May 14 '19 at 16:49
  • @2785528 what you are experiencing my friend, is CPU being warmed up during raising up their clock speed from different conditions. Also not to forget Operating systems change the hard-working core/s to avoid them being overheated. And what I meant was a significant change in speed at almost equal conditions! If one claims something is faster one should be able to explain it in term of logic or at least repeated experience! – AKL May 14 '19 at 20:27
0

You can use a for loop for this:

string s = "banana";
int num = 0;

for (auto pos = s.find("ana"); pos != string::npos; pos = s.find("ana",pos+1))
    num++;
cout << num << endl;

As you can see, I used auto as type for pos, so it will have the correct type std::string::size_type.

mch
  • 9,424
  • 2
  • 28
  • 42
  • Thank you for repeating my answer! Would you also be willing to take a down-vote from @AlanBirtles too? :) First _ mine is even shorter! (i am talking about the code). Second _ you spend half of the amount of letters that you saved by using auto with repeating pos in finding and equality check! – AKL May 14 '19 at 16:37
  • Also why is every body using post-increment ```num++``` instead of pre-increment ```++num``` which is faster and has less overhead? – AKL May 14 '19 at 16:41
  • @AKL In my opinion it is the best solution regarding the readablitily and the shortness of the code. Also using `num++;` or `++num;` here will end up in the same assembler code: https://godbolt.org/z/GLwgEj – mch May 14 '19 at 16:50
  • 1- since it is "readability" not "readablitily" I am not sure how good you can judge readability! 2- readability was not what the questioner asked for. 3-you guys are so obsessed with assembler and optimizer that became lazy. Although it might not matter in this case, nothing beats developing a good habit. 4- please go ask AlanBirtles to remove his down-vote:) – AKL May 14 '19 at 17:04
  • @AKL this isn't the same as your answer and doesn't have the same bug – Alan Birtles May 14 '19 at 17:41
0

Probably the simplest code without sacrificing readability is something like this:

std::string s = "banana";
int num = 0;
size_t pos = 0;

while ((pos = s.find("ana", pos)) != std::string::npos)
{
    num++;
    pos++;
}
std::cout << num << "\n";

or as a for loop:

std::string s = "banana";
int num = 0;
for (size_t pos = 0; (pos = s.find("ana", pos)) != std::string::npos; num++, pos++)
{
}
std::cout << num << "\n";
Alan Birtles
  • 32,622
  • 4
  • 31
  • 60
  • Dude you were so obsessed with testing the "anana" case that you wrote it for the loop case. Since it was under 6 characters I couldn't edit it for you! – AKL May 15 '19 at 08:58
  • And speaking of different cases, there is a possibility(no matter how much small) that ```std::string::npos``` is of an integer type of smaller/greater than the size_t and since it is usually filled with -1, your code can also contain an infinite loop if one of string::size_type or size_t is unsigned where one of them is smaller than int and the other is not. There is a reason why size_type is defined in string. Also should I start a Post on when to use post increment operators? – AKL May 15 '19 at 10:03
  • @AKL `std::string::size_type` is always `std:size_t`. Post increment operators are not generally more expensive for integer types, they only become (potentially) more expensive for structures like iterators, the compiler optimises away the difference – Alan Birtles May 15 '19 at 12:06
  • As far as my readings ```size_type``` in ```string``` is defined by the ```size_type``` of its ```Allocator``` and one can choose to use one other that default ```std::allocator```. But maybe I am wrong and not that I doubt you but only to be able to prove this to someone else if it ever happens, I would appreciate it if you could kindly provide me a reference for "std::string::size_type is always std:size_t". – AKL May 15 '19 at 12:36
  • I agree with you about post increment operations, it is just about developing a good habit, not having double standards and readability of the code. Because of reading not authentic materials, I was not even aware of pre increment operator for a long time! – AKL May 15 '19 at 12:36
  • `std::string`'s allocator is always `std::allocator` who's `size_type` is `std::size_t` – Alan Birtles May 15 '19 at 13:25