4

I have a string like the following:

{A}jahshs{b}jwuw{c}wuqjwhaha{d}{e}{f}jsj{g}

And I need to replace every {x} with a different string. The problem comes because this process will be repeated around 1000 times/second, so I need a optimized/fast way to do it.

Any idea? Boost replace? Boost format? Etc..

herohuyongtao
  • 49,413
  • 29
  • 133
  • 174
Pepeluis
  • 931
  • 2
  • 10
  • 18
  • 1
    `std::string::replace`, measure, and demonstrate that it's not fast enough? – Kerrek SB Mar 22 '14 at 00:02
  • But I should call replace for every {x} in the string, around 10. So 10x1000 replaces per second. – Pepeluis Mar 22 '14 at 00:07
  • Nothing will replace making tests and measurements on your end. There are so many variables. If you write up some code and it is still slower than you expect, we can at least look at your code and discuss. – Anthony Kong Mar 22 '14 at 00:13

1 Answers1

0
  1. preallocate all buffers

    ....

  2. profit

Oh, and don't spam. Sample code in 5 10 minutes.

Okay here goes: also Live On Coliru

#include <string>
#include <sstream>
#include <boost/utility/string_ref.hpp>

template <typename Range>
int expand(Range const& /*key*/)
{
    return rand()%42; // todo lookup value with key (be sure to stay lean here)
}

#include <iostream>
int main()
{
    static const std::string msg_template = "{A}jahshs{b}jwuw{c}wuqjwhaha{d}{e}{f}jsj{g}\n";

    std::ostringstream builder;
    builder.str().reserve(1024); // reserve ample room, not crucial since we reuse it anyways

    for (size_t iterations = 1ul << 14; iterations; --iterations)
    {
        builder.str("");
        std::ostreambuf_iterator<char> out(builder);

        for(auto f(msg_template.begin()), l(msg_template.end()); f != l;)
        {
            switch(*f)
            {
                case '{' : 
                    {
                        auto s = ++f;
                        size_t n = 0;

                        while (f!=l && *f != '}')
                            ++f, ++n;

                        // key is [s,f] now
                        builder << expand(boost::string_ref(&*s, n));

                        if (f!=l)
                            ++f; // skip '}'
                    }
                    break;
                default:
                    *out++ = *f++;
            }
        }
        // to make it slow, uncomment:
        // std::cout << builder.str();
    }
}

This runs in ~0.239s on my system. Thats about 68k expansions/second. Oops. In release build it does 4 million expansions/second. On Coliru it reaches almost 1 million expansions/second.

Room for improvement:

  • you can prevalidate the input
  • if you know the parameter keys are always 1 letter, then you can simply by replacing the string_ref with the char, and by not looping for the '}'.
  • you can precalculate the indicecs of arguments and jump ahead. The benefit here is not so certain (sequential memory access is very good on some processors and the naive approach may be faster)
sehe
  • 374,641
  • 47
  • 450
  • 633
  • Each { x } is a macro name, and the content to be replaced could vary, besides the main string could vary too. So, is a kind of macro expander. – Pepeluis Mar 22 '14 at 12:18
  • Yes? Did you run the code? Is there anything you can't figure out? Did you so the the `expand` function? As you'll note, it _"is a kind of macro expander"_. I made doubly sure you can return any thing that is IO streamable. Maybe this [2-line edit will help](http://coliru.stacked-crooked.com/a/becbdb80daa740b7)? – sehe Mar 22 '14 at 13:19
  • This can be made more efficient [easily by making `insert_expanded` take the builder stream](http://coliru.stacked-crooked.com/a/be82fad7b1442787). An example that mimicks mailmerge: http://coliru.stacked-crooked.com/a/cde6e91f3d898888 – sehe Mar 22 '14 at 13:19
  • Oh, and [with randomized templates](http://coliru.stacked-crooked.com/a/34c61482d6d76830) (_"besides the main string could vary too"_). Last but not least: **[here is an old answer](http://stackoverflow.com/a/9405546/85371)** of mine implementing a yet-more-versatile macro expansion system with Boost Spirit (namely, nested macros, escaped macro characters, recursive evaluation). Cheers – sehe Mar 22 '14 at 14:17
  • Fixed the benchmark. Now doing 4 Mio/s expansions on my box, [~1Mio/s on Coliru](http://coliru.stacked-crooked.com/a/d4b19b312b5a570f) – sehe Mar 22 '14 at 16:36
  • Thanks for all your help. Could you explain a bit that code from your last link? – Pepeluis Mar 22 '14 at 20:26
  • I'm afraid that would be a bit of a lengthy topic. The features are described in that answer and the test cases. The implementation leverages Boost Spirit as a PEG parser generator. I generate the output much as I do here, except I do so from within the semantic actions of the grammar. That solution will be quite a bit slower than the one presented here, but it's more versatile because using a parser generator makes it easier to adapt your template grammar for more complex behaviour. – sehe Mar 22 '14 at 20:29