-3

I tried with this way .. but it too slow ,

I need to clone vector contains millions of char * .`

vector <char * > A ; // contains more than 1 million item.
vector <char * > B;



char * buffer=NULL;
for(long i = 0; i < A.size(); i++)
{
        buffer = (char*) malloc (strlen(A[i])+1);
        strcpy(buffer,A[i]);
        buffer[strlen(A[i])+1]='\0';
        B.push_back(buffer);
}

there is way to clone vector faster than this ??

its very slow .. If I use

vector <char * > B=A; // when free A .. I have double free Exception (run time error ) ..

Thanks .

πάντα ῥεῖ
  • 1
  • 13
  • 116
  • 190
IbrahimAsad
  • 516
  • 1
  • 5
  • 16
  • 2
    Yes, it's too slow. There's nothing you can do about it. Maybe improve your algorithm so that you don't need to copy it at all. –  Jan 19 '14 at 10:21
  • Your 2nd example doesn't copy the strings, but leaves the original pointers, hence the double free exception. Your 1st form is correct. – πάντα ῥεῖ Jan 19 '14 at 10:21
  • should remove `C` tag.. – Grijesh Chauhan Jan 19 '14 at 10:22
  • 1
    @H2CO3, +1, also you may want to use strings so you can avoid strcpy() and malloc. – Netwave Jan 19 '14 at 10:22
  • @DanielSanchez Yes, of course. That cast before `malloc()` should be a red flag. –  Jan 19 '14 at 10:23
  • OH, I used the vector as member in class so when use copy constructor. so I have to clone it . .. – IbrahimAsad Jan 19 '14 at 10:24
  • @DanielSanchez strings in millions record take 400% of memory more than char * . – IbrahimAsad Jan 19 '14 at 10:26
  • @IbrahimAsad, well then if it needs to be fully effycient, just use the vector that contains the char* and dont copy it, why the need of copy it when you have stored the values?? – Netwave Jan 19 '14 at 10:31
  • @DanielSanchez That happens when I use copy constructor . – IbrahimAsad Jan 19 '14 at 10:35
  • 1
    @IbrahimAsad, consider alternate ways of avoiding the situation: use a std::shared_ptr>, adapt your algorithm to use process strings one by one, or reference-count the char* themselves (the copying then would only copy the pointer value). – utnapistim Jan 19 '14 at 10:35
  • 2
    The important question is: do you need a read-only copy or do you want to clone the data itself? – rubenvb Jan 19 '14 at 11:27
  • @rubenvb , I should do that because I clone entire object ... so also data must cloned .. – IbrahimAsad Jan 19 '14 at 11:53

2 Answers2

2

Assuming that you prefer a faster algorithm than saving space, you could try to implement a technique called reference counting / copy-on-write yourself. The idea it to avoid copying the string unless it is absolutely necessary (for example if an edit to the string is performed).

The topic (which would be too long to explain here) is described (with an old implementation) in Scott Meyer's "More effective C++" book. Specifically there's Item 30 (Reference Counting) that will go deep down in the construction of the string in question. The whole purpose of it is to enhance algorithms like yours:

The second motivation for reference counting is simple common sense. If many objects have the same value, it’s silly to store that value more than once. Instead, it’s better to let all the objects with that value share its representation. Doing so not only saves memory, it also leads to faster-running programs, because there’s no need to construct and de- struct redundant copies of the same object value.

For the implementation you can easily use std::shared_ptr (since C++11) or boost::shared_ptr (prior to C++11) which provides most of the mechanism you need to reference-count the string:

std::shared_ptr<char> string(new char[n], [](int *ptr) { delete[] ptr; });

If you needn't to modify the string, (but I highly doubt that, right?), you should declare them as const char* and avoid the reallocation.

Community
  • 1
  • 1
Shoe
  • 74,840
  • 36
  • 166
  • 272
  • Copy-on-write strings are not allowed in C++11, and were rare before. – rubenvb Jan 19 '14 at 10:42
  • @Jefffrey 21.4.1 '[The data of the copy constructed string] points at the first element of an allocated copy of the array whose first element is pointed at by [the source] str.data()' –  Jan 19 '14 at 10:50
  • [Reference](http://stackoverflow.com/questions/12199710/legality-of-cow-stdstring-implementation-in-c11). Oh, and why it's not a good idea to write something like this yourself? Because the Standard Library has generic tools that handle this for you, in the simplest form a smart pointer or `vector` (but that last one seems to be out of the question due to memory constraints) – rubenvb Jan 19 '14 at 10:50
  • Thread-safe Copy-on-write http://stackoverflow.com/questions/4309548/thread-safe-implementation-of-the-copy-on-write-cow-idiom might be useful here –  Jan 19 '14 at 10:58
  • @rubenvb, thanks. Could you please check if the current version of the answer is ok? – Shoe Jan 19 '14 at 11:01
  • @DieterLücking, I've also added it to the answer body. Thanks. – Shoe Jan 19 '14 at 11:06
  • @Jefffrey it's better `:-)`, but the question is underspecified. Too many unknowns to give a to the point answer `:-/` – rubenvb Jan 19 '14 at 11:21
  • @rubenvb, Hmm. Well, the concepts would be very hard to explain well here. I just thought to give a general direction towards the solution. Do you think I could improve it somehow? Should I delete it as it is too broad? – Shoe Jan 19 '14 at 11:24
  • 2
    @Jefffrey no, relax, it's a perfectly fine answer to the question in its current form. – rubenvb Jan 19 '14 at 11:26
0

You can change

for(long i = 0; i < A.size(); i++)
{
        buffer = (char*) malloc (strlen(A[i])+1);
        strcpy(buffer,A[i]);
        buffer[strlen(A[i])+1]='\0';
        B.push_back(buffer);
}

to

B.resize(A.size());
for(long i = 0; i < B.size(); ++i)
{
        B.at(i) = new char[ strlen(A.at(i)) ];
        for(int j=0; B.at(i)[j] = A.at(i)[j]; ++j);
}

Smaller code may speedup things. you may also try tweaking with compiler's optimization settings.

Harshil Sharma
  • 2,016
  • 1
  • 29
  • 54