Is there a difference in how g++ handles this situation because of the difference in the code? To a beginner, it looks like exactly the same code.

I should mention that both trees are massive: each contains around ten tensors, and each tensor holds an estimated 10^5 or so integers.

EDIT: All numbers are allocated on the heap; only one pointer, from the tree root, actually lives on the stack.

{
        std::cout << "\nTrial #" << i << std::endl;
        v = createV(10, 5, 10);    

        ExTree<int> treeOpt = build_opt(v);
        {
            //...
            treeOpt.evaluate();
        }
        ExTree<int> treeNai = build_naive(v);
        {
            //...
            treeNai.evaluate();
        }
}

and

{
        std::cout << "\nTrial #" << i << std::endl;
        v = createV(10, 5, 10);    
        ExTree<int> treeNai = build_naive(v);
        ExTree<int> treeOpt = build_opt(v);
        {
            //...
            treeOpt.evaluate();
        }
        {
            //...
            treeNai.evaluate();
        }
}

I am asking because it seems to actually make a difference, and I would like to know why. To ask more precisely: does the compiler realize that treeOpt won't be used again after evaluate() and free its memory? The second snippet actually causes std::bad_alloc more often.

Clebo Sevic
  • Apart from the order in which you call `build_naive` and `build_opt`, the two snippets do indeed seem equivalent. Perhaps the problem is something else that merely manifests itself more often with the second variant. I recommend using a memory debugger (such as [Valgrind](http://valgrind.org)) to help you find memory problems. – Some programmer dude Mar 04 '19 at 13:32
  • It's hard to tell with the limited code you've shown. What optimization flags and g++ version are you using? You can see exactly what g++ (or clang or several other compilers) generate by using Godbolt.org. Here's your [first snippet](https://godbolt.org/z/DRM4vU), and [here's the second](https://godbolt.org/z/DRM4vU), each with a little extra dummy stuff. As you can see, GCC 8.3 does not destroy one earlier in either case. Does it behave differently for you on a different compiler besides GCC? – metal Mar 04 '19 at 13:42
  • There could also be an issue with memory fragmentation. The early allocation in the first example can find the large memory block it needs, but in the second case the call to `treeOpt.evaluate()` may allocate some memory that it doesn't free, which takes up space in the middle of the memory pool, so that no single block is large enough to meet the allocation request. – 1201ProgramAlarm Mar 04 '19 at 14:24
  • Have you checked that those calls actually are independent of their order? I mean, they sound like they *should* be, but *should* is not the same as *they are*. Even if there is no evil global variable floating around, there might be some variable injected into both for some reason, or some static variable that creates a dependence. I'd start by ruling that out before we move on to what the compiler does. Aside from that, why not simply set a compiler flag that disables optimization and compare? – Aziuth Mar 04 '19 at 14:55
  • I will close the question. I found the culprit: as you all said, it probably wasn't the compiler's fault or directly the fault of the code. I just realised that the tree I was creating exceeded 40 GB in size, which obviously doesn't fit on my 16 GB machine; it runs fine on my university server with 128 GB. Thank you all for the help. – Clebo Sevic Mar 04 '19 at 15:04
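
For comparison, the estimate given in the question (about ten tensors of roughly 10^5 integers each) can be turned into a quick back-of-the-envelope check. The sketch below is only that: `payload_bytes` and `to_mib` are hypothetical helper names, and the calculation ignores node and allocator overhead as well as any intermediate results created during `evaluate()`.

```cpp
#include <cstdint>

// Hypothetical helper: raw payload of one tree under the question's estimate,
// i.e. `tensors` tensors that each hold `elements` values of type int.
// Node overhead, allocator overhead and temporaries are deliberately ignored.
constexpr std::uint64_t payload_bytes(std::uint64_t tensors, std::uint64_t elements) {
    return tensors * elements * sizeof(int);
}

// Hypothetical helper: convert a byte count to whole mebibytes (rounded down).
constexpr std::uint64_t to_mib(std::uint64_t bytes) {
    return bytes / (1024 * 1024);
}

// The question's figures: ten tensors of about 10^5 integers each.
constexpr std::uint64_t estimated = payload_bytes(10, 100000);
```

With the usual 4-byte `int`, `estimated` is 4,000,000 bytes, a few megabytes per tree. That is several orders of magnitude below 40 GB, so the trees actually built must have held far more data than the estimate suggested.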

1 Answer

It does make a difference if the first block happens to alter v:

  • First version: ExTree<int> built from the modified v:

    {
        //...
        treeOpt.evaluate();
    }
    ExTree<int> treeNai = build_naive(v);
    
  • Second version: ExTree<int> built from the original v:

    ExTree<int> treeOpt = build_opt(v);
    {
        //...
        treeOpt.evaluate();
    }
    

If v is untouched and your program is const-correct, the compiler is free to reorder things anyway.
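
On the question of whether treeOpt is freed right after evaluate(): an automatic object is destroyed when its enclosing block ends, not after its last use, and objects in the same block are destroyed in reverse order of declaration. The sketch below illustrates this with `TracedTree`, a hypothetical stand-in for ExTree<int> that only logs its construction and destruction; `same_scope` and `separate_scopes` are likewise made-up names for this illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for ExTree<int>: it owns no tensors and only records
// construction, evaluation and destruction in a shared log.
struct TracedTree {
    std::string name;
    std::vector<std::string>* log;

    TracedTree(std::string n, std::vector<std::string>* l)
        : name(std::move(n)), log(l) {
        assert(log != nullptr);
        log->push_back("build " + name);
    }
    ~TracedTree() { log->push_back("free " + name); }

    // Non-copyable, so every "free" entry matches exactly one "build" entry.
    TracedTree(const TracedTree&) = delete;
    TracedTree& operator=(const TracedTree&) = delete;

    void evaluate() const { log->push_back("evaluate " + name); }
};

// Mirrors the first snippet: treeOpt is still alive while treeNai is built.
std::vector<std::string> same_scope() {
    std::vector<std::string> log;
    {
        TracedTree treeOpt("opt", &log);
        treeOpt.evaluate();
        TracedTree treeNai("naive", &log);
        treeNai.evaluate();
    }  // both trees are destroyed here, in reverse order of declaration
    return log;
}

// Variant: each tree lives in its own block, so treeOpt is destroyed
// before treeNai is built.
std::vector<std::string> separate_scopes() {
    std::vector<std::string> log;
    {
        TracedTree treeOpt("opt", &log);
        treeOpt.evaluate();
    }  // treeOpt is destroyed here
    {
        TracedTree treeNai("naive", &log);
        treeNai.evaluate();
    }  // treeNai is destroyed here
    return log;
}
```

In `same_scope()` both trees coexist until the closing brace, so peak memory is the sum of both. In `separate_scopes()` the first tree is released before the second is built, which is one way to lower peak memory if the two evaluations really are independent.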

YSC
  • `v` is not modified by either of those functions. I get what you mean, but if the compiler reorders anyway, why does the second version cause more frequent errors? – Clebo Sevic Mar 04 '19 at 13:38
  • @CleboSevic Without a minimal, complete, and verifiable example, we cannot tell. This is your answer: if the two programs differ, either `v` is modified or [undefined behavior](https://stackoverflow.com/a/4105123/5470596) is lurking somewhere. – YSC Mar 04 '19 at 13:40
  • @CleboSevic: Even if the elided code doesn't actually modify `v`, if the compiler can't prove that it doesn't, then it must assume that it does. That in turn may change what it could do in terms of re-ordering and optimizing, as @YSC says. But we really can't say for sure with the limited information you've presented. – metal Mar 04 '19 at 13:45