2

I'm asking about the following construct (a full example would be too big to post):

__thread char* buf;
buf = malloc(1000);

Valgrind says the bytes are "definitely" lost. Shouldn't they just be "still reachable"?

Blub

5 Answers

9

Because the allocated memory is not thread-local. It's shared by all of the threads.

The pointer variable, on the other hand, is thread-local, so once the thread exits the variable is gone and the allocated memory is definitely lost (assuming there are no copies of that pointer elsewhere, and evidently there aren't, because Valgrind reports the block as definitely lost).

You have to free it.
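
As a minimal sketch of what "free it" could look like, assuming POSIX threads and a compiler that supports `__thread` (GCC or Clang): the thread releases the block itself before it exits. The names `tls_worker` and `run_worker_once` are made up for illustration and are not from the question.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Thread-local pointer, as in the question. Only the pointer is per-thread;
   the block it points to comes from the shared process heap. */
static __thread char *buf;

/* Hypothetical worker: allocates, uses, and frees its own buffer. */
static void *tls_worker(void *arg)
{
    int *done = arg;

    buf = malloc(1000);
    if (buf == NULL)
        return NULL;

    strcpy(buf, "scratch data");   /* ... use the buffer ... */

    free(buf);                     /* release it while the pointer still exists */
    buf = NULL;
    *done = 1;
    return NULL;
}

/* Hypothetical driver: returns 1 if the worker ran to completion. */
int run_worker_once(void)
{
    pthread_t t;
    int done = 0;

    int rc = pthread_create(&t, NULL, tls_worker, &done);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    return done;
}
```

Calling `run_worker_once()` from `main` and building with `-pthread` should leave nothing for Valgrind to report for this buffer, since the block is freed before the thread-local pointer goes away.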

Karoly Horvath
  • Uhm yeah, it is thread local. That's the whole point of __thread. – Blub Oct 26 '11 at 11:42
  • True. Only stacks are thread local, not the heap. There is no reason why the heap should be thread local. – mouviciel Oct 26 '11 at 11:57
  • @mouviciel do you have any supporting resources for the claim that heap is never thread local? I searched, but other than your comment, there is nothing to indicate that you are right. – Blub Oct 26 '11 at 11:58
  • This is just common sense. First, developers are lazy. A global heap already exists and `malloc()` uses it. When threads were introduced, it was easy to use that existing feature. Second, implementing one heap per thread means more RAM constraints and possibly swapping at thread level instead of process level. Why not? But what problem would this feature solve? Allowing dangling `malloc()`? It would be easier to implement a garbage collector. – mouviciel Oct 26 '11 at 12:09
  • @Blub: Apart from what mouviciel says, [wikipedia](http://en.wikipedia.org/wiki/Thread-local_storage) says: "normally all threads in a process share the same address space". Also see http://stackoverflow.com/questions/1665419/do-threads-share-the-heap and observe that allocators that do use thread-local heaps are explicitly described as such (implying that it's not the default). Also, if there was no thread-global heap, you'd need a special construct to share *anything* between threads - there clearly isn't an explicit one, and an implicit one seems impossible. –  Oct 26 '11 at 12:18
  • It's the same address space, and it's a common way to share data between threads. It would be *very* inconvenient to have separate heaps by default. I guess you could use a separate heap and allocator for specific threads if that makes sense for you... – Karoly Horvath Oct 26 '11 at 13:42
  • 1
    To the ignorant folks who think putting `__thread` on a pointer variable makes the `malloc`-obtained block whose address you store in it somehow thread-local... Does putting `auto` on a pointer variable make the `malloc`-obtained block whose address you store in it automatic (freed as soon as the variable goes out of scope)? Oh by the way, all local vars are `auto` by default... – R.. GitHub STOP HELPING ICE Oct 26 '11 at 14:04
  • Or perhaps a better analogy: suppose you store a customer's physical (street) address in a thread-local string variable. Would you expect the compiler to send out bulldozers to tear down the customer's house and make that space available for reuse once your thread terminates? :-) – R.. GitHub STOP HELPING ICE Oct 26 '11 at 14:13
2

You need to explicitly deallocate it by calling free.

Memory allocated on the heap by malloc is not reclaimed until it is explicitly released by calling free. Only stack-allocated local objects are deallocated automatically when a thread ends.

The block is definitely lost because you no longer have any pointer to the allocated memory once the thread exits. The pointer to the memory is local to the thread and is destroyed when the thread exits, but the memory it pointed to is heap memory and does not get deallocated.
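
If calling free by hand at every exit path is awkward, one option (an alternative to `__thread`, not something the question uses) is POSIX thread-specific data, where `pthread_key_create` takes a destructor that runs at thread exit for each non-NULL value. A rough sketch, with `buf_key`, `tsd_worker`, `run_tsd_demo`, and the `destructor_calls` counter all invented for illustration:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t buf_key;
static pthread_once_t buf_key_once = PTHREAD_ONCE_INIT;

/* Demo-only counter; it is read after pthread_join, so no locking is used. */
static int destructor_calls;

/* Runs automatically at thread exit for each non-NULL value stored under buf_key. */
static void buf_destructor(void *p)
{
    free(p);
    destructor_calls++;
}

static void make_buf_key(void)
{
    int rc = pthread_key_create(&buf_key, buf_destructor);
    assert(rc == 0);
}

/* Hypothetical worker: stores a heap block as thread-specific data. */
static void *tsd_worker(void *arg)
{
    (void)arg;
    pthread_once(&buf_key_once, make_buf_key);

    char *buf = malloc(1000);
    if (buf != NULL) {
        int rc = pthread_setspecific(buf_key, buf);
        assert(rc == 0);
    }
    return NULL;   /* no free here: the destructor releases the block */
}

/* Hypothetical driver: returns how many times the destructor ran. */
int run_tsd_demo(void)
{
    pthread_t t;

    int rc = pthread_create(&t, NULL, tsd_worker, NULL);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    return destructor_calls;
}
```

The block is still ordinary shared heap memory; the destructor only automates the free call that the thread would otherwise have to make itself.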

Alok Save
2

If the only pointer to the block is thread local, then by exiting the thread, you have lost the only pointer.

That means that it is no longer reachable = definitely lost.

Šimon Tóth
2

Well, as others have said, you have to free it.

The reasoning behind it is this: all threads share a common heap, and conceptually, memory 'ownership' can be passed between threads. One thread can malloc something, and another can free it. But, the heap has no idea who 'owns' the memory, so when your thread terminates (even if the heap remembered which thread malloc'd what) it couldn't safely delete it.
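
The ownership handoff described above can be sketched as follows, assuming POSIX threads: one thread allocates a block and the joining thread frees it. The names `producer` and `run_handoff` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical producer: allocates a block and hands it to whoever joins. */
static void *producer(void *arg)
{
    (void)arg;
    char *msg = malloc(32);
    if (msg != NULL)
        strcpy(msg, "made by producer");
    return msg;   /* the block outlives this thread */
}

/* Hypothetical consumer: joins the producer, reads the block, frees it.
   Returns 1 if the handoff worked. */
int run_handoff(void)
{
    pthread_t t;
    void *result = NULL;

    int rc = pthread_create(&t, NULL, producer, NULL);
    assert(rc == 0);
    rc = pthread_join(t, &result);
    assert(rc == 0);

    char *msg = result;
    int ok = msg != NULL && strcmp(msg, "made by producer") == 0;
    free(msg);    /* freed by a different thread than the one that allocated it */
    return ok;
}
```

If the runtime released every block a thread had allocated when that thread exited, `msg` would already be dangling by the time `run_handoff` reads it, which is why the heap cannot do this on its own.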

But, when your process terminates, all the heap memory is effectively 'freed' - but not individually: The entire heap of your process (which was probably just one big lump) is returned to the operating system.

Roddy
  • Well that's just the thing, in case of thread-local storage the thread *could* safely delete it, because the memory is *not* shared with any other threads. (at least not logically shared, it doesn't really matter that another thread could by accident still access the memory, by method of array overflow for example) – Blub Oct 26 '11 at 12:03
  • 1
    @Blub: Only the *pointer* is thread local. What it actually points to isn't. `malloc` has no way of knowing that you're going to assign its return value to a thread-local pointer. – Roddy Oct 26 '11 at 12:19
  • @Blub: In general, determining that the memory is not accessible by other threads is equivalent to the halting problem. So what you're proposing is that the memory would sometimes get freed, and sometimes not, based on whether the special case of the halting problem were solvable by your compiler. Now considering that double-free invokes very dangerous undefined behavior and you have no way of knowing whether it will get freed or not automatically, this sounds like a recipe for disaster! – R.. GitHub STOP HELPING ICE Oct 26 '11 at 14:09
0

This is a little like the "tastes great" / "less filling" argument. Valgrind is correct AND the data is "still reachable". For instance, if the data contained passwords, you could 100% extract them from a heap scan. If the data started with a unique random number, you could locate it again. What Valgrind means is that you can no longer access the data through the pointer.