12

Based on the understanding from the following:

Where is allocated variable reference, in stack or in the heap?

I was wondering: since all objects are created on the common heap, if multiple threads create objects, there must be some serialization to prevent data corruption, i.e. to stop multiple threads from creating objects at the same locations. With a large number of threads, this serialization would cause a big bottleneck. How does Java avoid this bottleneck? Or am I missing something?

Any help appreciated.

chiku

3 Answers

13

Modern VM implementations reserve a separate area of the heap for each thread to create objects in. So there is no problem as long as this area does not fill up (when it does, the garbage collector moves the surviving objects).

Further reading: how TLAB works in Sun's JVM. Azul's VM uses a slightly different approach (look at "A new thread & stack layout"); the article shows quite a few tricks JVMs may perform behind the scenes to make today's Java fast.

The main idea is to keep a per-thread (non-shared) area in which to allocate new objects, much like allocating on the stack in C/C++. Copying garbage collection deallocates short-lived objects very quickly; the few survivors, if any, are moved into a different area. Thus, creating relatively small objects is very fast and lock-free.
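To make the per-thread area concrete, here is a toy model of the idea in plain Java. It is only a sketch under stated assumptions, not HotSpot's actual implementation: the class `TlabSketch`, the 1024-byte `TLAB_SIZE`, and the use of `long` values as "addresses" are all hypothetical. The point it illustrates is that the shared heap pointer is touched only when a thread needs a new buffer, while ordinary allocations just bump a thread-private pointer.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy model of thread-local allocation buffers; not how a real JVM is written. */
final class TlabSketch {
    static final int TLAB_SIZE = 1024;  // hypothetical buffer size in "bytes"

    // Shared state: the top of the common heap. Contended only on buffer refill.
    private final AtomicLong heapTop = new AtomicLong();
    private final AtomicLong refills = new AtomicLong();

    // Thread-private state: the current buffer's bump pointer and its end.
    private static final class Buffer {
        long top;
        long end;
    }

    private final ThreadLocal<Buffer> tlab = ThreadLocal.withInitial(Buffer::new);

    /** Returns the simulated address of a freshly allocated block. */
    long allocate(int size) {
        if (size <= 0 || size > TLAB_SIZE) {
            throw new IllegalArgumentException("size must be in 1.." + TLAB_SIZE);
        }
        Buffer b = tlab.get();
        if (b.top + size > b.end) {
            // Slow path: claim a whole new buffer with a single atomic operation.
            b.top = heapTop.getAndAdd(TLAB_SIZE);
            b.end = b.top + TLAB_SIZE;
            refills.incrementAndGet();
        }
        // Fast path: bump a pointer that no other thread can see. No lock, no CAS.
        long address = b.top;
        b.top += size;
        return address;
    }

    /** Number of times any thread had to go to the shared heap. */
    long refillCount() {
        return refills.get();
    }
}
```

With 16-byte allocations, a thread in this model goes to the shared heap once per 64 objects; a real VM also has to deal with large objects and with buffers that fill up, which this sketch ignores.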

Lock-free allocation is very important, especially since the question concerns a multithreaded environment. It also allows true lock-free algorithms to exist: even if an algorithm is itself lock-free, if allocation of new objects is synchronized, the entire algorithm is effectively synchronized and ultimately less scalable. java.util.concurrent.ConcurrentLinkedQueue, which is based on the work of Maged M. Michael and Michael L. Scott, is a classic example.
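As a small illustration of why this matters for ConcurrentLinkedQueue: every `offer` creates at least one new object (the queue's internal node), so the queue's lock-free guarantee is only meaningful if `new` itself does not take a global lock. The sketch below is a hypothetical demo (the class name `LockFreeQueueDemo` and its parameters are made up for illustration); it has several threads allocate and enqueue concurrently and checks that nothing is lost.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

final class LockFreeQueueDemo {
    /** Runs `threads` producers that each enqueue `perThread` new objects; returns the final size. */
    static int run(int threads, int perThread) {
        ConcurrentLinkedQueue<Object> queue = new ConcurrentLinkedQueue<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    // Two allocations per iteration: the element and the queue's internal node.
                    queue.offer(new Object());
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return queue.size();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000));
    }
}
```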


What happens if an object is referenced by another thread? (due to discussion request)

That object (call it A) will be moved to some "survivor" area. The survivor area is checked less often than the thread-local areas. As the name suggests, it holds objects whose references managed to escape, or more precisely objects that, like A, managed to stay alive. The copy (move) occurs at a "safe point" (safe point excludes properly JIT'd code), so the garbage collector can be sure the object is not being accessed while it moves. The references to the object are updated, the necessary memory fences are issued, and the application (Java code) is free to continue. Further reading on this simplistic scenario.
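The move-and-update step can be sketched as a toy copying collector. This is a deliberately simplified model, not what HotSpot does internally: the names `CopyCollectSketch`, `Obj`, `eden`, `survivor` and `roots` are hypothetical, each object holds at most one reference, and every collection evacuates everything reachable from the roots. A real collector copies raw memory while application threads are stopped at a safe point.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Toy stop-and-copy collection: live objects are copied, references are rewritten. */
final class CopyCollectSketch {
    static final class Obj {
        final String name;
        Obj ref;  // a single outgoing reference keeps the sketch short

        Obj(String name) {
            this.name = name;
        }
    }

    final List<Obj> eden = new ArrayList<>();   // where new objects start out
    List<Obj> survivor = new ArrayList<>();     // where live objects end up
    final List<Obj> roots = new ArrayList<>();  // e.g. a static list visible to every thread

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    /** Copies everything reachable from the roots, then discards the rest wholesale. */
    void collect() {
        Map<Obj, Obj> forwarding = new IdentityHashMap<>();
        List<Obj> toSpace = new ArrayList<>();
        for (int i = 0; i < roots.size(); i++) {
            roots.set(i, copy(roots.get(i), forwarding, toSpace));  // update the root reference
        }
        survivor = toSpace;
        eden.clear();  // unreachable objects are never visited, just dropped
    }

    private Obj copy(Obj old, Map<Obj, Obj> forwarding, List<Obj> toSpace) {
        if (old == null) {
            return null;
        }
        Obj moved = forwarding.get(old);
        if (moved != null) {
            return moved;  // already copied: reuse the forwarding entry
        }
        moved = new Obj(old.name);
        forwarding.put(old, moved);  // record before recursing so cycles terminate
        toSpace.add(moved);
        moved.ref = copy(old.ref, forwarding, toSpace);
        return moved;
    }
}
```

In this model the cost of a collection depends on the number of survivors, not on the amount of garbage, which is why short-lived objects are cheap to get rid of.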

For the very interested reader who can digest it: the highly advanced Pauseless GC Algorithm.

Bill the Lizard
Paŭlo Ebermann
  • Thanks for your answer. Is it always the case? Say I have a system with 10 threads: 5 producers and 5 consumers. Producers produce data (object creation) and put the objects into some queue. Consumers consume the data produced (say they are persisting the records to a file and, for the system to be efficient, they are doing batch updates). Now, producers are only creating new objects and consumers are creating none. Won't having separate areas in the heap increase the chances of the heap filling up earlier? Also, would there be any resizing? Will it not lead to more fragmentation? – chiku Feb 20 '11 at 01:18
  • Also, I can create the threads dynamically say, by taking an input from the user. Now, how does JVM handle this case? Also, if a Thread dies then what happens? – chiku Feb 20 '11 at 01:37
  • The problem is, I don't really know the details of the workings of the VM, this is only what I heard somewhere ... I'll mark my answer as *community wiki*, so others can add details. (I'm interested, too, but not enough to do proper research now.) You may look up on "generational garbage collector" - in this case, it is the "Eden generation", I think, which is local for each Thread. (Report back here.) – Paŭlo Ebermann Feb 20 '11 at 01:54
  • Thanks a lot! I will read up and report back my findings here. – chiku Feb 20 '11 at 02:40
  • @chiku, the object created by T1 will be copied/moved to the "survivor" area by the GC, its reference (pointer) will be updated accordingly. That's the basic idea of the stop and copy type of garbage collection. However I believe you will have even more questions. Now you'd ask how exactly a concurrent garbage collector works. The concurrent VM (and GC) of Java (hotspot for instance) is one of the most sophisticated multi-threaded applications out there. I'd encourage you to ask a separate question on the topic. Here, the original question is just too different. – bestsss Feb 27 '11 at 20:23
  • @chiku, added some more info on the GC but I refuse to update the answer any more w/ GC related stuff. – bestsss Feb 27 '11 at 20:45
1

No. The JVM has all sorts of tricks up its sleeves to avoid any sort of simpleminded serialization at the point of 'new'.

bmargulies
  • Thanks for your answer. I understand that simpleminded serialization would be a too naive solution. What I would like to understand is what sort of tricks does the JVM use to address this problem? – chiku Feb 20 '11 at 01:20
  • @chiku, added some info on the tricks part in 'Paŭlo Ebermann' answers. – bestsss Feb 27 '11 at 10:50
  • @bestsss, Thanks for adding to the information. I have not been able to get back and find an answer myself. Should be doing so in sometime. Your edit answers most of the questions I had. However, one question that still remains unanswered in my opinion is: How do the threads handle the escape of the objects. Objects created can escape via a number of mechanisms like adding it to a shared list or through returning objects.. Any ideas on that too? – chiku Feb 27 '11 at 19:11
  • @chiku, this is what the garbage collection does. – bestsss Feb 27 '11 at 19:18
  • @bestsss, I am sorry to be sounding so dumb. My question is: suppose I have two threads, T1 and T2. T1 calls a function which has an interface like func(List list). T1 creates new objects in the function and adds them to the list. Now, the new objects created by T1 are allocated in the TLAB of T1. Consider that T2 can access the list object (probably because it is static). Now, the objects created by T1 cannot be garbage collected, because the list has a reference to them and T2 has a reference to the list object. How does GC help? Did I miss anything too obvious? Thanks! – chiku Feb 27 '11 at 19:45
  • @Chiku, answered under the other answer. – bestsss Feb 27 '11 at 20:19
1

Sometimes. I wrote a recursive method that generates integer permutations and creates objects from those. The multithreaded version of that method (every branch from the root is a task, but the concurrent thread count is limited to the number of cores) wasn't faster, and the CPU load wasn't higher. The tasks didn't share any objects. After I removed the object creation from both methods, the multithreaded method was ~4x faster (6 cores) and used 100% CPU. In my test case the methods generated ~4,500,000 permutations, 1500 per task. I think TLAB didn't help because its space is limited (see: Thread Local Allocation Buffers).
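The original code is not shown, so the following is only a guess at the shape of such a test, with made-up names (`PermutationAllocDemo`, `permute`, `countParallel`): one task per root branch, a pool sized to the number of cores, and one small array allocated per permutation. It checks the result rather than the timing; to compare timings one would have to wrap the calls in a proper benchmark, and on HotSpot one could rerun it with `-XX:-UseTLAB` to see how much thread-local allocation contributes. Note that the JIT may optimize away an allocation that never escapes, so a sketch like this does not necessarily reproduce the slowdown described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PermutationAllocDemo {
    /** Counts permutations that extend `prefix`, allocating one array per permutation. */
    static long permute(int[] prefix, boolean[] used, int depth) {
        int n = used.length;
        if (depth == n) {
            int[] result = prefix.clone();  // the per-permutation object creation
            return result.length == n ? 1 : 0;
        }
        long count = 0;
        for (int i = 0; i < n; i++) {
            if (used[i]) {
                continue;
            }
            used[i] = true;
            prefix[depth] = i;
            count += permute(prefix, used, depth + 1);
            used[i] = false;
        }
        return count;
    }

    /** One task per root branch (n >= 1); tasks share no objects. */
    static long countParallel(int n) {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int first = 0; first < n; first++) {
                final int root = first;
                results.add(pool.submit(() -> {
                    int[] prefix = new int[n];
                    boolean[] used = new boolean[n];
                    prefix[0] = root;
                    used[root] = true;
                    return permute(prefix, used, 1);
                }));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```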

squall