Here is a link to an answer which outlines the TL2 STM algorithm and which provides links to a couple of fairly readable papers on implementing it and how to mechanically transform normal code into STM code:
Stackoverflow: How do you implement Software Transactional Memory?
Clearly it is an active research area, so there are many approaches. Which one works best for a given problem depends on the concurrency and read/write ratio of the problem at hand.
With respect to the TL2 algorithm, your question:
... it seems useless to allow two threads to execute within the same
transaction. They will read each other's writes and the computation
within the transaction block will be malformed.
It's about what data is seen (the read-set) or updated (the write-set) during a transaction, not what code is being executed. The TL2 algorithm lets speculative work happen with no locks held, keeping the write-set private to each thread until commit. At commit time the transaction takes write locks, then does a final check that the read-set versions still match the master value versions (the public committed values) before it flushes its updates to the master values, incrementing their versions. Of course a transaction can see its own private updates before commit, as well as any committed master values, but different transactions don't see each other's unflushed write-sets. If some other transaction has in the meanwhile committed and changed the master versions from what was seen in the read-set, the transaction is aborted. This enforces consistency without holding locks while the application logic is running. The papers at the link above have many more details and optimisations.
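To make the read-set/write-set mechanics concrete, here is a minimal Python sketch. The names (`TVar`, `Transaction`, `AbortError`) are my own, and real TL2 details such as the global version clock are omitted; this only illustrates the private write-set and the commit-time validation described above:

```python
import threading

class TVar:
    """A master value: the public committed value plus its version and write lock."""
    def __init__(self, value):
        self.value = value
        self.version = 0
        self.lock = threading.Lock()

class AbortError(Exception):
    """Raised when commit-time validation finds a stale read-set entry."""
    pass

class Transaction:
    def __init__(self):
        self.read_set = {}    # TVar -> version observed at first read
        self.write_set = {}   # TVar -> private, uncommitted value

    def read(self, tvar):
        # A transaction sees its own private updates before commit ...
        if tvar in self.write_set:
            return self.write_set[tvar]
        # ... otherwise the committed master value, recording its version.
        self.read_set.setdefault(tvar, tvar.version)
        return tvar.value

    def write(self, tvar, value):
        # Writes stay private to this thread until commit.
        self.write_set[tvar] = value

    def commit(self):
        # Take write locks in a deterministic (address-like) order.
        locked = sorted(self.write_set, key=id)
        for tvar in locked:
            tvar.lock.acquire()
        try:
            # Final check: every read-set version must still match the master.
            for tvar, seen in self.read_set.items():
                if tvar.version != seen:
                    raise AbortError("read-set is stale; retry the transaction")
            # Flush private writes to the master values, bumping their versions.
            for tvar in locked:
                tvar.value = self.write_set[tvar]
                tvar.version += 1
        finally:
            for tvar in locked:
                tvar.lock.release()

# Two transactions touch the same TVar: the slower one aborts at validation.
x = TVar(10)
t1 = Transaction()
t1.write(x, t1.read(x) + 1)   # speculative work, no locks held
t2 = Transaction()
t2.write(x, 99)
t2.commit()                   # commits first, bumping x.version
try:
    t1.commit()
    aborted = False
except AbortError:
    aborted = True
assert aborted and x.value == 99
```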
Also, what happens when two threads enter the same transaction at the
same time? Doesn't this cause a race condition of sorts?
Yes, threads doing different work race under TL2 if they are reading values which the other thread will update. Work can be thrown away due to an abort at the final consistency check. You should code to retry a bounded number of times when this happens, until you get a consistent read which finally leads to a successful commit. The hope is that if you have many cores on your server you get more total throughput by occasionally throwing a core's work away than you would by using coarse-grained locks which block cores. The other hope is that STM can be more straightforward than hand-crafting an optimal fine-grained locking approach that avoids deadlock. Clearly throughput depends on the actual application logic and the collision rate.
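That retry-on-abort pattern can be sketched like this (again illustrative Python, not TL2 itself: a single `VersionedCell` whose version counter stands in for the read-set check, with four threads racing on it):

```python
import threading

class VersionedCell:
    """One shared master value guarded by a version counter and a commit lock."""
    def __init__(self, value):
        self.value = value
        self.version = 0
        self.lock = threading.Lock()

def atomic_increment(cell, retries=100_000):
    """Optimistic read, lock-free compute, then validate-and-commit; retry on conflict."""
    for _ in range(retries):
        seen = cell.version          # remember the version we read against
        snapshot = cell.value        # speculative read, no lock held
        new_value = snapshot + 1     # application logic runs without locks
        with cell.lock:              # commit-time write lock only
            if cell.version == seen: # read still consistent?
                cell.value = new_value
                cell.version += 1
                return new_value
        # another thread committed in between: our work is thrown away, retry
    raise RuntimeError("too much contention, giving up")

# Four threads race on the same cell; aborted attempts are simply retried.
cell = VersionedCell(0)
def worker():
    for _ in range(1000):
        atomic_increment(cell)
threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
assert cell.value == 4000
```

Note that a retry only ever happens because some other thread committed, so the system as a whole always makes progress even when an individual attempt is discarded.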
Since both are attempting to modify the same shared memory, both will
need to be restarted and the issue continue ad infinitum unless the
system steps in and tell one thread to cool it (sort of like a lock =)
The livelock you mention is avoided by taking locks on the master values at commit time. Take all the locks in ascending order of the master values' memory addresses and you will not get deadlock: one of the two threads will get all the locks it needs first, and the other will block. The first thread then does its read-set consistency checks, flushes its write-set (incrementing the master values' versions), and releases the locks. The second thread finally gets all its locks, does its read-set consistency checks, and aborts the transaction on seeing that thread one has incremented the version of a master value it used in its application logic.
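The ascending-address ordering can be sketched in Python like this, using `id()` as a stand-in for a memory address (`Slot` and the helper names are invented for illustration; real TL2 orders the actual lock addresses):

```python
import threading

class Slot:
    """A master value's commit-time write lock (value/version omitted for brevity)."""
    def __init__(self):
        self.lock = threading.Lock()

def commit_locks(write_set):
    """Acquire every write lock in ascending id() order, a stand-in for address order."""
    ordered = sorted(write_set, key=id)
    for slot in ordered:
        slot.lock.acquire()
    return ordered

def release_locks(ordered):
    for slot in reversed(ordered):
        slot.lock.release()

# Two threads name the same slots in opposite orders; sorting by id() means
# they still lock in one global order, so neither can deadlock the other.
a, b = Slot(), Slot()
finished = []

def worker(write_set):
    for _ in range(10_000):
        held = commit_locks(write_set)
        release_locks(held)
    finished.append(True)

t1 = threading.Thread(target=worker, args=([a, b],))
t2 = threading.Thread(target=worker, args=([b, a],))
t1.start(); t2.start(); t1.join(); t2.join()
assert len(finished) == 2
```

Without the sort, the two workers above would be a textbook lock-ordering deadlock: each could hold one lock while waiting for the other's.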
Clearly this commit-time locking strategy blocks threads, but locks are not held while arbitrary application logic is running, and this critical section can be aggressively tuned. The papers even mention processor instructions which request that a thread be left to run uninterrupted while it is holding those locks.